Open jacomoel opened 1 week ago
Because the embeddings are high-dimensional, we think the best option from a performance and storage standpoint is to recompute the embedding inside the Random Forest classifier or in latter_run(). With this solution, the embeddings do not have to be saved in the database and retrieved from it before every latter_run().
We need to extract the embedding of a pass1 run before the reduction steps that create the db nodes we can view in the frontend, so that we can pass "a copy" of the original pass1 embeddings to the classifier component.
Thus the Random Forest classifier will be able to use both the user_input feedback and the original embeddings.
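
A rough sketch of the idea, to make the data flow concrete. Everything here is an assumption for illustration only: `embed()` is a hypothetical stand-in for our real embedding step, PCA stands in for whatever our reduction steps actually are, and the signatures of `pass1_run()` and `latter_run()` are made up. The point is only that the full-dimensional embeddings are copied out before reduction, and that `latter_run()` recomputes them instead of reading them from the database.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier


def embed(items, dim=64):
    """Hypothetical embedding step: one deterministic vector per item."""
    rows = []
    for item in items:
        seed = sum(ord(ch) for ch in item)  # stable across runs and processes
        rows.append(np.random.default_rng(seed).normal(size=dim))
    return np.vstack(rows)


def pass1_run(items):
    """Return a copy of the original embeddings plus the reduced coordinates.

    Only the reduced coordinates would be turned into db nodes for the
    frontend; the full-dimensional copy is what the classifier needs.
    """
    original = embed(items)
    reduced = PCA(n_components=2).fit_transform(original)  # assumed reduction
    return original.copy(), reduced


def latter_run(items, feedback):
    """Recompute embeddings and train on the items the user has labelled.

    `feedback` maps an item to its user_input label (assumed format).
    Nothing is loaded from the database here.
    """
    embeddings = embed(items)  # recomputed instead of stored and retrieved
    labelled = [i for i, item in enumerate(items) if item in feedback]
    X = embeddings[labelled]
    y = [feedback[items[i]] for i in labelled]
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    clf.fit(X, y)
    return clf.predict(embeddings)
```

Recomputing only works if the embedding step is deterministic for the same input; if it is not, we would have to pass the copy from pass1 through directly instead.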