Get Entity Embeddings - Githubissues

AxeldeRomblay / MLBox

MLBox is a powerful Automated Machine Learning python library.

https://mlbox.readthedocs.io/en/latest/

Get Entity Embeddings #44

Closed: JoshuaC3 closed this issue 7 years ago

JoshuaC3 commented 7 years ago

Hi, neat package, just getting my teeth into it.

One thing that stands out is that I cannot extract the entity embeddings. They seem to work really well, so naturally I want to plot them, explore them, tweak them, etc. Is it possible to do this? Many thanks!

AxeldeRomblay commented 7 years ago

Hi,

Yes, of course you can :) Just call Categorical_encoder(strategy="entity_embedding").fit_transform(df_train, y_train). Then you can apply dimensionality reduction (t-SNE, PCA, ...) to the result and plot it!

Here is the associated documentation: http://mlbox.readthedocs.io/en/latest/features.html#categorical-features
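
The suggestion above can be sketched roughly as follows. This is a minimal illustration, not MLBox's own recipe. It assumes that Categorical_encoder is importable from mlbox.encoding, that df_train has no missing values, and that every column is numeric after encoding. The helper names embed_categoricals and pca_2d are hypothetical, and PCA is done with a plain NumPy SVD to keep the example dependency-light.

```python
import numpy as np


def embed_categoricals(df_train, y_train):
    """Fit the entity-embedding encoder and return the transformed frame."""
    # Imported lazily so pca_2d can be used without MLBox installed.
    # Assumption: the encoder lives in mlbox.encoding.
    from mlbox.encoding import Categorical_encoder

    encoder = Categorical_encoder(strategy="entity_embedding")
    return encoder.fit_transform(df_train, y_train)


def pca_2d(values):
    """Project rows onto their first two principal components."""
    X = np.asarray(values, dtype=float)
    X = X - X.mean(axis=0)  # center each column
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:2].T
```

With a real dataset, something like coords = pca_2d(embed_categoricals(df_train, y_train)) should give one 2-D point per row, which can then be passed to any scatter-plot function.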

JoshuaC3 commented 7 years ago

Hi Axel, sorry, I wasn't clear with my question. What about once the model has been trained? Is the best way to do this to take the best model params and insert them into Categorical_encoder? Or is there a method call that can be used to get the embedding? I'm guessing the first option.

AxeldeRomblay commented 7 years ago

OK, now I understand your question. No, at the moment it is not possible to get the fitted pipeline after calling fit_predict(). But I will add this feature if you want! It is a good idea!

Nevertheless, you can insert the best parameters into the pipeline NA_encoder() + Categorical_encoder() and fit again (make sure the configuration for imputing missing values is exactly the same, so that the pipeline is reloaded from disk!)

Thank you for this issue
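
A rough sketch of the workaround described above, under stated assumptions: the best-parameter dictionary uses the "ne__" and "ce__" key prefixes for the NA_encoder and Categorical_encoder steps, both classes are importable from mlbox.encoding, and both accept fit_transform(df_train, y_train). The helpers split_best_params and refit_encoders are hypothetical names introduced for illustration.

```python
def split_best_params(best_params):
    """Split a flat parameter dict into kwargs for each encoding step."""
    ne_kwargs, ce_kwargs = {}, {}
    for key, value in best_params.items():
        step, _, name = key.partition("__")
        if step == "ne":    # missing-value encoder settings
            ne_kwargs[name] = value
        elif step == "ce":  # categorical encoder settings
            ce_kwargs[name] = value
        # Other prefixes (feature selection, estimator) are ignored here.
    return ne_kwargs, ce_kwargs


def refit_encoders(df_train, y_train, best_params):
    """Refit NA_encoder + Categorical_encoder with the tuned settings."""
    # Imported lazily; assumption: both classes live in mlbox.encoding.
    from mlbox.encoding import NA_encoder, Categorical_encoder

    ne_kwargs, ce_kwargs = split_best_params(best_params)
    # Impute with the same configuration as in the optimised pipeline.
    df_imputed = NA_encoder(**ne_kwargs).fit_transform(df_train, y_train)
    cat_encoder = Categorical_encoder(**ce_kwargs)
    df_encoded = cat_encoder.fit_transform(df_imputed, y_train)
    return cat_encoder, df_encoded
```

For example, refit_encoders(df_train, y_train, best) would return the refitted categorical encoder together with the encoded training frame, where best is the dictionary of tuned parameters.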

JoshuaC3 commented 7 years ago

Thanks, I understand a bit better how it works now. Great package BTW.