JasonKessler / scattertext

Beautiful visualizations of how language differs among document types.
Apache License 2.0

<Question> How can we use embedding of conceptnet numberbatch here? #66

Closed vijender412 closed 3 years ago

vijender412 commented 3 years ago

First of all, thanks for sharing one of the greatest open-source projects. I have gone through the package and found it very useful. Going one step further, I wanted to try ConceptNet Numberbatch or BERT embeddings with it. Can you suggest a way to do this? Thanks.

JasonKessler commented 3 years ago

Right now, the means of visualizing scatterplots of arbitrary term or document embeddings projected into 2d is a little janky. I'd suggest taking a look at demo_embeddings_pca.py and tracing through that code.

It shows how to project sparse tfidf embeddings of the document/term matrix into two dimensions using SVD. This could easily be applied to any document or term embeddings, and projected into 2d using your favorite algorithm.
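As a rough illustration of the "project any embeddings into 2d" step, here is a minimal sketch using plain NumPy. It is not the code from demo_embeddings_pca.py: the term list, the toy vectors, and the helper name `project_to_2d` are made up for the example, and it uses a centered PCA computed via SVD rather than the sparse tf-idf SVD the demo performs. In practice the rows would come from Numberbatch, BERT, or whatever embedding source you load yourself.

```python
import numpy as np


def project_to_2d(embeddings):
    """Project an (n_items, n_dims) embedding matrix onto its top two
    principal components (hypothetical helper, not part of scattertext)."""
    X = np.asarray(embeddings, dtype=float)
    X = X - X.mean(axis=0)  # center each dimension
    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:2].T  # shape: (n_items, 2)


# Toy stand-ins for real term embeddings (assumed values, for illustration only).
terms = ["cat", "dog", "car", "truck"]
vectors = np.array([
    [0.9, 0.8, 0.1, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.1, 0.1],
    [0.1, 0.0, 0.9, 0.8, 0.3],
    [0.0, 0.2, 0.8, 0.9, 0.4],
])

coords = project_to_2d(vectors)
for term, (x, y) in zip(terms, coords):
    print(f"{term}: x={x:.3f}, y={y:.3f}")
```

The resulting x/y pair per term is what a 2d scatterplot needs; how those coordinates get handed to the plot is best taken from the demo script itself. Any other reduction (t-SNE, UMAP, and so on) could replace the PCA step as long as it returns one 2d point per term.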

vijender412 commented 3 years ago

Thanks a lot, @JasonKessler. I'll take some time to try a few of the embeddings and will get back to you on this thread.

JasonKessler commented 3 years ago

Best of luck!