DSGT-DLP / Deep-Learning-Playground

Web Application where people new to Deep Learning can input a dataset and toy around with basic Pytorch modules without writing any code
MIT License

Integrate BERT/DistilBERT + Word2Vec into DLP for NLP tasks #182

Open karkir0003 opened 2 years ago

karkir0003 commented 2 years ago

Describe the solution you'd like

Make it easy for people to experiment with BERT and DistilBERT on textual datasets they upload (e.g., Twitter tweets). BERT is a huge model, while DistilBERT is a lighter version of it. This issue will likely require decent computing power.

Word2Vec is another good model to integrate for processing textual data (users may or may not have to drag it in; we can discuss that detail over Discord). Can we come up with a strategy to test whether including Word2Vec is feasible?

karkir0003 commented 2 years ago

@avayedawadi, what's your progress on this issue? spaCy, GloVe, and Gensim might be good starting points for Word2Vec-style word embeddings, and they're widely used.

avayedawadi commented 2 years ago

Got busy today but read around about the best solution. It seems that the Gensim library will be the most effective option in Flask because of its relative speed and because others have already implemented Word2Vec in Flask APIs using Gensim.

karkir0003 commented 2 years ago

Cool. Hopefully Gensim isn't too slow for us, because the socket architecture and related changes are coming in, @avayedawadi.

avayedawadi commented 2 years ago

https://gist.github.com/duydao/e994f7fbdb21a5122488ec00386d8083 gives an idea of how to use Word2Vec in Flask.
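
For anyone following along, here is a minimal sketch of what Word2Vec behind a Flask endpoint could look like with Gensim. It is not taken from the linked gist or from the DLP codebase: the function names (`tokenize_tweets`, `train_word2vec`, `similar_words`, `create_app`), the `/similar/<word>` route, and the hyperparameters are all hypothetical, and it assumes Gensim 4.x and Flask are installed.

```python
import re


def tokenize_tweets(texts):
    """Lowercase each text and split it into tokens of letters, digits, and apostrophes."""
    return [re.findall(r"[a-z0-9']+", text.lower()) for text in texts]


def train_word2vec(sentences, vector_size=50):
    """Train a small Word2Vec model on already-tokenized sentences.

    The hyperparameters are illustrative guesses, not tuned values.
    """
    # Imported lazily: gensim is a heavy dependency that is only needed for training.
    from gensim.models import Word2Vec

    return Word2Vec(
        sentences=sentences,
        vector_size=vector_size,
        window=5,
        min_count=1,  # keep every word; uploaded toy datasets are small
        workers=1,
        seed=0,
    )


def similar_words(model, word, topn=5):
    """Return the nearest neighbours of `word`, or [] if the model never saw it."""
    if word not in model.wv.key_to_index:
        return []
    return [
        {"word": neighbour, "score": float(score)}
        for neighbour, score in model.wv.most_similar(word, topn=topn)
    ]


def create_app(model):
    """Build a Flask app exposing the trained model (hypothetical route layout)."""
    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/similar/<word>")
    def similar(word):
        return jsonify(similar_words(model, word.lower()))

    return app
```

With something like this, `create_app(train_word2vec(tokenize_tweets(texts)))` would train once at startup on the uploaded texts and then answer similarity queries from memory, which should keep per-request latency low.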

karkir0003 commented 2 years ago

This library looks interesting: https://github.com/utterworks/fast-bert

Also, fastai seems to support BERT models.

karkir0003 commented 2 years ago

Note that this example is for sentiment analysis problems (e.g., given a piece of text, predict whether its sentiment is positive, negative, or moderate). I think sentiment analysis is a good thing to support for people new to NLP, DL, and ML. It's a pretty understandable concept.
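
To make the sentiment-analysis idea concrete, here is a rough sketch using DistilBERT. Note that it swaps in the Hugging Face `transformers` pipeline rather than fast-bert or fastai, and it assumes the publicly available `distilbert-base-uncased-finetuned-sst-2-english` checkpoint, which only predicts POSITIVE or NEGATIVE. The "moderate" class is approximated by a confidence cutoff; the `threshold` value and the helper names `to_sentiment` and `classify` are made up for illustration.

```python
def to_sentiment(label, score, threshold=0.75):
    """Map a binary prediction to "positive", "negative", or "moderate".

    Predictions whose confidence is below `threshold` are reported as
    "moderate". The threshold is an arbitrary illustrative value.
    """
    if score < threshold:
        return "moderate"
    return label.lower()


def classify(texts, threshold=0.75):
    """Classify a list of strings with a DistilBERT sentiment model."""
    # Imported lazily: transformers is large, and the first call to pipeline()
    # downloads the model weights.
    from transformers import pipeline

    classifier = pipeline(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english",
    )
    # The pipeline returns one {"label": ..., "score": ...} dict per input text.
    return [
        to_sentiment(result["label"], result["score"], threshold)
        for result in classifier(texts)
    ]
```

Calling `classify(list_of_tweets)` would return one label per tweet. A model fine-tuned on three classes would be a cleaner fit than the cutoff trick, but the cutoff keeps the sketch small.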