RMoodsTeam / RMoods

Web app for NLP analysis of Reddit
https://rmoodsteam.github.io/RMoods/
GNU General Public License v3.0

Store and load pre-trained NLP models #69

Closed: m-milek closed this issue 1 month ago

m-milek commented 3 months ago

Figure out a way to conveniently store, load and share pre-trained NLP models.

m-milek commented 2 months ago

So, how is it coming along?

Smixie commented 2 months ago

Recently a friend told me about GitHub LFS. It would allow us to store files of up to 5 GB in our repository, and the configuration looks very easy.

Libraries: I read some articles, and most of them pointed to NLTK, spaCy, TextBlob, and Hugging Face Transformers, but I need to dive into that much more. What did you use during classes? Maybe that would be a good place to start.

About the formats used: I checked file extensions on Hugging Face, and people mostly used h5 and sometimes json and bin.

What do you think about that?

m-milek commented 2 months ago

I think during classes we used Scikit. h5 is something I've also come across as a first choice when doing some googling. Sounds good!
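
For reference, here is a rough sketch of what storing and loading could look like, whatever format we end up with. It is only an illustration, not a decided approach: it uses the standard library's `pickle` (a common way to persist Scikit-style Python objects) and a SHA-256 checksum so a downloaded file can be checked before loading. The names `save_model` and `load_model` and the plain dict standing in for a trained model are hypothetical.

```python
import hashlib
import pickle
from pathlib import Path


def save_model(model, path):
    """Pickle `model` to `path` and return the SHA-256 hex digest of the file."""
    data = pickle.dumps(model)
    Path(path).write_bytes(data)
    return hashlib.sha256(data).hexdigest()


def load_model(path, expected_sha256):
    """Load a pickled model, refusing files whose checksum does not match."""
    data = Path(path).read_bytes()
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
    # Only unpickle files from a trusted source: pickle can run arbitrary code.
    return pickle.loads(data)


if __name__ == "__main__":
    import tempfile

    # Hypothetical stand-in for a trained model; a real one would be a fitted estimator.
    model = {"positive_words": ["great", "love"], "threshold": 0.5}

    with tempfile.TemporaryDirectory() as tmp:
        model_path = Path(tmp) / "model.pkl"
        digest = save_model(model, model_path)
        restored = load_model(model_path, digest)
        print(restored == model)
```

Keeping the digest next to the model file (for example in the repository) would let anyone who downloads the model from external storage verify it before loading.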

When it comes to file storage, I can see that GitHub LFS has some pretty strict bandwidth limits (1 GiB) that we could easily exceed. I propose a solution:

What do you think?