recommeddit / labs

ML/data experiments for Recommeddit
MIT License

Update and Create a Labeled dataset for Movie NER. #2

Closed Gopu2001 closed 2 years ago

Gopu2001 commented 2 years ago

Using the Pushshift API for Reddit data. Found a couple of good sources of data (comments inside a subreddit) to practice with. Considering using different sets of comments for the final dataset.

Gopu2001 commented 2 years ago

This section is not yet complete (I'd say about 80% done) and is still in progress. The database backend has been changed to PostgreSQL, with the idea of using SQL's substring function and measuring its speed for populating the database. affix.tree from last year turned out to be far too slow and heavy to load for the size of our movie database.
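
For anyone picking this up, here is a rough sketch of the substring-matching idea, not the actual code in this repo. It uses Python's built-in `sqlite3` as a stand-in so it runs without a server. The table names, column names, and the `label_comments` helper are all made up for illustration. On PostgreSQL the same join should work with `strpos(lower(c.body), lower(m.title))` in place of `instr(...)`, but I have not benchmarked that here.

```python
import sqlite3


def label_comments(titles, comments):
    """Find movie titles inside comments with a SQL substring join.

    Returns (comment_id, title, start, end) tuples, where start/end are
    0-based character offsets suitable for NER span labels.
    Hypothetical helper: schema and names are assumptions, not the repo's.
    """
    conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL backend
    conn.execute("CREATE TABLE movies (title TEXT)")
    conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany("INSERT INTO movies VALUES (?)", [(t,) for t in titles])
    conn.executemany(
        "INSERT INTO comments VALUES (?, ?)", list(enumerate(comments))
    )

    # instr() returns the 1-based position of the first match, or 0 if none.
    # Only the first occurrence of each title per comment is reported.
    rows = conn.execute(
        """
        SELECT c.id, m.title,
               instr(lower(c.body), lower(m.title)) - 1 AS start_pos
        FROM comments AS c
        JOIN movies AS m
          ON instr(lower(c.body), lower(m.title)) > 0
        ORDER BY c.id, start_pos
        """
    ).fetchall()
    conn.close()

    return [(cid, title, pos, pos + len(title)) for cid, title, pos in rows]


if __name__ == "__main__":
    spans = label_comments(
        ["Inception", "Heat"],
        ["I loved Inception and Heat", "nothing here"],
    )
    for span in spans:
        print(span)
```

One caveat with plain substring matching: a short title such as "Heat" also matches inside "cheating", so the output probably needs a word-boundary check before it is used as labels.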