DisasterMasters / TweetAnalysis

Repository for storing the code used to analyse the tweets collected from the Twitter scraper

Relevant tweets #5

Open audrism opened 5 years ago

abhidya commented 5 years ago

https://github.com/DisasterMasters/TweetAnalysis/blob/master/src/results/Relevance%20Preprocessing.ipynb

Best text preprocessing for Doc2vec is simply distributed bag of words + punctuation removal.

Tried combos of: distributed memory, distributed bag of words, lowercasing, stop word removal, rare word removal, spelling correction, punctuation removal.
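For readers without the notebook open, here is a minimal sketch of what such a comparison might look like. It is not the notebook's code: the function names, the tiny stop word list, and the `min_count` threshold are all assumptions made for illustration, and spelling correction is left out because it needs an external library. The `dm` value follows the convention used by gensim's `Doc2Vec` (`dm=0` for distributed bag of words, `dm=1` for distributed memory), assuming that is the library in use.

```python
import string
from collections import Counter
from itertools import product

# Tiny illustrative stop word list (an assumption, not the list used in the notebook).
STOP_WORDS = frozenset({"the", "a", "an", "is", "in", "of", "to", "and"})

# Translation table that deletes every ASCII punctuation character.
# Note that this also strips '#' and '@', so hashtags and mentions become plain words.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(texts, lowercase=False, remove_punct=False,
               remove_stop=False, remove_rare=False, min_count=2):
    """Tokenise each text on whitespace, applying only the enabled steps."""
    docs = []
    for text in texts:
        if lowercase:
            text = text.lower()
        if remove_punct:
            text = text.translate(_PUNCT_TABLE)
        tokens = text.split()
        if remove_stop:
            tokens = [t for t in tokens if t.lower() not in STOP_WORDS]
        docs.append(tokens)
    if remove_rare:
        # Drop tokens that occur fewer than min_count times across the whole corpus.
        counts = Counter(t for doc in docs for t in doc)
        docs = [[t for t in doc if counts[t] >= min_count] for doc in docs]
    return docs


def option_grid():
    """Yield every combination of the Doc2vec mode and the preprocessing switches."""
    switches = ("lowercase", "remove_punct", "remove_stop", "remove_rare")
    for dm in (0, 1):  # 0 = distributed bag of words, 1 = distributed memory
        for flags in product((False, True), repeat=len(switches)):
            yield {"dm": dm, **dict(zip(switches, flags))}


if __name__ == "__main__":
    tweets = ["Hurricane Irma is here!", "Irma, stay safe."]
    # The combination reported as best above: DBOW + punctuation removal only.
    print(preprocess(tweets, remove_punct=True))
```

Each configuration from `option_grid()` would then be used to preprocess the labelled tweets, train a Doc2vec model with the matching `dm` setting, and score a relevance classifier on the resulting vectors.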

audrism commented 5 years ago

@abhidya which datasets do you use to train on relevant/irrelevant tweets for Irma? Also, is the code link above the right one? @nwest13