Closed: corynissen closed this 11 years ago
Sorry, I was under the impression that the Mongo table I was looking at contained food poisoning tweets only:
https://github.com/smartchicago/foodborne/blob/master/lib/tasks/fetch.rake#L25
What property should we be looking at to filter out non-food-poisoning tweets?
Scott - yes, the table (fp_tweets_chi_01) holds only food poisoning tweets. However, it contains both the ones R says are valid and the ones R says are invalid.
To filter, check out the rClass field:
db.fp_tweets_chi_01.find({rClass:"food poisoning tweet\n"})
or
db.fp_tweets_chi_01.find({rClass:"not a food poisoning tweet\n"})
or
db.fp_tweets_chi_01.find({rClass:"don't know..\n"})
(Don't blame me for the status texts, they come from Cory)
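One thing worth noting about those queries: the stored labels end with a literal newline, so an exact match without the "\n" would return nothing. Below is a rough sketch in plain JavaScript (same language as the mongo shell) that tallies documents per label after trimming the newline. The sample documents and the countByLabel helper are made up for illustration; only the rClass field and its three values come from this thread.

```javascript
// Hypothetical sample documents shaped like rows in fp_tweets_chi_01.
// Only the rClass field and its values are taken from the thread.
const sampleTweets = [
  { text: "sample a", rClass: "food poisoning tweet\n" },
  { text: "sample b", rClass: "not a food poisoning tweet\n" },
  { text: "sample c", rClass: "don't know..\n" },
  { text: "sample d", rClass: "food poisoning tweet\n" },
];

// Count documents per label, trimming the trailing newline so the
// keys are easier to read. Documents without rClass land under "".
function countByLabel(tweets) {
  const counts = {};
  for (const tweet of tweets) {
    const label = (tweet.rClass || "").trim();
    counts[label] = (counts[label] || 0) + 1;
  }
  return counts;
}

console.log(countByLabel(sampleTweets));
```

Running something like this against a real export would be a quick way to see how many tweets fall into each of the three buckets.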
Shit, I should have noticed this earlier. I just figured the classifier sucked, I didn't know we weren't using it at all.
The classifier now filters out any tweets with an rClass of "not a food poisoning tweet\n".
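For anyone reading along, the rule described above amounts to something like the sketch below. This is not the actual code in the app, just an illustration in plain JavaScript; dropRejected, REJECTED_LABEL, and the sample documents are hypothetical names and data. Note that under this rule the "don't know.." tweets are kept, since only the explicit rejection is dropped.

```javascript
// Label value as stored in Mongo, including the trailing newline.
const REJECTED_LABEL = "not a food poisoning tweet\n";

// Keep every tweet except those explicitly rejected by the classifier.
// Tweets labeled "don't know..\n" (or with no rClass at all) pass through.
function dropRejected(tweets) {
  return tweets.filter((tweet) => tweet.rClass !== REJECTED_LABEL);
}

// Hypothetical documents for illustration only.
const incomingTweets = [
  { id: 1, rClass: "food poisoning tweet\n" },
  { id: 2, rClass: "not a food poisoning tweet\n" },
  { id: 3, rClass: "don't know..\n" },
];

console.log(dropRejected(incomingTweets).map((tweet) => tweet.id));
```

The same rule can be expressed directly in the mongo shell with the $ne operator: db.fp_tweets_chi_01.find({rClass: {$ne: "not a food poisoning tweet\n"}}). Like the sketch, $ne also matches documents that have no rClass field.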
I don't think it's a classifier or mongo issue. I worked with Joe to look at this tweet...
tweet.id_str:"334866148690632704" RT @ChurrosNRainbws: I have food poisoning &My dad took apples and caramel ( my favorite) and started rubbing it in my face saying "haha look what I got" -.- by user @Stephanaaayyy
It is labeled as "not a food poisoning tweet" in mongo. If I cut and paste that text into my classifier service, it is also labeled as "not a food poisoning tweet".