smartchicago / foodborne

Manages tweets about food poisoning in Chicago and presents Open311 form for reporting incidents.
http://foodborne.smartchicagoapps.org/

"not a food poisoning tweet" showing up in foodborne feed #32

Closed by corynissen 11 years ago

corynissen commented 11 years ago

I don't think it's a classifier or mongo issue. I worked with Joe to look at this tweet...

tweet.id_str:"334866148690632704" RT @ChurrosNRainbws: I have food poisoning &My dad took apples and caramel ( my favorite) and started rubbing it in my face saying "haha look what I got" -.- by user @Stephanaaayyy

It is labeled as "not a food poisoning tweet" in mongo. If I cut and paste that text into my classifier service, it is also labeled as "not a food poisoning tweet".

srobbin commented 11 years ago

Sorry, I was under the impression that the Mongo table I was looking at contained food poisoning tweets only:

https://github.com/smartchicago/foodborne/blob/master/lib/tasks/fetch.rake#L25

What property should we be looking at to filter out non-food-poisoning tweets?

jolson7168 commented 11 years ago

Scott - yes, the table (fp_tweets_chi_01) holds only food poisoning tweets. However, it includes both the ones R classifies as valid and the ones R classifies as invalid.

To filter, check out the rClass field:

db.fp_tweets_chi_01.find({rClass:"food poisoning tweet\n"})
or
db.fp_tweets_chi_01.find({rClass:"not a food poisoning tweet\n"})
or
db.fp_tweets_chi_01.find({rClass:"don't know..\n"})

(Don't blame me for the status texts, they come from Cory)

corynissen commented 11 years ago

Shit, I should have noticed this earlier. I just figured the classifier sucked, I didn't know we weren't using it at all.

srobbin commented 11 years ago

The classifier now filters out any tweets with an rClass of "not a food poisoning tweet\n".
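
For anyone following along, here is a rough sketch of what that kind of filter could look like. It is not the code from the repo. The helper names (normalizeLabel, dropNonFoodPoisoning) and the sample documents are made up for illustration. The only things taken from this thread are the rClass field and its label strings, including the trailing newline.

```javascript
// Label value as quoted in this thread; note the trailing newline.
const NOT_FOOD_POISONING = "not a food poisoning tweet\n";

// Hypothetical helper: trim whitespace so "label\n" and "label" compare equal.
// Documents without a string rClass normalize to "" and are therefore kept.
function normalizeLabel(rClass) {
  return typeof rClass === "string" ? rClass.trim() : "";
}

// Keep every tweet whose rClass is not the "not a food poisoning tweet" label.
function dropNonFoodPoisoning(tweets) {
  const rejected = normalizeLabel(NOT_FOOD_POISONING);
  return tweets.filter((tweet) => normalizeLabel(tweet.rClass) !== rejected);
}

// Made-up sample documents using the three labels listed above.
const sample = [
  { id_str: "1", rClass: "food poisoning tweet\n" },
  { id_str: "2", rClass: "not a food poisoning tweet\n" },
  { id_str: "3", rClass: "don't know..\n" },
];

const kept = dropNonFoodPoisoning(sample).map((t) => t.id_str);
console.log(kept); // prints [ '1', '3' ]
```

Note that this keeps the "don't know.." tweets, which matches the description above (only the "not a food poisoning tweet" label is dropped). Doing the same thing on the Mongo side would presumably be a $ne condition on rClass, but I have not checked that against the actual fetch task.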