smadha / SarcasmDetector

CSCI-544 Final Project
Apache License 2.0

Improving processing of words #16

Closed TheSidhesh closed 8 years ago

TheSidhesh commented 8 years ago

I tested the data on several short sentences taken from the corpus and realised that a few cases still need to be handled. We should discuss them when we meet next.

swanandj7 commented 8 years ago

Let's decide on the number of tweets ASAP. I think 5k sarcastic and a similar number of non-sarcastic would be good @smadha @TheSidhesh @RajviM

TheSidhesh commented 8 years ago

Sounds good to me!!

RajviM commented 8 years ago

Yep. Sounds right. We can ask prof on Monday and change it if necessary

On Thursday, April 14, 2016, Sidhesh Badrinarayan notifications@github.com wrote:

Sounds good to me!!


smadha commented 8 years ago

Don't we already have 10k sarcastic tweets? @TheSidhesh

TheSidhesh commented 8 years ago

There are a lot of repetitions in them, so effectively there would be far fewer distinct tweets.
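One rough way to check how many distinct tweets are left is to normalize each tweet and keep only the first occurrence of each normalized form. This is just a sketch: the `normalize` and `distinct_tweets` helpers and the sample tweets are made up for illustration, and the normalization rules (lowercasing, dropping a leading retweet marker, dropping URLs) are assumptions about what counts as a repetition, not what the repo actually does.

```python
import re

def normalize(tweet):
    """Collapse trivial differences so near-identical tweets compare equal."""
    text = tweet.lower()
    text = re.sub(r"^rt\s+@\w+:?\s*", "", text)  # drop a leading retweet marker
    text = re.sub(r"https?://\S+", "", text)     # drop URLs
    text = re.sub(r"\s+", " ", text)             # collapse runs of whitespace
    return text.strip()

def distinct_tweets(tweets):
    """Return the tweets whose normalized form has not been seen before."""
    seen = set()
    unique = []
    for tweet in tweets:
        key = normalize(tweet)
        if key and key not in seen:
            seen.add(key)
            unique.append(tweet)  # keep the original text, not the key
    return unique

# Hypothetical sample, not taken from the project corpus
sample = [
    "Oh great, another Monday #sarcasm",
    "RT @someone: Oh great, another Monday #sarcasm",
    "oh great,   another monday #sarcasm http://t.co/abc",
    "I just love waiting in line #sarcasm",
]
unique = distinct_tweets(sample)
print(len(sample), "tweets,", len(unique), "distinct")
```

Running the same count on the real 10k file would tell us whether we can still reach 5k distinct sarcastic tweets.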

smadha commented 8 years ago

Add stemming in preprocessing.
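A minimal sketch of what a stemming step could look like, assuming we only want inflected forms of a word to map to the same token. The `crude_stem` and `preprocess` names and the suffix list are hypothetical, and this is a naive suffix stripper, not the Porter algorithm; in practice a library stemmer such as NLTK's `PorterStemmer` would probably be the better choice.

```python
import re

# Suffixes are checked in this order, longest first (an illustrative list only)
SUFFIXES = ("edly", "ing", "ed", "ly", "es", "s")

def crude_stem(word):
    """Strip the first matching suffix, keeping a stem of at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(tweet):
    """Lowercase, tokenize on letters and apostrophes, then stem each token."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return [crude_stem(token) for token in tokens]

print(preprocess("Loving how he loved waiting and waits"))
```

With this, "loving" and "loved" both reduce to the same stem, so they count as one feature instead of two. It also over-stems short words (for example "this"), which is one reason to prefer a real stemmer.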