AsyrafAzlan / SentiLexM

Malay (and English) lexicon for solving sentiment analysis tasks.
MIT License

Telco Dataset used to develop SentiLexM #2

Open iannho opened 3 years ago

iannho commented 3 years ago

Hi Asyraf,

I could not find any other way to contact you, so I hope leaving a message here is okay. I'm currently working on a study on sentiment analysis, and Dr Tan is my co-supervisor. She shared your study with me and asked me to request from you the telco dataset that you performed sentiment analysis on. Please let me know if you need more information or would like to discuss this with me. Looking forward to hearing from you.

AsyrafAzlan commented 3 years ago

Hi @iannho ,

It's been some years, and I'm not sure this is the final version of the dataset I used back then, but it is the only related file I still have: https://www.dropbox.com/s/8mvnruhxbea0qgb/StreamTwitDB.csv?dl=0

Good luck with your research.

iannho commented 3 years ago

Hi Asyraf,

Amazing. Thank you very much!

iannho commented 3 years ago

Hey Asyraf,

Wanted to ask you one more question. How did you deal with pre-processing the corpus? What library did you use to stem/lemmatise?

AsyrafAzlan commented 3 years ago

Again, I don't have a concrete answer for this because it's been quite some time, but if I remember correctly, I used NLTK together with my own functions to stem/lemmatise.

There weren't many options at the time, though nowadays I would suggest looking at https://github.com/huseinzol05/malaya. He has already built quite a good Malay NLP toolkit.
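For anyone reading later, here is a rough idea of what "own functions" for tweet pre-processing and Malay stemming could look like. This is only a minimal sketch and not the code used for SentiLexM: the function names, the affix lists, and the minimum stem length are all hypothetical, and the rules cover only a small part of Malay morphology. It uses the Python standard library only.

```python
import re

# Hypothetical, deliberately small affix lists (assumption, not from SentiLexM).
# Longer prefixes come first so "meng" is tried before "me".
PREFIXES = ("meng", "mem", "men", "me", "ber", "ter", "di", "ke", "pe")
SUFFIXES = ("kan", "an")

# Assumed guard against over-stemming short words such as "lagi".
MIN_STEM_LEN = 4


def clean_tweet(text):
    """Lowercase a tweet, drop URLs, mentions and hashtags, return word tokens."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"[@#]\w+", " ", text)       # remove @mentions and #hashtags
    return re.findall(r"[a-z]+", text)         # keep alphabetic tokens only


def naive_stem(word):
    """Strip at most one prefix and one suffix if enough of the word remains."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[:-len(suffix)]
            break
    return word


def preprocess(text):
    """Clean a tweet and stem every token."""
    return [naive_stem(token) for token in clean_tweet(text)]


if __name__ == "__main__":
    print(preprocess("@telco Internet bermasalah lagi, tolong batalkan! http://t.co/x"))
```

A rule list this short will mis-stem many words (it ignores sound changes after "meng-"/"pem-", particles, and circumfixes), so a maintained toolkit is likely the better choice for real work.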

Also, since you have my contact already, I think it's much more convenient to exchange messages via email rather than opening an issue here.