languages - Githubissues

gsiolas commented 8 years ago

Is there a way for users to provide, or help you provide, new languages for the NLTK Mashape API? Thanks!

japerk commented 8 years ago

The best thing that could help would be to send me a link to a good training corpus. The API is generally based on the NLTK corpora, except for sentiment, which uses other movie review corpora. So if you know of a good training corpus for your language, let me know, and I'll see what I can do.

gsiolas commented 8 years ago

Hi Jacob, do you need training corpora specifically, or could something like https://github.com/MKLab-ITI/greek-sentiment-lexicon (direct link to the file: https://github.com/MKLab-ITI/greek-sentiment-lexicon/blob/master/greek_sentiment_lexicon.tsv) be of use? It is a lexicon that is already sentiment-rated. Giorgos

japerk commented 8 years ago

Thanks for that. Unfortunately, I don't have a setup yet for keyword-based sentiment analysis. What I need is something like the polarity dataset here: https://www.cs.cornell.edu/people/pabo/movie-review-data/. The ideal structure is a set of sentences or paragraphs, each classified as pos or neg.
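
For anyone preparing a corpus in that shape, here is a minimal sketch of how the "each text classified as pos or neg" structure might be stored and read back. It assumes one UTF-8 text file per sentence or paragraph, placed in `pos/` and `neg/` subdirectories; that layout, the `.txt` extension, and the function name `load_polarity_corpus` are illustrative assumptions, not something the API is stated to require.

```python
from pathlib import Path


def load_polarity_corpus(root):
    """Read a labeled corpus laid out as root/pos/*.txt and root/neg/*.txt.

    Hypothetical helper: the directory layout is an assumption made for
    illustration. Returns a list of (text, label) pairs, where label is
    "pos" or "neg".
    """
    docs = []
    for label in ("pos", "neg"):
        # Sort the paths so the result is deterministic across platforms.
        for path in sorted(Path(root, label).glob("*.txt")):
            text = path.read_text(encoding="utf-8").strip()
            if text:  # skip empty files
                docs.append((text, label))
    return docs
```

Calling `load_polarity_corpus("my_corpus")` on such a directory gives a list of `(text, label)` pairs, which is an easy way to check that both classes are present and roughly balanced before sharing a link to the corpus.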

japerk / nltk3-cookbook

languages #2