Download additional DATASETS AND TESTING RESOURCES mentioned in README

Deepankar-98 commented 2 years ago

Where can I download the additional DATASETS AND TESTING RESOURCES (items 4-12) mentioned in the README file? https://github.com/cjhutto/vaderSentiment#resources-and-dataset-descriptions

I tried to download the resources using nltk.download('name'), but it didn't work: the file names mentioned are not in the NLTK Corpora list (https://www.nltk.org/nltk_data/).

I am trying to download:

tweets_anonDataRatings.txt,
amazonReviewSnippets_anonDataRatings.txt, etc

Can someone help me with this?

cjhutto commented 2 years ago

Check out the "additional_resources" directory in this repo. The complete set of resources is compressed into the .tar.gz file for your convenience.
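As a rough illustration of unpacking that archive once it has been downloaded, here is a minimal Python sketch using only the standard library. The function name `extract_resources` and both path arguments are hypothetical; substitute the actual location of the .tar.gz file from the "additional_resources" directory.

```python
import tarfile
from pathlib import Path


def extract_resources(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return the extracted file paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        # Use the safer "data" extraction filter where this Python version supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest, **kwargs)
        # Collect regular files only (directories are skipped).
        names = [member.name for member in tar.getmembers() if member.isfile()]
    return sorted(dest / name for name in names)
```

Calling `extract_resources("path/to/archive.tar.gz", "vader_resources")` should leave the individual .txt files under the destination folder, where they can be opened like any other text file.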

Deepankar-98 commented 2 years ago

Thanks a lot for the info and the wonderful package.

Deepankar-98 commented 2 years ago

Hi @cjhutto,

I downloaded the additional datasets, but I am unable to figure out how to use them. I figured that I can select which file to load using this code:

from nltk.sentiment.vader import SentimentIntensityAnalyzer
sid_mod = SentimentIntensityAnalyzer(lexicon_file="vader_lexicon download path")

The content inside vader_lexicon.txt is of the form:

Whereas tweets_anonDataRatings.txt is:

And tweets_GroundTruth.txt is:

These two files appear to be just the dataset and the ratings from 20 people. I have two questions:

The mean valence differs between the two files. Can you please clarify why?
Is there any way I can use these files for sentiment analysis? If yes, how?

Your help is much appreciated.
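
One possible way to approach the second question, sketched under assumptions rather than confirmed by the maintainer: treat a ground-truth file as an evaluation set and measure how well an analyzer's scores track the human ratings. The sketch assumes the file is tab-delimited with an ID, a mean rating, and the text in the first three columns (check the README for the actual layout). The names `read_ground_truth`, `pearson`, and `agreement` are hypothetical helpers, and `score_fn` stands in for whatever scoring function is being tested.

```python
import csv
import math
from statistics import mean


def read_ground_truth(path):
    """Read (id, mean_rating, text) rows from a tab-delimited file.

    The three-column layout is an assumption; adjust the indices to match
    the format described in the README.
    """
    rows = []
    with open(path, encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for record in reader:
            if len(record) < 3:
                continue  # skip blank or malformed lines
            rows.append((record[0], float(record[1]), record[2]))
    return rows


def pearson(xs, ys):
    """Pearson correlation between two equal-length, non-constant sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def agreement(rows, score_fn):
    """Correlate the human mean ratings with the scores from score_fn(text)."""
    human = [rating for _, rating, _ in rows]
    predicted = [score_fn(text) for _, _, text in rows]
    return pearson(human, predicted)
```

In practice, `score_fn` could wrap an analyzer instance, for example returning `sid_mod.polarity_scores(text)["compound"]`. Correlation is used here because it does not depend on the two scores sharing the same numeric range.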

Download additional DATASETS AND TESTING RESOURCES mentioned in README #139