abhisheks008 / ML-Crate

ML-Crate stands as the ultimate hub for a multitude of exciting ML projects, serving as the go-to resource haven for passionate and dedicated ML enthusiasts!🌟💫 Devfolio URL, https://devfolio.co/projects/mlcrate-98f9
https://quine.sh/repo/abhisheks008-ML-Crate-409463050
MIT License
205 stars, 216 forks

Twitter sentiment analysis #495

Open Yuvika-14 opened 10 months ago

Yuvika-14 commented 10 months ago

ML-Crate Repository (Proposing new issue)

Project Title: Twitter sentiment analysis
Aim: Predict whether a comment on Twitter is positive or negative.
Dataset: https://www.kaggle.com/datasets/kazanova/sentiment140
Approach: Ensemble methods, gradient boosting, and neural networks. Handle any missing data; if there are categorical values, use a one-hot encoder or a label encoder, depending on the case.
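The missing-data and label-encoding steps could look roughly like the sketch below. It assumes the Sentiment140 CSV has no header row and six columns (target, id, date, flag, user, text), with target 0 for negative and 4 for positive; that layout should be checked against the Kaggle page. `load_rows` and the sample rows are hypothetical, for illustration only.

```python
import csv
import io

# Assumed Sentiment140 layout: target, id, date, flag, user, text (no header row).
# Assumed polarity codes: "0" = negative, "4" = positive.
LABELS = {"0": 0, "4": 1}

def load_rows(handle):
    """Yield (label, text) pairs, skipping rows with missing or unknown fields."""
    for row in csv.reader(handle):
        if len(row) != 6:
            continue  # malformed row
        target, text = row[0].strip(), row[5].strip()
        if target not in LABELS or not text:
            continue  # empty text or unexpected polarity code
        yield LABELS[target], text

# Hypothetical sample rows standing in for the real file.
sample = io.StringIO(
    '"0","1","Mon Apr 06","NO_QUERY","user_a","my phone broke again"\n'
    '"4","2","Mon Apr 06","NO_QUERY","user_b","loving this weather"\n'
    '"4","3","Mon Apr 06","NO_QUERY","user_c",""\n'
)
rows = list(load_rows(sample))
```

Here the third sample row is dropped because its text is empty. When reading the real file, the encoding may need to be set explicitly (for example `encoding="latin-1"`), which is also an assumption to verify.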


📍 Follow the Guidelines to Contribute in the Project :


🔴🟡 Points to Note :


✅ To be Mentioned while taking the issue :


Happy Contributing 🚀

All the best. Enjoy your open source journey ahead. 😎

hemant933 commented 10 months ago

Full name: Hemant Chaudhary
GitHub Profile Link: github.com/hemant933
Participant ID (If not, then put NA):
Approach for this Project: First, process the data to handle any missing values, then apply the specified models using vectorization and pipelines.
What is your participant role? (Mention the Open Source Program name. Eg. HRSoC, GSSoC, GSOC etc.): IWOC

So, please assign it to me.

abhisheks008 commented 10 months ago

Hi @Yuvika-14, this project is already present in this repo: https://github.com/abhisheks008/ML-Crate/tree/main/Sentimental%20Analysis%20of%20tweets

If you want to enhance this project, you can share your approach.

JagritiGautam793 commented 10 months ago

Full name: Jagriti Gautam
GitHub Profile Link: https://github.com/JagritiGautam793
Participant ID (If not, then put NA):
Approach for this Project: After processing the data, I will try real-time analysis of tweets using the Twitter API (Tweepy), then apply NLP and a pretrained transformer model, and finally visualize the results with donut charts and a word cloud.
What is your participant role? (Mention the Open Source Program name. Eg. HRSoC, GSSoC, GSOC etc.): IWOC

So, please assign it to me.

JagritiGautam793 commented 10 months ago

Can you assign this to me, @abhisheks008?

abhisheks008 commented 10 months ago

Please check the previous comments.

JagritiGautam793 commented 10 months ago

Sir, I will improve this model by using a RoBERTa model (from Hugging Face) rather than VADER. This transformer model accounts not only for the words but also for the context around them, and human language depends heavily on context. VADER is not very accurate at analyzing context: a comment that looks negative may actually be sarcastic, and RoBERTa is better at catching that. Sir, please assign this to me and guide me through it.

abhisheks008 commented 10 months ago

Cool, assigned to you @JagritiGautam793

abhisheks008 commented 9 months ago

Unassigned, as the open source event has ended.

JagritiGautam793 commented 9 months ago

Sir, I am almost done. Wasn't the event deadline the 15th?

abhisheks008 commented 9 months ago

No @JagritiGautam793, the IWOC 2024 deadline was Feb 11th, 2024, 23:59 hours.

abhisheks008 commented 9 months ago

This issue is no longer assigned to you: the program has already ended, so the assignment has been removed.

abhisheks008 commented 9 months ago

@JagritiGautam793

shivansh-2003 commented 6 months ago

Can you please assign this issue to me under SSOC 2024 Season 3?

Name: Shivansh Mahajan
GitHub: https://github.com/shivansh-2003
Participation ID: NA
Approach: I will first convert the file into a processed text file, then tokenize each element of the text. After that, I will use different encoding methods, such as CountVectorizer from sklearn, text preprocessing from TensorFlow, and word2vec from the gensim library, and then feed the encoded data to LSTM or RNN networks to perform the sentiment analysis.

I have recently done a few NLP projects on NER, sentiment analysis, and text classification, and I am well versed in the fundamentals of NLP. Check out my LinkedIn, https://www.linkedin.com/in/shivansh-mahajan-13227824a/, and my Git repository. Some of my recent NLP projects:
https://www.linkedin.com/feed/update/urn:li:activity:7199784822737682432/ (SPAM classifier)
https://www.linkedin.com/feed/update/urn:li:activity:7201206409605091328/ (NLP APP)

Can you assign this issue to me, @abhisheks008?
Participation Role: SSOC Season 3

abhisheks008 commented 6 months ago

Contributions will start from June 1, 2024. Till then please have some patience.

Tanishka023 commented 5 months ago

Full name: Tanishka Bhalla
GitHub Profile Link: https://github.com/Tanishka023
Participant ID (If not, then put NA): NA
Approach for this Project: Extract data from Twitter API -> Processing the data -> Feature Extraction -> NB/CNN/LR -> Model Training -> Evaluation
What is your participant role? SSoC Season 3
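For the NB option in the pipeline above, here is a minimal pure-Python sketch of multinomial Naive Bayes with add-one smoothing. The function names `train_nb` and `predict_nb` and the toy training data are hypothetical and not taken from the project; a real contribution would more likely rely on a library implementation.

```python
import math
from collections import Counter, defaultdict

def train_nb(samples):
    """Fit counts from a list of (label, tokens) pairs."""
    class_counts = Counter(label for label, _ in samples)
    word_counts = defaultdict(Counter)
    for label, tokens in samples:
        word_counts[label].update(tokens)
    vocab = {tok for _, tokens in samples for tok in tokens}
    return class_counts, word_counts, vocab

def predict_nb(model, tokens):
    """Return the label with the highest log posterior for the given tokens."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # ignore words never seen in training
                # add-one (Laplace) smoothed log likelihood
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy data: 1 = positive, 0 = negative.
train = [
    (1, ["love", "this", "phone"]),
    (1, ["great", "day"]),
    (0, ["hate", "this", "weather"]),
    (0, ["terrible", "day"]),
]
model = train_nb(train)
```

Calling `predict_nb(model, ["love", "great"])` on this toy model picks the positive class, since both words occur only in positive training examples.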

abhisheks008 commented 5 months ago

Can you implement 3-4 models for this project?

Tanishka023 commented 5 months ago

Yes! @abhisheks008

abhisheks008 commented 5 months ago

One issue at a time.

shivamkrishna1000 commented 5 months ago

Full Name: Shivam Krishna
GitHub Profile Link: https://github.com/shivamkrishna1000
Participant ID: NA
Approach for this Project: Handle missing values, if any -> use label encoding if needed -> use TfidfVectorizer for feature extraction -> train various models such as XGB/Random Forest Classification/LR/SVM/Gradient Boosting Classifier -> evaluate the models based on accuracy score.
What is your participant role? SSoC Season 3

Please assign me this issue.
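To make the feature-extraction and evaluation steps in the approach above concrete, here is a hand-rolled sketch of TF-IDF weighting and an accuracy score in plain Python. It is a stand-in for illustration, not the TfidfVectorizer implementation itself: the smoothed IDF formula and L2 normalization are assumptions chosen to resemble common defaults, and `fit_idf`, `tfidf`, and `accuracy` are hypothetical helper names.

```python
import math
from collections import Counter

def fit_idf(docs):
    """Compute a smoothed inverse document frequency for each term."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

def tfidf(doc, idf):
    """Return an L2-normalized TF-IDF vector as a {term: weight} dict."""
    counts = Counter(t for t in doc if t in idf)
    weights = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
    return {t: w / norm for t, w in weights.items()}

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)

# Toy tokenized documents.
docs = [["good", "day"], ["bad", "day"], ["good", "good", "movie"]]
idf = fit_idf(docs)
vec = tfidf(docs[2], idf)
```

In this toy corpus "movie" appears in fewer documents than "good", so it gets a higher IDF, while "good" still carries more weight in the third document because it occurs twice. When several models are compared on accuracy alone, it is worth checking that the classes are roughly balanced; otherwise precision, recall, or F1 may be more informative.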

why-aditi commented 5 months ago

Name: Aditi Kala
GitHub: https://github.com/why-aditi
Participation ID: NA
Approach:
Text cleaning: Remove unnecessary elements from the tweets, such as URLs, hashtags, punctuation, numbers, and special characters, and remove common words that do not contribute much to the sentiment.
Tokenization: Split the text into individual words or tokens.
Sentiment analysis: Use VADER to extract sentiment scores or labels (e.g., positive, negative, neutral).
Participation Role: SSOC Season 3
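The text-cleaning and tokenization steps described above could be sketched as follows, using only the standard library. The regular expressions, the tiny stopword list, and the `clean_and_tokenize` name are illustrative assumptions rather than the project's actual code.

```python
import re

# Tiny illustrative stopword list; a real project would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "it", "this", "at"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
HASHTAG_RE = re.compile(r"#\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")  # punctuation, digits, special characters

def clean_and_tokenize(tweet):
    """Lowercase a tweet, strip noise, and return the remaining word tokens."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)         # drop URLs
    text = HASHTAG_RE.sub(" ", text)     # drop hashtags
    text = NON_LETTER_RE.sub(" ", text)  # drop punctuation, numbers, symbols
    return [tok for tok in text.split() if tok not in STOPWORDS]

tokens = clean_and_tokenize(
    "Loving the new phone!!! 100% worth it :) #happy https://t.co/abc123"
)
```

For this sample, `tokens` is `["loving", "new", "phone", "worth"]`. One design consideration: VADER's rules draw on cues such as exclamation marks, capitalization, and emoticons, so cleaning this aggressive probably suits bag-of-words models better than VADER, which may do better on lightly cleaned text.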

abhisheks008 commented 5 months ago

Implement 5-6 models for this dataset.

Assigned @shivamkrishna1000

shivamkrishna1000 commented 5 months ago

Sure Sir