recodehive / Scrape-ML

For new data generation in the Semi-supervised-sequence-learning-Project, we have written a Python script to fetch 📊 data from the 💻 IMDb website 🌐 and convert it into txt files.
https://scrape-ml.streamlit.app/
MIT License
108 stars 136 forks

GSSOC '24: Python News Data Scraping #33

Closed. sujanrupu closed this issue 4 months ago

sujanrupu commented 6 months ago

Description: We propose the development of a Python-based news data analysis tool. This project aims to analyze news data from various sources and provide insights into trends, sentiments, and key topics.

Features:

Implementation:

github-actions[bot] commented 6 months ago

Thank you for raising an issue. We hope you are enjoying open source. We try to reply or assign as soon as possible. Connect with a mentor.

viththagi commented 6 months ago

Hi @sanjay-kv, I would like to work on this issue. My steps would be:

1. Web scraping: use libraries such as BeautifulSoup and Selenium.
2. Understand the website structure: inspect the HTML of the comments section.
3. Efficiency considerations: add delays between requests to avoid overloading the website's server, and handle pagination.
4. Sentiment analysis: use libraries like TextBlob or NLTK.
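
To make the steps above concrete, here is a rough sketch of how they might fit together. It is not code from this repository. To stay dependency-free it uses the standard library's `html.parser` in place of BeautifulSoup or Selenium, and a tiny word-list scorer in place of TextBlob or NLTK. The `<p class="comment">` markup, the `fetch_page` callable, the word lists, and all function names are assumptions made up for illustration.

```python
import time
from html.parser import HTMLParser


class CommentParser(HTMLParser):
    """Collects text inside <p class="comment"> elements (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self._in_comment = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; only an exact class match counts here.
        if tag == "p" and ("class", "comment") in attrs:
            self._in_comment = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_comment:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_comment:
            self.comments.append("".join(self._buffer).strip())
            self._in_comment = False


def parse_comments(html):
    """Return the list of comment strings found in one page of HTML."""
    parser = CommentParser()
    parser.feed(html)
    parser.close()
    return parser.comments


def scrape_comments(fetch_page, max_pages=5, delay_seconds=1.0):
    """Walk pages 1..max_pages, pausing between requests.

    fetch_page is a caller-supplied function (an assumption of this sketch)
    that takes a page number and returns that page's HTML as a string.
    Stops early at the first page that yields no comments.
    """
    collected = []
    for page in range(1, max_pages + 1):
        if page > 1:
            time.sleep(delay_seconds)  # delay so the server is not overloaded
        comments = parse_comments(fetch_page(page))
        if not comments:
            break
        collected.extend(comments)
    return collected


# Toy word lists standing in for a real sentiment library.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}


def sentiment_score(text):
    """Count positive words minus negative words; above 0 leans positive."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
```

In practice `fetch_page` would wrap an HTTP request to the target site, and `sentiment_score` would be replaced by a call into TextBlob or NLTK. Passing the fetcher in as a parameter keeps the pagination and delay logic testable without network access.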

github-actions[bot] commented 4 months ago

This issue has been automatically closed because it has been inactive for more than 30 days. If you believe this is still relevant, feel free to reopen it or create a new one. Thank you!