opencodeiiita / News_Scraping

News_Scraping

Please read instructions.txt carefully before attempting the tasks.

A good data scientist not only has extensive knowledge of machine learning and deep learning, but can also extract and gather data from various sources and store it in a usable format. This task introduces you to the first step of any data science project: data collection. One method of data collection is web scraping, which is what you will work on in this task.

Problem Statement

This project involves collecting data from various online sources. In the first part, you are asked to collect relevant news data on different stocks and financial news headlines. The second part is data cleaning and preprocessing: you are asked to present a clean and usable dataset.
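The two parts described above (collect headlines, then clean them) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the expected solution: the sample markup, the `headline` CSS class, and the helper names `HeadlineParser`, `scrape_headlines`, `clean_headlines` and `to_csv` are all hypothetical. It uses only the Python standard library so that it runs offline; in the notebook, libraries such as BeautifulSoup and pandas would be the more usual choice for the same steps.

```python
import csv
import io
import re
from html.parser import HTMLParser

# Hypothetical markup standing in for a downloaded news page (assumption).
SAMPLE_HTML = """
<ul>
  <li><a class="headline" href="/a">  Acme Corp shares   rise 5% </a></li>
  <li><a class="headline" href="/b">Acme Corp shares rise 5%</a></li>
  <li><a class="headline" href="/c"></a></li>
  <li><a class="other" href="/d">Subscribe now</a></li>
  <li><a class="headline" href="/e">Globex posts record &amp; profit</a></li>
</ul>
"""


class HeadlineParser(HTMLParser):
    """Collect the text of every <a class="headline"> element."""

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._buffer = None  # text chunks of the link currently being read

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "a" and "headline" in classes:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            self.headlines.append("".join(self._buffer))
            self._buffer = None


def scrape_headlines(html):
    """Part 1: extract raw headline strings from an HTML document."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return parser.headlines


def clean_headlines(raw):
    """Part 2: collapse whitespace, drop empty rows and duplicates."""
    cleaned, seen = [], set()
    for text in raw:
        text = re.sub(r"\s+", " ", text).strip()
        key = text.lower()  # compare case-insensitively
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    return cleaned


def to_csv(headlines):
    """Serialise the cleaned headlines as one-column CSV text."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["headline"])
    writer.writerows([h] for h in headlines)
    return out.getvalue()


if __name__ == "__main__":
    print(to_csv(clean_headlines(scrape_headlines(SAMPLE_HTML))))
```

For a real page, the HTML string would come from an HTTP request instead of `SAMPLE_HTML`, and the tag and class to match depend on the site being scraped.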

Instructions

Procedure

  1. Fork this repository and clone your fork to your local machine.
  2. Open the .ipynb file in Google Colab.
  3. Once you are done with the task, download the notebook as .ipynb and store it in a folder along with any required files.
  4. Name your file with your enrollment number.
  5. Push this file to your forked repo and then open a pull request (PR).
  6. Your code will be reviewed by the mentors. Points will be granted once the PR is accepted and merged.

Help

For any queries, feel free to contact iit2022008@iiita.ac.in. You can also interact with the mentors and the GeekHaven community on Discord.