Closed: Prachi-Jain01 closed this issue 1 year ago
@nikhil25803 Could you please assign this issue to me?
Please assign this to me.
Sure @Prachi-Jain01, go give it a try :))
Hi, I'm Babar Rasheed (GSSoC'23 contributor). Many websites don't offer an API, so web scraping is a way to access their data in a structured manner. Python libraries such as BeautifulSoup (bs4), Scrapy, and Selenium are commonly used for this. I'd like to apply these libraries and use multiprocessing to speed up the scraping. Multiprocessing helps most when many URLs have to be scraped: the URLs are handled in parallel by several worker processes instead of one after another, which saves time.
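To make the multiprocessing idea concrete, here is a minimal sketch, not a finished implementation. It uses only the standard library (`multiprocessing.Pool`, `urllib`, `html.parser`) so it runs without extra installs; in practice BeautifulSoup could replace the small parser. The names `extract_title`, `scrape_title`, and `scrape_all` are hypothetical and not part of scrape-up.

```python
from html.parser import HTMLParser
from multiprocessing import Pool
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collects the text found inside <title> tags."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the page title from an HTML string (hypothetical helper)."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def scrape_title(url):
    """Fetch one URL and return (url, title); runs inside a worker process."""
    try:
        with urlopen(url, timeout=10) as response:
            html = response.read().decode("utf-8", errors="replace")
        return url, extract_title(html)
    except (OSError, ValueError) as exc:
        # Report the failure instead of crashing the whole pool.
        return url, f"error: {exc}"


def scrape_all(urls, processes=4):
    """Scrape several URLs in parallel, one URL per task."""
    with Pool(processes=processes) as pool:
        return pool.map(scrape_title, urls)
```

To try it, call `scrape_all(["https://example.com", "https://example.org"])` from under an `if __name__ == "__main__":` guard, which `multiprocessing` needs on platforms that spawn fresh interpreter processes. Since scraping is mostly network-bound, a thread pool might work just as well; processes pay off more when the parsing itself is heavy.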
Proposed Method:
Create a wrapper over Reddit API to add support for scraping Reddit.
Directory:
scrape-up/src/scrape_up/reddit
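As a rough idea of what such a wrapper could look like, here is a sketch under stated assumptions. It assumes Reddit's public listing endpoint `https://www.reddit.com/r/<subreddit>/hot.json` and its `data.children[].data` JSON layout. The class name `Subreddit`, the method `hot_posts`, the helper `parse_listing`, and the User-Agent string are all hypothetical, and the final interface should follow whatever conventions scrape-up already uses.

```python
import json
from urllib.request import Request, urlopen


def parse_listing(payload):
    """Pull title, author and score out of a Reddit-style listing dict."""
    posts = []
    for child in payload.get("data", {}).get("children", []):
        post = child.get("data", {})
        posts.append(
            {
                "title": post.get("title"),
                "author": post.get("author"),
                "score": post.get("score"),
            }
        )
    return posts


class Subreddit:
    """Hypothetical wrapper around one subreddit's public JSON listing."""

    def __init__(self, name, user_agent="scrape-up-sketch/0.1"):
        self.name = name
        # Assumption: a descriptive User-Agent is sent with every request.
        self.user_agent = user_agent

    def hot_posts(self, limit=5):
        """Fetch the current hot posts and return them as a list of dicts."""
        url = f"https://www.reddit.com/r/{self.name}/hot.json?limit={limit}"
        request = Request(url, headers={"User-Agent": self.user_agent})
        with urlopen(request, timeout=10) as response:
            payload = json.load(response)
        return parse_listing(payload)
```

Keeping `parse_listing` separate from the network call means the parsing can be tested with a saved or hand-written payload, without hitting Reddit.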
I would like to work on this issue as a part of GSSoC'23. @nikhil25803 Could you please assign it to me?