Scraping support for Reddit [GSSoC'23]

Prachi-Jain01 commented 1 year ago

Proposed Method:

Create a wrapper over the Reddit API to add support for scraping Reddit.

Directory:

scrape-up/src/scrape_up/reddit

I would like to work on this issue as a part of GSSoC'23. @nikhil25803 Could you please assign it to me?
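As a rough illustration of the proposed wrapper, here is a minimal sketch. It assumes Reddit's public `.json` listing endpoint (`/r/<subreddit>/hot.json`) rather than the authenticated OAuth API, and the names `Subreddit`, `hot_posts`, and `fetch_json` are hypothetical; they are not taken from the scrape-up codebase.

```python
import json
from typing import Callable
from urllib.request import Request, urlopen


def fetch_json(url: str) -> dict:
    """Download a URL and decode its JSON body (standard library only)."""
    # Assumption: a descriptive User-Agent is sent, since Reddit tends to
    # throttle or reject anonymous default clients.
    request = Request(url, headers={"User-Agent": "scrape-up-sketch/0.1"})
    with urlopen(request, timeout=10) as response:
        return json.load(response)


class Subreddit:
    """Hypothetical wrapper; class and method names are illustrative only."""

    BASE_URL = "https://www.reddit.com"

    def __init__(self, name: str, fetch: Callable[[str], dict] = fetch_json):
        self.name = name
        # The fetch function is injectable so the class can be tested offline.
        self._fetch = fetch

    def hot_posts(self, limit: int = 5) -> list[dict]:
        """Return title, score, and link for the subreddit's hot posts."""
        url = f"{self.BASE_URL}/r/{self.name}/hot.json?limit={limit}"
        listing = self._fetch(url)
        posts = []
        # A listing nests each post under data -> children -> data.
        for child in listing["data"]["children"]:
            post = child["data"]
            posts.append(
                {
                    "title": post["title"],
                    "score": post["score"],
                    "link": self.BASE_URL + post["permalink"],
                }
            )
        return posts
```

Something like `Subreddit("python").hot_posts(limit=3)` would then return a list of small dictionaries. Passing a custom `fetch` keeps the parsing logic separate from the network call, which makes it easier to test or to swap in an authenticated client later.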

Prachi-Jain01 commented 1 year ago

@nikhil25803 Could you please assign this issue to me?

shubham725809 commented 1 year ago

please assign this to me

nikhil25803 commented 1 year ago

Sure @Prachi-Jain01, go give it a try :))

BabarRasheed commented 1 year ago

Hi, I'm Babar Rasheed (GSSoC'23 contributor). Many websites don't offer an API, so web scraping is a way to access their data in an easy, structured manner. Python libraries such as Beautiful Soup (bs4), Scrapy, and Selenium are commonly used for this. I'd like to apply these libraries and use multiprocessing to speed up the scraping. Multiprocessing helps most when many URLs have to be scraped, because several URLs can be fetched in parallel, which saves time.
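The multiprocessing idea in this comment could be sketched roughly as follows, using only the standard library. The function names `fetch_page` and `scrape_many`, the worker count, and the User-Agent string are assumptions made for illustration, not part of any existing scrape-up module.

```python
from concurrent.futures import Executor, ProcessPoolExecutor
from typing import Callable, Iterable
from urllib.request import Request, urlopen


def fetch_page(url: str) -> str:
    """Download one page and return its body as text."""
    request = Request(url, headers={"User-Agent": "scrape-up-sketch/0.1"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape_many(
    urls: Iterable[str],
    fetch: Callable[[str], str] = fetch_page,
    workers: int = 4,
    executor_cls: type[Executor] = ProcessPoolExecutor,
) -> dict[str, str]:
    """Fetch several URLs in parallel and map each URL to its result."""
    urls = list(urls)
    with executor_cls(max_workers=workers) as pool:
        # map() yields results in input order, so they line up with urls.
        return dict(zip(urls, pool.map(fetch, urls)))
```

With the default `ProcessPoolExecutor`, the `fetch` callable has to be a picklable top-level function, and on platforms that spawn worker processes the call to `scrape_many` should sit under an `if __name__ == "__main__":` guard. Since fetching pages is mostly I/O-bound, passing `ThreadPoolExecutor` as `executor_cls` is a lighter alternative that may perform just as well.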

Clueless-Community / scrape-up

Scraping support for Reddit [GSSoC'23] #173
