datatau-net / datatau-utils

DataTau utilities

Add functionality to KDNuggets scraper #1

Open pedro-munoz opened 4 years ago

pedro-munoz commented 4 years ago

Add more websites, or search more deeply on KDnuggets, to keep up the flow of one post per day.

There's a long list of potential sources here: https://www.kdnuggets.com/2019/01/active-blogs-ai-analytics-data-science.html

saswat01 commented 4 years ago

Can you explain a bit more about what you mean by searching deeply?

saswat01 commented 4 years ago

I will open a PR with an example. If it looks good, I can improve it.

pedro-munoz commented 4 years ago

We look for top posts on KDnuggets, but if they are already published on DataTau we don't re-post them. As a result, the scraper has sometimes gone days without posting anything, which is why we need some other sites.
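
The behaviour described above (skip anything already on DataTau, fall back to other sites when the first source has nothing new) could be sketched roughly as follows. This is only an illustration and is not taken from populate.py: `pick_next_post`, `normalize`, the two stand-in source functions, and the example URLs are all hypothetical, and a real source would scrape a site instead of returning a fixed list.

```python
from typing import Callable, Iterable, List, Optional, Set, Tuple

# Hypothetical types: a "source" is any callable that returns candidate
# (title, url) pairs, best first. None of this mirrors populate.py.
Post = Tuple[str, str]
Source = Callable[[], Iterable[Post]]


def normalize(url: str) -> str:
    """Crude normalization so links differing only by a trailing slash match."""
    return url.strip().rstrip("/")


def pick_next_post(sources: List[Source], already_posted: Set[str]) -> Optional[Post]:
    """Return the first candidate not yet posted, trying sources in order."""
    seen = {normalize(u) for u in already_posted}
    for fetch in sources:
        for title, url in fetch():
            if normalize(url) not in seen:
                return title, url
    return None  # every source is exhausted, so there is nothing new to post


# Stand-in sources with made-up data (assumption: real ones would scrape).
def kdnuggets_top() -> List[Post]:
    return [("Post A", "https://example.com/a/"), ("Post B", "https://example.com/b")]


def fallback_blog() -> List[Post]:
    return [("Post C", "https://example.org/c")]


# Both KDnuggets candidates are already posted, so the fallback is used.
posted = {"https://example.com/a", "https://example.com/b"}
choice = pick_next_post([kdnuggets_top, fallback_blog], posted)
print(choice)  # ('Post C', 'https://example.org/c')
```

Ordering the source list by preference keeps KDnuggets as the primary source, and the extra sites are consulted only on days when it yields nothing new.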

saswat01 commented 4 years ago

okay

saswat01 commented 4 years ago

Can you name some sites that you would like data scraped from?

pedro-munoz commented 4 years ago

https://www.kdnuggets.com/2019/01/active-blogs-ai-analytics-data-science.html

saswat01 commented 4 years ago

I sent you some example code in which I scraped part of it. If you want, I can scrape all of it. Please go through the code; I will attach an image below.

saswat01 commented 4 years ago

[image: gitSend]

pedro-munoz commented 4 years ago

Hi saswat, please have a look at the current scraper (populate.py) and try to improve from there. Thanks!