codelucas / newspaper

newspaper3k is a news, full-text, and article metadata extraction library for Python 3. Advanced docs:
https://goo.gl/VX41yK
MIT License
14.09k stars 2.11k forks

How to get the list of all websites that are available for scraping? #903

Open aleksandar-devedzic opened 3 years ago

aleksandar-devedzic commented 3 years ago

Is there a way to get a list of websites that can be crawled properly with the newspaper lib? For example, newspaper.sources or something like that?

tspier commented 3 years ago

Maybe one of the two files here? https://github.com/codelucas/newspaper/tree/master/newspaper/resources/misc
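If those files are plain text with one site per line (an assumption, I have not checked every file in that directory), reading one into a Python list could look roughly like the sketch below. The function name `parse_source_list` and the sample entries are made up for illustration and are not part of newspaper. As far as I know, the library's quickstart also documents `newspaper.popular_urls()`, which returns a similar list of popular source URLs, so that may be the closest thing to `newspaper.sources`.

```python
def parse_source_list(text, default_scheme="http://"):
    """Turn newline-separated site entries into a list of URLs.

    Hypothetical helper: assumes one site per line, possibly without
    a scheme, which is how the bundled list files appear to be laid out.
    """
    urls = []
    for line in text.splitlines():
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        if "://" not in entry:
            entry = default_scheme + entry  # bare domain, add a scheme
        urls.append(entry)
    return urls


# Made-up sample standing in for the contents of one of the list files.
sample = "example.com\n\nhttps://news.example.org\n  example.net  \n"
sources = parse_source_list(sample)
print(sources)
```

Note that such a list is probably best read as a set of suggested starting points. It is not a guarantee that every listed site parses cleanly, and sites missing from it may still work.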