AshwinSankar17 / NewsCluster

Scrape and cluster news based on the headlines
MIT License

Article.text returns empty string #8

Open AshwinSankar17 opened 4 years ago

AshwinSankar17 commented 4 years ago

This is an issue with the newspaper module. Link to issue here

PandaWhoCodes commented 4 years ago

Which websites are you having this issue with?

AshwinSankar17 commented 4 years ago

I use these root links: LINKS = ['https://timesofindia.indiatimes.com/', 'https://www.thehindu.com/', 'https://www.bbc.com/news', 'https://www.theguardian.co.uk/']

I use a prebuilt URL scraper to scrape the articles. Sometimes the title and text come back as placeholder content, like the rows below (index, title, text):

583,The Guardian,What term do you want to search? Search with google
551,Video + News,What term do you want to search? Search with google
170,Local News,Get the news that’s local to you
99,Times Now Live TV: Watch Coronavirus Cases in India Live News,"Sorry, this content is not available in your country"
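
One possible workaround, until the upstream problem is resolved, is to drop rows like these before clustering. The following is only a minimal sketch and not code from this repository: the names `PLACEHOLDER_PHRASES`, `is_usable`, and `filter_rows` are hypothetical, the phrase list is taken from the sample rows above, and the 20-word minimum is an arbitrary assumption that would need tuning.

```python
import csv
import io

# Hypothetical list of placeholder phrases, copied from the sample rows above.
PLACEHOLDER_PHRASES = (
    "what term do you want to search",
    "this content is not available in your country",
)


def is_usable(text, min_words=20):
    """Return True if the text looks like a real article body.

    Rejects empty strings, known placeholder phrases, and very short
    snippets (the min_words threshold is an assumed value).
    """
    normalized = " ".join(text.split()).lower()
    if not normalized:
        return False
    if any(phrase in normalized for phrase in PLACEHOLDER_PHRASES):
        return False
    return len(normalized.split()) >= min_words


def filter_rows(csv_text):
    """Keep only (index, title, text) rows whose text passes is_usable."""
    reader = csv.reader(io.StringIO(csv_text))
    return [row for row in reader if len(row) == 3 and is_usable(row[2])]
```

Calling `filter_rows` on the scraped CSV contents would return only the rows whose text column looks like an actual article. Short snippets such as "Get the news that’s local to you" are caught by the word-count check rather than by the phrase list.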