Open tspier opened 3 years ago
The article http://www.jumhuriyat.tj/index.php?art_id=44635 cannot be scraped with Newspaper3k. The reason is related to the structure of the HTML, which doesn't provide a clear block of article text to extract.
I'm getting the same errors on multiple sites
@giggioman00 What sites are giving you issues?
I am also facing the same issue. Is this repository still maintained?
@blueshirtdeveloper What sites are giving you issues?
Not sure how dead this thread is, but I'm getting the error on all articles from the following domains (whole article path for easy access):
I'm not sure if it's an issue with the HTML of the website, if there's an issue parsing Tajiki, or something else, but I tried scraping http://www.jumhuriyat.tj/index.php?art_id=44635 on the Heroku demo page and received the following notice: "Error converting html to string."
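For anyone trying to tell the two suspected causes apart (text encoding vs. page structure), here is a rough standard-library sketch. It is not how Newspaper3k works internally. The names `decode_html`, `ParagraphCollector`, and `has_clear_text_block`, the `cp1251` fallback encoding, and the 200-character threshold are all assumptions made for illustration.

```python
from html.parser import HTMLParser


def decode_html(raw, encodings=("utf-8", "cp1251")):
    """Try each candidate encoding in turn (the list is an assumption)."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: keep going, replacing undecodable bytes.
    return raw.decode("utf-8", errors="replace")


class ParagraphCollector(HTMLParser):
    """Collect the text found inside <p> elements (nesting is ignored)."""

    def __init__(self):
        super().__init__()
        self._in_p = False
        self._buffer = []
        self.paragraphs = []

    def _flush(self):
        text = "".join(self._buffer).strip()
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # an unclosed <p> ends where the next one starts
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buffer.append(data)


def has_clear_text_block(html, min_chars=200):
    """Heuristic: is there at least one <p> with min_chars or more characters?"""
    collector = ParagraphCollector()
    collector.feed(html)
    collector.close()
    collector._flush()  # pick up a trailing unclosed <p>
    longest = max((len(p) for p in collector.paragraphs), default=0)
    return longest >= min_chars
```

If `decode_html` returns readable text for the downloaded bytes but `has_clear_text_block` is false, the page layout is the more likely culprit. If decoding itself falls through to the replacement path, the encoding is worth a closer look.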