codelucas / newspaper

newspaper3k is a news, full-text, and article metadata extraction library in Python 3. Advanced docs:
https://goo.gl/VX41yK
MIT License
14.06k stars 2.11k forks

encoding error : input conversion failed due to input error, bytes 0x21 0x00 0x00 0x00 #760

Open ashwinsingh2007 opened 4 years ago

ashwinsingh2007 commented 4 years ago

I tried this link locally with newspaper3k: http://www.news.com.au/sport/cricket/big-bash/bbl-2019-perth-scorchers-vs-melbourne-renegades-at-optus-stadium/live-coverage/c76e315c694d39dd5c20ad75c5a136aa

My code:

from newspaper import Article

url = 'http://www.news.com.au/sport/cricket/big-bash/bbl-2019-perth-scorchers-vs-melbourne-renegades-at-optus-stadium/live-coverage/c76e315c694d39dd5c20ad75c5a136aa'

article_content = Article(url, keep_article_html=True)

article_content.download()  # this throws the error

article_content.parse()

However, I was previously running the same code in a Jupyter notebook and it worked fine. Then I moved the code to a Flask environment. Now article_content.download() fails on some URLs, like the one above, and works on others.
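One possible workaround, not confirmed anywhere in this thread, is to fetch the page yourself, clean the raw bytes, and hand the resulting text to newspaper instead of letting download() fetch the URL. The sketch below assumes the failure comes from undecodable or NUL bytes in the response (the reported bytes 0x21 0x00 0x00 0x00 hint at that, but it is a guess), and that the installed newspaper3k version accepts the input_html argument of Article.download(). The helper names sanitize_html_bytes, fetch_clean_html, and parse_article are hypothetical and exist only for illustration.

```python
import urllib.request


def sanitize_html_bytes(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode raw bytes leniently and drop NUL characters.

    Assumption: the parser chokes on undecodable or NUL bytes.
    """
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
    text = raw.decode(encoding, errors="replace")
    # NUL characters are not valid in HTML text, so remove them.
    return text.replace("\x00", "")


def fetch_clean_html(url: str, encoding: str = "utf-8", timeout: float = 10.0) -> str:
    """Download a page with the standard library and return cleaned text."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw = response.read()
    return sanitize_html_bytes(raw, encoding)


def parse_article(url: str):
    """Parse an article from pre-fetched HTML (hypothetical helper)."""
    # Imported lazily so the helpers above work without newspaper3k installed.
    from newspaper import Article

    html = fetch_clean_html(url)
    article = Article(url, keep_article_html=True)
    # Assumption: download() accepts input_html and then skips its own request.
    article.download(input_html=html)
    article.parse()
    return article
```

Calling parse_article(url) with the failing URL would show whether the problem lies in the download and decoding step or in parsing. If the error persists with cleaned input, the cause is likely elsewhere, for example in a difference between the Jupyter and Flask environments.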

blw1138 commented 2 years ago

I'm hitting this exact issue on a lot of articles

AlphaMoury commented 1 year ago

Any progress so far on this issue?