AndyTheFactory / newspaper4k

📰 Newspaper4k, a fork of the beloved Newspaper3k. Extraction of articles, titles, and metadata from news websites.
MIT License

encoding error : input conversion failed due to input error, bytes 0x21 0x00 0x00 0x00 #420

Closed. AndyTheFactory closed this issue 1 year ago

AndyTheFactory commented 1 year ago

Issue by ashwinsingh2007 Sun Dec 22 20:28:58 2019 Originally opened as https://github.com/codelucas/newspaper/issues/760


I tried this link locally with newspaper3k: http://www.news.com.au/sport/cricket/big-bash/bbl-2019-perth-scorchers-vs-melbourne-renegades-at-optus-stadium/live-coverage/c76e315c694d39dd5c20ad75c5a136aa

My code:

from newspaper import Article

article_content = Article('http://www.news.com.au/sport/cricket/big-bash/bbl-2019-perth-scorchers-vs-melbourne-renegades-at-optus-stadium/live-coverage/c76e315c694d39dd5c20ad75c5a136aa', keep_article_html=True)

article_content.download()  # this throws the error

article_content.parse()

However, the same code worked fine earlier in a Jupyter notebook. After I moved it to a Flask environment, article_content.download() raises this error for some URLs, such as the one above, and works for others.

AndyTheFactory commented 1 year ago

Comment by blw1138 Fri May 13 20:36:41 2022


I'm hitting this exact issue on a lot of articles

AndyTheFactory commented 1 year ago

Comment by AlphaMoury Mon Sep 4 19:45:42 2023


Any progress so far on this issue?

AndyTheFactory commented 1 year ago

Site is protected
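
The message in the issue title has the form of a libxml2 encoding error, and the reported bytes include NUL (0x00) bytes. What follows is a minimal sketch, not a confirmed fix: it assumes the failure comes from undecodable or NUL bytes in the response body, and it does nothing to get around the site's protection. `sanitize_html` is a hypothetical helper and not part of newspaper.

```python
def sanitize_html(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode raw response bytes and drop NUL characters.

    Hypothetical helper for illustration; not part of newspaper.
    """
    # Replace undecodable byte sequences instead of raising an exception.
    text = raw.decode(encoding, errors="replace")
    # NUL characters are not valid in HTML/XML text and can trip up parsers.
    return text.replace("\x00", "")


if __name__ == "__main__":
    sample = b"<p>Score!\x00\x00\x00</p>"
    print(sanitize_html(sample))
```

If you fetch the page yourself, the cleaned string could then be passed to article_content.download(input_html=...), which newspaper3k accepts instead of downloading the URL itself, followed by article_content.parse().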