codelucas / newspaper

newspaper3k is a news, full-text, and article metadata extraction library for Python 3. Advanced docs:
https://goo.gl/VX41yK
MIT License

cannot get related text #901

Open fuzsh opened 3 years ago

fuzsh commented 3 years ago

Describe the bug: When I fetch an article from a site like the one below, newspaper does not recognize the main body of the text.

http://www.jahannews.com/news/769201

To Reproduce

from newspaper import Article

url = "http://www.jahannews.com/news/769201"
article = Article(url, language='fa')  # 'fa' selects Persian
article.download()
article.parse()

print(article.text)

Log: returns "" (an empty string)

Versions:

johnbumgarner commented 3 years ago

I'm looking into this issue, but I have no affiliation with this project. The owner of this project won't respond to any inquiries about the project's status. So I'm unsure that I can solve your issue, because this project is on life support unless the owner opens the project up for others to contribute more freely.

jiangxinke commented 2 years ago

Yes, it seems that it cannot pick up some of the content on certain news websites, or I just haven't found a way to do it yet.
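
One possible workaround, when article.text comes back empty but the page itself was downloaded, is to pull paragraph text out of the raw HTML directly. The following is only a minimal sketch using the Python standard library, not part of newspaper3k. The names ParagraphExtractor, fallback_text, and min_chars are hypothetical, and it assumes the article body sits in plain <p> elements, which has not been verified for the site in this issue.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Hypothetical helper: collect the text of every <p> element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = []  # finished paragraphs, in document order
        self._buffer = []     # text chunks of the paragraph being read
        self._in_p = False    # True while inside a <p> element

    def flush(self):
        # Join the buffered chunks and normalise runs of whitespace.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.flush()  # a new <p> implicitly closes an unclosed one
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.flush()
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buffer.append(data)


def fallback_text(html, min_chars=40):
    """Return paragraphs of at least min_chars characters, joined by blank lines.

    min_chars is an assumed threshold meant to drop short menu or
    caption fragments; tune it per site.
    """
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()  # keep a trailing paragraph that was never closed
    return "\n\n".join(p for p in parser.paragraphs if len(p) >= min_chars)
```

After article.download(), the raw page is available as article.html, so the sketch could be tried as text = article.text or fallback_text(article.html). If article.html is itself empty, the problem lies in the download step rather than in the text extraction, and this fallback will not help.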