codelucas / newspaper

newspaper3k is a news, full-text, and article metadata extraction library for Python 3. Advanced docs:
https://goo.gl/VX41yK
MIT License
13.89k stars 2.1k forks

not working for gnews.org #968

Open Jooey233 opened 1 year ago

Jooey233 commented 1 year ago

I used article.text on https://gnews.org/articles/1068907 and got no text back. newspaper.build does not work for gnews.org either:

import newspaper

gnews = newspaper.build('https://gnews.org/', language='zh')

article = gnews.articles[0]
article.download()
article.parse()
print(article.text)

This raises an 'index out of range' error.
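The error is consistent with gnews.articles being empty at the point where articles[0] is indexed, though nothing in this report confirms that. A minimal sketch of a guard that makes this case visible instead of crashing; first_or_none is a hypothetical helper for illustration, not part of newspaper:

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def first_or_none(items: Sequence[T]) -> Optional[T]:
    """Return the first element, or None when the sequence is empty.

    Hypothetical helper, not part of newspaper.
    """
    return items[0] if items else None


# Intended use with the snippet above (needs newspaper3k and network access):
#   gnews = newspaper.build('https://gnews.org/', language='zh')
#   article = first_or_none(gnews.articles)
#   if article is None:
#       print('build() found no articles for this site')
```

If the guard reports no articles, the problem is in how build discovers article links for this site, not in download or parse.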

Loading a single article directly also fails:

import newspaper

page = newspaper.Article('https://gnews.org/articles/1065912')
page.download()
page.parse()
print(page.title)
print(page.text)

Only part of the title is extracted, and the text is not extracted at all.
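One way to narrow this down is to check how much visible text the downloaded HTML actually contains. This is only a diagnostic sketch under an assumption the report does not confirm: that the site may fill in the article body with JavaScript, in which case the raw HTML would hold very little text for any extractor to find. visible_text_length is a hypothetical helper built on the standard library; page.html is the raw HTML that newspaper stores on the Article after download().

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Counts characters of text that sit outside <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chars += len(data.strip())


def visible_text_length(html: str) -> int:
    """Return the number of non-script, non-style text characters in html."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    return parser.chars


# Intended use with the snippet above (needs newspaper3k and network access):
#   page.download()
#   print(visible_text_length(page.html))
```

A very small count would suggest the article body is not present in the HTML that newspaper receives; a large count would point toward the extraction step instead.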