Closed AndyTheFactory closed 1 year ago
Issue by Jooey233, Mon Apr 3 16:20:52 2023. Originally opened as https://github.com/codelucas/newspaper/issues/968
I used article.text on https://gnews.org/articles/1068907 and got no text back. Building a source for gnews does not work either:
    import newspaper

    gnews = newspaper.build('https://gnews.org/', language='zh')
    article = gnews.articles[0]
    article.download()
    article.parse()
    print(article.text)
This raises an 'index out of range' error.
And this:
    import newspaper

    page = newspaper.Article('https://gnews.org/articles/1065912')
    page.download()
    page.parse()
    print(page.title)
    print(page.text)
Only part of the title is captured, and no text is extracted at all.
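One possible reading of the 'index out of range' error is that build() found no articles for this site, so indexing gnews.articles with [0] fails on an empty list. The report does not say where the error is raised, so this is an assumption. A minimal sketch of a guard under that assumption, where first_or_none is a hypothetical helper and not part of newspaper:

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def first_or_none(items: Sequence[T]) -> Optional[T]:
    """Return the first element, or None when the sequence is empty."""
    return items[0] if items else None


# Hypothetical usage with the build() snippet above (needs network access):
#   gnews = newspaper.build('https://gnews.org/', language='zh')
#   article = first_or_none(gnews.articles)
#   if article is None:
#       print("build() found no articles for this site")
```

This only turns the crash into an explicit "nothing found" case; it does not make the site's articles discoverable.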
The site has Cloudflare protection. It returns status_code 200, so the response looks "ok" to the downloader...
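Since the status code alone cannot tell a real article from a protection page, one option is to inspect the downloaded HTML before parsing. The sketch below is a heuristic only: looks_like_challenge and CHALLENGE_MARKERS are hypothetical names, the marker strings are an assumed, incomplete list of text that has appeared on Cloudflare challenge pages, and relying on the cf-mitigated response header is likewise an assumption about how the site is configured.

```python
from typing import Mapping, Optional

# Assumed markers: strings that have been seen on Cloudflare challenge pages.
# This is a heuristic, not an official or complete list.
CHALLENGE_MARKERS = (
    "just a moment...",
    "/cdn-cgi/challenge-platform/",
    "cf-browser-verification",
    "checking your browser",
)


def looks_like_challenge(
    html: str, headers: Optional[Mapping[str, str]] = None
) -> bool:
    """Guess whether a 200 response is a challenge page rather than an article."""
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {k.lower(): v.lower() for k, v in (headers or {}).items()}
    if lowered.get("cf-mitigated") == "challenge":
        return True
    body = (html or "").lower()
    return any(marker in body for marker in CHALLENGE_MARKERS)
```

With the Article snippet from the report, the check could be run as looks_like_challenge(page.html) right after page.download(), skipping page.parse() when it returns True. Detecting the challenge does not bypass it; it only stops an empty result from being mistaken for a parsing bug.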