miso-belica / sumy

Module for automatic summarization of text documents and HTML pages.
https://miso-belica.github.io/sumy/
Apache License 2.0
3.51k stars 529 forks

Is it possible to get entire text from link? #163

Closed aryan1107 closed 2 years ago

aryan1107 commented 2 years ago

For example, in the newspaper Python library you can use article.text to get the entire text.

Is there a similar command in sumy?

miso-belica commented 2 years ago

Hi, sumy is not meant to be an article parser. Article extraction is just a minor feature that makes it easier to summarize articles on the web. The extraction algorithm is far from perfect and relies on breadability. For text extraction you should use a dedicated library such as newspaper, jusText, or one of the alternatives listed in its readme.

But if you really want to, you can get the full article text from the parser by joining its sentences: " ".join(map(str, parser.document.sentences))
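For anyone landing here later, a minimal sketch of how that one-liner might fit into a complete script. The helper names full_text and parser_from_url are hypothetical (they are not part of sumy), and the sketch assumes sumy is installed along with the NLTK tokenizer data it needs for English.

```python
def full_text(parser):
    """Join every sentence of a sumy parser's document into one string.

    Hypothetical helper: it only wraps the one-liner from the comment above.
    """
    return " ".join(map(str, parser.document.sentences))


def parser_from_url(url, language="english"):
    """Build a sumy HtmlParser for a URL (hypothetical convenience wrapper).

    The imports are local so that full_text() can be used without sumy.
    """
    from sumy.nlp.tokenizers import Tokenizer
    from sumy.parsers.html import HtmlParser

    return HtmlParser.from_url(url, Tokenizer(language))


if __name__ == "__main__":
    import sys

    # Usage: python extract.py <article-url>
    if len(sys.argv) > 1:
        print(full_text(parser_from_url(sys.argv[1])))
```

Keep in mind the result is only what breadability managed to extract and what the tokenizer split into sentences, so the original whitespace and paragraph breaks are not preserved.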