Closed: ijkilchenko closed this issue 8 years ago
Yes, there's a bias in the "conclusion" part of the article. You can see the bias here in this line: https://github.com/DataTeaser/textteaser/blob/master/textteaser/parser.py#L26
The citation and the rationale are also given in the code comment.
So the position of the sentences in the article is picked according to the hard-coded distribution in that function?
The distributions are hard-coded into the Parser class. Those values come from a research paper cited in the comments.
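For anyone following along, here is a minimal sketch of what a hard-coded position-score lookup like that might look like. The bucket boundaries and weights below are made-up placeholders for illustration only; the actual values live in `parser.py` at the line linked above and come from the paper cited there.

```python
def sentence_position_score(position, sentence_count):
    """Score a sentence by its normalized position in the article.

    `position` is 1-based. The (upper_bound, score) pairs below are
    hypothetical placeholder values, NOT the real TextTeaser numbers.
    """
    normalized = position / sentence_count
    buckets = [
        (0.1, 0.17),
        (0.2, 0.23),
        (0.3, 0.14),
        (0.4, 0.08),
        (0.5, 0.05),
        (0.6, 0.04),
        (0.7, 0.06),
        (0.8, 0.04),
        (0.9, 0.04),
        (1.0, 0.15),
    ]
    # Return the score of the first bucket the sentence falls into.
    for upper_bound, score in buckets:
        if normalized <= upper_bound:
            return score
    return 0.0
```

If summaries consistently skew toward the end of an article, that would suggest either the final bucket's weight is dominating the overall score, or the positions being fed into a function like this are computed incorrectly.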
If I use the Chrome extension, the sentences I get (with the number-of-sentences slider at its default) all seem to come from the end of the article.
Here are two examples:
url: http://www.lrb.co.uk/v38/n08/john-lanchester/when-bitcoin-grows-up summary:
url: https://en.wikipedia.org/wiki/Automatic_summarization#Current_challenges_in_evaluating_summaries_automatically summary:
In both cases, the summaries seem to be built from sentences near the end. Do you think this is caused by the Chrome extension rather than your code base, or the other way around?