Readability algorithm - Githubissues

Anonyfox / elixir-scrape

Scrape any website, article or RSS/Atom Feed with ease!

https://github.com/Anonyfox/elixir-scrape

GNU Lesser General Public License v3.0

327 stars, 43 forks

Readability algorithm #16

Closed (Anonyfox closed this 5 years ago)

Anonyfox commented 8 years ago

Currently, fetching the full text from an HTML article is more or less a dirty hack to get the keywords/tags going. https://github.com/keepcosmos/readability seems to be a straight port of arc90's readability algorithm and could deliver better results, though at a higher computational cost.
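For context, here is a rough sketch of the kind of paragraph scoring a readability-style extractor performs, and why it costs more than simply grabbing all the text on a page. It is not the code from keepcosmos/readability or from elixir-scrape. The names `ParagraphScorer` and `extract_main_text` are hypothetical, and the scoring constants (25-character minimum, one point per comma, up to three points for length, half credit to the grandparent) are assumptions loosely modeled on how arc90-style scoring is usually described. It is written in Python with only the standard library so it can run on its own.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
# Tags whose text content should never count as article text.
SKIP_TAGS = {"script", "style"}


class Node:
    """A container element that can accumulate a content score."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.score = 0.0
        self.paragraphs = []  # text of qualifying <p> elements directly inside


class ParagraphScorer(HTMLParser):
    """Hypothetical helper: scores containers by the paragraphs they hold."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root
        self.nodes = [self.root]
        self.buffer = None   # list of text chunks while inside a <p>
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag == "p" and self.buffer is None:
            self.buffer = []
        elif self.buffer is None:
            # Only track containers outside paragraphs; inline tags are ignored.
            node = Node(tag, self.current)
            self.nodes.append(node)
            self.current = node

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag == "p" and self.buffer is not None:
            text = " ".join("".join(self.buffer).split())
            self.buffer = None
            if len(text) >= 25:  # assumed minimum length for real prose
                points = 1 + text.count(",") + min(len(text) // 100, 3)
                self.current.score += points
                if self.current.parent is not None:
                    self.current.parent.score += points / 2
                self.current.paragraphs.append(text)
        elif (self.buffer is None and self.current.tag == tag
              and self.current.parent is not None):
            self.current = self.current.parent

    def handle_data(self, data):
        if self.buffer is not None and self.skip_depth == 0:
            self.buffer.append(data)


def extract_main_text(html):
    """Return the paragraphs of the highest-scoring container, or ''."""
    parser = ParagraphScorer()
    parser.feed(html)
    parser.close()
    best = max(parser.nodes, key=lambda node: node.score)
    return "\n\n".join(best.paragraphs)
```

Calling `extract_main_text(page_html)` on a page with a navigation block, an article body, and a footer should return only the article paragraphs, because the short navigation and footer labels fall under the length threshold. The extra cost comes from walking the whole document and scoring every candidate container. A real port also weighs class names and link density, and cleans up the winning node afterwards.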

Anonyfox commented 8 years ago

Waiting on https://github.com/keepcosmos/readability/issues/8 for now.

noma4i commented 8 years ago

The corresponding issue is resolved now.

Anonyfox commented 5 years ago

Added readability in v3.