Currently, fetching the full text from an HTML article is more or less a dirty hack to get the keywords/tags going. https://github.com/keepcosmos/readability seems to be a straight port of arc90's readability algorithm and could deliver better results, though at the cost of more computation.