damoeb / rss-proxy

RSS-proxy allows you to create an RSS or ATOM feed of almost any website, just by analyzing the static HTML structure.
https://rssproxy.migor.org

full feed? #9

Closed: somedevreally closed this issue 3 years ago

somedevreally commented 3 years ago

Great tool, thank you. Would it be easy to make a full feed available when retrieving the RSS?

For example: https://www.airforcetimes.com/news/

damoeb commented 3 years ago

I have started working on that. I am extracting the full text from the referenced link using the readability library (https://github.com/mozilla/readability), which works quite decently. Unfortunately, it is more complicated than that: resolving a whole feed will most likely exceed the request timeout, so it has to be implemented asynchronously.
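To illustrate the timeout problem, here is a minimal Python sketch (not the rss-proxy implementation). It resolves all links of a feed concurrently and gives up on whatever has not finished by a deadline. `fetch_fulltext`, `resolve_feed`, and the per-link delays are hypothetical stand-ins for "download the page and run a readability-style extractor".

```python
import asyncio
from typing import Dict, Optional


async def fetch_fulltext(url: str, delay: float) -> str:
    # Hypothetical stand-in: a real version would download the page and
    # run a readability-style extractor on it. Here we only simulate latency.
    await asyncio.sleep(delay)
    return f"full text of {url}"


async def resolve_feed(links: Dict[str, float], deadline: float) -> Dict[str, Optional[str]]:
    """Resolve every link concurrently; links slower than `deadline` stay None.

    `links` maps a URL to its simulated fetch time in seconds and must not be empty.
    """
    tasks = {asyncio.create_task(fetch_fulltext(url, delay)): url
             for url, delay in links.items()}
    done, pending = await asyncio.wait(tasks.keys(), timeout=deadline)

    # Anything still running has blown the request budget: cancel it.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    result: Dict[str, Optional[str]] = {url: None for url in links}
    for task in done:
        result[tasks[task]] = task.result()
    return result


if __name__ == "__main__":
    feed = {"https://example.org/fast": 0.01, "https://example.org/slow": 5.0}
    print(asyncio.run(resolve_feed(feed, deadline=0.2)))
```

In this sketch the slow article comes back as `None`, which is the situation a synchronous, per-request approach would run into for slow or large feeds.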

somedevreally commented 3 years ago

Great that you are looking into it, thanks. You might also want to check this one for full feeds: https://github.com/HenryQW/mercury_fulltext

Thanks again for this tool.

damoeb commented 3 years ago

I finished a POC, but unfortunately it breaks the concept of rss-proxy (rp). To deliver a good experience with rp, which currently keeps no internal state, every request would have to resolve all links into a full-article feed within the request timeout. This is very hacky and won't work for slow or large feeds. Moreover, throttling requests is not possible with this approach, so websites might start blocking rp.

A better approach is based on subscriptions, in which a subscription to a feed/website is resolved internally to a full feed. I will release this solution in a separate project called rich-RSS: https://github.com/damoeb/rich-rss
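A rough Python sketch of the subscription idea, under stated assumptions: it is not taken from rich-RSS, and `SubscriptionStore`, `refresh`, `read`, and `min_interval` are hypothetical names. The point is that keeping state lets a background job fetch full texts at a throttled pace, while readers get the stored result without triggering any network requests.

```python
import time
from typing import Callable, Dict, List, Optional


class SubscriptionStore:
    """Keeps resolved full texts per subscribed feed (hypothetical design)."""

    def __init__(self,
                 fetch_fulltext: Callable[[str], str],
                 min_interval: float = 1.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._fetch = fetch_fulltext        # assumed extractor: link -> full text
        self._min_interval = min_interval   # minimum seconds between outgoing requests
        self._clock = clock
        self._sleep = sleep
        self._last_request: Optional[float] = None
        self._articles: Dict[str, Dict[str, str]] = {}

    def subscribe(self, feed_url: str) -> None:
        self._articles.setdefault(feed_url, {})

    def refresh(self, feed_url: str, links: List[str]) -> None:
        # Meant to run in a background job, outside any request timeout.
        cache = self._articles[feed_url]
        for link in links:
            if link in cache:
                continue  # already resolved on an earlier run
            self._throttle()
            cache[link] = self._fetch(link)

    def read(self, feed_url: str) -> Dict[str, str]:
        # Served from stored state only, so it returns immediately.
        return dict(self._articles[feed_url])

    def _throttle(self) -> None:
        now = self._clock()
        if self._last_request is not None:
            wait = self._min_interval - (now - self._last_request)
            if wait > 0:
                self._sleep(wait)
                now = self._clock()
        self._last_request = now
```

A caller would `subscribe` once, call `refresh` periodically with the links found in the feed, and serve `read` to feed readers. The clock and sleep functions are injectable only so the throttling can be tested without real waiting.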

I will close the issue.

somedevreally commented 3 years ago

I can't open the demo site. Can rich-rss replace rss-proxy, or do we have to use both?

thanks

damoeb commented 3 years ago

It's not live yet; maybe I will find some time this weekend to do it. 'rss-proxy' will be used internally, along with 'rss-bridge', which performs better for some mainstream sites. You can use both, but if you want fulltext feeds you will have to switch to 'rich-RSS'. I will keep you posted.