Move parsing from client to server on cfw

einkoro commented 3 years ago

If parsing pages for feeds were moved from the injected client-side script to a server-side API, we wouldn’t need the full-access entitlement, which apparently scares users, judging by App Store reviews complaining about security risks.

This could likely be done on the AWS Lambda free tier. Additionally, Cloudflare Workers could be used to cache at the edge and reduce costs, or the entire thing could be handled at the edge by Cloudflare Workers. Using only Cloudflare Workers would give much better latency (no cold starts) and much more predictable costs.
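As a rough illustration of what the server side would have to do, here is a minimal sketch of feed autodiscovery over raw HTML. The names `discoverFeeds`, `DiscoveredFeed` and `getAttr` are hypothetical and not taken from the extension, and the regex-based tag scan is an assumption made to keep the example self-contained; a real Worker would more likely use a streaming HTML parser.

```typescript
// Hypothetical shape for a discovered feed; not taken from the extension's code.
interface DiscoveredFeed {
  url: string;
  type: string;
  title: string | null;
}

// Deliberately small list of MIME types used in autodiscovery <link> tags.
const FEED_TYPES = new Set([
  "application/rss+xml",
  "application/atom+xml",
  "application/feed+json",
]);

// Read one quoted attribute value from a raw tag string.
function getAttr(tag: string, name: string): string | null {
  const re = new RegExp(`\\s${name}\\s*=\\s*(?:"([^"]*)"|'([^']*)')`, "i");
  const m = re.exec(tag);
  if (m === null) return null;
  return m[1] ?? m[2] ?? null;
}

// Return every <link rel="alternate"> with a feed MIME type,
// resolved against the page URL and de-duplicated.
function discoverFeeds(html: string, pageUrl: string): DiscoveredFeed[] {
  const feeds: DiscoveredFeed[] = [];
  const seen = new Set<string>();
  const linkTag = /<link\b[^>]*>/gi;
  let m: RegExpExecArray | null;
  while ((m = linkTag.exec(html)) !== null) {
    const tag = m[0];
    const rel = (getAttr(tag, "rel") ?? "").toLowerCase().split(/\s+/);
    const type = (getAttr(tag, "type") ?? "").toLowerCase().trim();
    const href = getAttr(tag, "href");
    if (!rel.includes("alternate") || !FEED_TYPES.has(type) || !href) continue;
    let url: string;
    try {
      url = new URL(href, pageUrl).toString();
    } catch {
      continue; // skip hrefs that cannot be resolved
    }
    if (seen.has(url)) continue;
    seen.add(url);
    feeds.push({ url, type, title: getAttr(tag, "title") });
  }
  return feeds;
}
```

The function is pure, so the same code could run inside a Worker, a Lambda handler, or a test, with the fetch and edge-cache logic wrapped around it.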

Pros:

No more scary permissions warning
Possibility of integrating a service to generate feeds for pages that have none
Less memory / CPU overhead and battery usage, as we no longer have to poll for changes on the client

Cons:

Additional request per page load
Additional point of failure
Cost of API service
Privacy concern, as browsing history is essentially sent to a third-party server
Feeds for logged-in-only content wouldn’t be detectable (probably rare)

Machine learning might be viable to sniff out feeds linked on the page that don’t have alternates or for providing feeds from page content when no feeds are available.

einkoro commented 3 years ago

Also worth noting: this might be a problem for sites that modify the URL on load, like a lot of Medium sites with CNAMEs that set cookies and append parameters. A good example is, or at least was, https://blog.hunter.io/

einkoro commented 3 years ago

Also worth noting: many sites publish feeds but no longer bother to add the autodiscovery markup in the head; Apple, the BBC and CNN are common examples. This could be handled separately with a map of known feeds.

This would probably be best as a Cloudflare Worker API that’s called when no feeds are discovered. Or alternatively, call it for every page and compare the results with the discovered feeds?

Examples:

https://www.apple.com/ca/rss/
https://www.bbc.co.uk/news/10628494
https://www.cnn.com/services/rss/
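A known feeds map could be as simple as a lookup keyed by domain. The sketch below is only an illustration: the three entries are the example pages listed above (which are feed index pages, not feeds themselves), while the names `KNOWN_FEED_PAGES` and `lookupKnownFeedPages` and the suffix-matching rule are assumptions.

```typescript
// Entries come from the example URLs in this thread; the map shape is hypothetical.
const KNOWN_FEED_PAGES = new Map<string, string[]>([
  ["apple.com", ["https://www.apple.com/ca/rss/"]],
  ["bbc.co.uk", ["https://www.bbc.co.uk/news/10628494"]],
  ["cnn.com", ["https://www.cnn.com/services/rss/"]],
]);

// Look up feed index pages for a page URL, returning [] when nothing is known.
function lookupKnownFeedPages(pageUrl: string): string[] {
  let host: string;
  try {
    host = new URL(pageUrl).hostname.toLowerCase();
  } catch {
    return []; // not a valid absolute URL
  }
  // Try progressively shorter suffixes so www.bbc.co.uk matches bbc.co.uk.
  const labels = host.split(".");
  for (let i = 0; i < labels.length - 1; i++) {
    const hit = KNOWN_FEED_PAGES.get(labels.slice(i).join("."));
    if (hit !== undefined) return hit;
  }
  return [];
}
```

Called only when autodiscovery returns nothing, this keeps the extra lookup off the common path.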

Relevant services, APIs and projects:

https://feedsearch.dev/
https://developer.feedly.com/v3/search/
https://github.com/DBeath/feedsearch-crawler
https://github.com/ggkovacs/rss-finder

It might even be a better idea to spin off a new extension / web service for this than to bolt it onto the existing plugin, due to the cost of running such a service. A yearly or monthly subscription model would make more sense for such an extension.

einkoro commented 3 years ago

How would we address private feeds such as GitHub’s?

einkoro commented 3 years ago

Crawl links with text or href values containing rss, atom, rdf, json, or feed.
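A minimal sketch of that heuristic, assuming a regex scan over raw HTML is acceptable for a first pass. The names `FEED_HINT` and `findCandidateFeedLinks` are hypothetical, and the results are only candidates that would still need to be fetched and checked before being offered as feeds.

```typescript
// Keywords from the comment above, matched case-insensitively.
const FEED_HINT = /rss|atom|rdf|json|feed/i;

// Collect absolute http(s) URLs of anchors whose href or link text hints at a feed.
function findCandidateFeedLinks(html: string, pageUrl: string): string[] {
  const anchor = /<a\b[^>]*?\shref\s*=\s*(?:"([^"]*)"|'([^']*)')[^>]*>([\s\S]*?)<\/a>/gi;
  const found = new Set<string>();
  let m: RegExpExecArray | null;
  while ((m = anchor.exec(html)) !== null) {
    const href = m[1] ?? m[2] ?? "";
    if (!href) continue;
    // Strip nested tags so only the visible link text is tested.
    const text = (m[3] ?? "").replace(/<[^>]*>/g, " ");
    if (!FEED_HINT.test(href) && !FEED_HINT.test(text)) continue;
    try {
      const url = new URL(href, pageUrl);
      if (url.protocol === "http:" || url.protocol === "https:") {
        found.add(url.toString());
      }
    } catch {
      // ignore hrefs that cannot be resolved against the page URL
    }
  }
  return Array.from(found);
}
```

The keyword match is intentionally loose (for example, "json" will also match API links), so it trades false positives for coverage.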

bitpiston / rss-button-for-safari

Move parsing from client to server on cfw #51