nt1m / livemarks

Extension that restores RSS Feed Livemarks in Firefox.
https://addons.mozilla.org/firefox/addon/livemarks/
MIT License
229 stars, 23 forks

Consider supporting invalid URLs only starting with `www.` without a scheme #237

Open maxoku opened 4 years ago

maxoku commented 4 years ago

Feed URL: https://www.waterfox.net/feed.xml

Add-on version: 2.8

Describe the bug It creates a folder but doesn't fetch any items. The main item has a strange address: https://www.waterfox.net/www.waterfox.net/. I use Waterfox Current. In Waterfox Classic, which is based on an older Firefox, the built-in subscription works and the preview displays all items properly, so it might be an add-on bug.

To Reproduce Steps to reproduce the behavior:

  1. Go to 'https://www.waterfox.net/'
  2. Click on 'feed icon' at the bottom of the page
  3. Add Livemark
  4. Go to created folder

Expected behavior Live bookmark folder should be properly created.

nt1m commented 4 years ago

Thanks for reporting!

This feed is invalid according to the specification: URLs should either include the scheme ("https://" or "http://") or be real relative URLs (so "/blog/waterfox-2020.04-release/" with nothing else). Right now, the feed uses the "www.waterfox.net/blog/waterfox-2020.04-release/" format, which is neither...
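To illustrate the point above, here is a small sketch of how the standard WHATWG `URL` constructor resolves the three link formats against the feed URL from this report. It is only a demonstration of generic URL resolution, not Livemarks' actual parsing code; the variable names are made up for the example. The schemeless form is treated as a relative path, which is consistent with the doubled host seen in the bookmark address.

```typescript
// Base URL taken from the bug report.
const feedUrl = "https://www.waterfox.net/feed.xml";

// Schemeless "www." link: parsed as a relative path, so the host is doubled.
const schemeless = new URL("www.waterfox.net/blog/waterfox-2020.04-release/", feedUrl).href;

// Root-relative link: valid, resolved against the feed's origin.
const rootRelative = new URL("/blog/waterfox-2020.04-release/", feedUrl).href;

// Absolute link with a scheme: valid, the base is ignored.
const absolute = new URL("https://www.waterfox.net/blog/waterfox-2020.04-release/", feedUrl).href;

console.log(schemeless);
// https://www.waterfox.net/www.waterfox.net/blog/waterfox-2020.04-release/
console.log(rootRelative);
// https://www.waterfox.net/blog/waterfox-2020.04-release/
console.log(absolute);
// https://www.waterfox.net/blog/waterfox-2020.04-release/
```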

It works in some feed readers because they special-case this invalid URL format. That could be done here too, but I think this blog should really fix its feed.

I did notice Livemarks doesn't support relative URLs for Atom feeds, but that's a separate issue.

maxoku commented 4 years ago

That's interesting, you're right. I guess other feed readers just add "https://" automatically when it's missing. Would that be hard? Even if it's not automatic, because it might not be easy to detect, maybe it could be done manually, with a per-feed option to add the scheme?

nt1m commented 4 years ago

@maxoku I think it's tricky, because someone may use /blog/waterfox-2020.04-release/ (which is valid), in which case, you want to add the complete host: https://www.waterfox.net/ before it.

The main challenge is determining when to add https://www.waterfox.net/ and when to just add https://. It'd be easy to special-case www., but then it wouldn't work if someone just uses waterfox.net/blog/waterfox-2020.04-release/ (without www.). It might be possible to use a regex here, but I'm not sure.
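One conservative way to approach the question above, sketched here purely as an illustration: treat a schemeless link as absolute only when its first segment matches the feed's own host, ignoring a leading `www.`, and resolve everything else as a normal relative URL. `resolveFeedLink` is a hypothetical helper, not something that exists in Livemarks, and it reuses the feed's own scheme rather than always assuming `https://`. It would not catch schemeless links that point to a different host.

```typescript
// Hypothetical helper (not part of Livemarks): resolve a link found in a feed.
function resolveFeedLink(raw: string, feedUrl: string): string {
  const base = new URL(feedUrl);
  const link = raw.trim();

  // Already absolute, e.g. "https://example.org/post": parse it as-is.
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(link)) {
    return new URL(link).href;
  }

  // Compare hosts case-insensitively and without a leading "www.".
  const normalizeHost = (host: string): string =>
    host.toLowerCase().replace(/^www\./, "");

  // Text before the first "/", "?" or "#"; empty for links starting with "/".
  const firstSegment = link.split(/[/?#]/, 1)[0] ?? "";

  // Schemeless link whose first segment is the feed's own host:
  // prepend the feed's scheme instead of resolving it as a relative path.
  if (firstSegment !== "" && normalizeHost(firstSegment) === normalizeHost(base.hostname)) {
    return new URL(`${base.protocol}//${link}`).href;
  }

  // Everything else ("/blog/...", "blog/...", "//host/...") is resolved normally.
  return new URL(link, base).href;
}
```

With the feed from this report, both `www.waterfox.net/blog/waterfox-2020.04-release/` and `waterfox.net/blog/waterfox-2020.04-release/` would get a scheme prepended, while `/blog/waterfox-2020.04-release/` would still be resolved against the feed's origin.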

maxoku commented 4 years ago

I mean there could be an option to manually choose what to add per feed. I was thinking exactly the same thing: there are many ways a feed could be written. So users could specify that themselves rather than only letting it be determined automatically. If automatic detection is tricky and would take more time, then maybe release the manual option first and take your time with autodetection.

Btw, it's not related, but would it be possible to manually change http to https? I mean, if someone forgets to add https:// to the feed, they could just as well forget the s in https://, right?

nt1m commented 4 years ago

> I mean there could be an option to manually choose what to add per feed. I was thinking exactly the same thing: there are many ways a feed could be written. So users could specify that themselves rather than only letting it be determined automatically.

A manual setting wouldn't fix the issue for the feed preview page, because at that point the user hasn't set it yet. Not to mention, a single feed contains multiple URLs, which may use different URL formats.

Automatic detection would take as much time as adding a manual setting. I don't have much time this month unfortunately.