danmactough / node-feedparser

Robust RSS, Atom, and RDF feed parsing in Node.js
Other
1.97k stars 190 forks source link

Atom feed with relative URLs #237

Closed scripting closed 6 years ago

scripting commented 6 years ago

I'm not enough of an expert in Atom to know if relative URLs are permissible but here's one that has them.

http://xmlviewer.scripting.com/?url=http://talkingpointsmemo.com/account/feed/all/LpNewCzTqYYiHqtKqyTw5DaF1aUucgyyXb2dsqKU2yMr

(I wanted to run it through feedvalidator.org, but it appears to be down.)

Not sure what the right answer is. If you feel that TPM should change the feed I will bring the problem to them.

Dave

danmactough commented 6 years ago

Relative URLs are kosher (albeit annoying and foolish). Feedparser is supposed to turn relative URLs into absolute URLs. If it's not working on that feed, it's a bug.

scripting commented 6 years ago

Okay, it's a bug! ;-)

Unless something really weird is happening, which is always a possibility.

It's evident in the river on my blog --

http://scripting.com/river.html

At this moment, the second item in the river comes from TPM. When I click the link it takes me to a page on scripting.com, because (presumably) an uncorrected relative URL passing up from feedparser.

image

danmactough commented 6 years ago

@scripting I'm seeing correct canonical URLs when I parse that feed. What version of feedparser are you using? (Maybe there was a bug and it's now fixed.)

scripting commented 6 years ago

@danmactough --

It looks like the version is 2.2.1.

I did an npm update and now the version is 2.2.3.

So I was indeed running an old version.

I'll relaunch and see if items from that feed have correct URLs.

Sorry for not checking this first.

scripting commented 6 years ago

After installing the new version, we're still getting the bad links.

image

It's totally possible this is a bug in my software, that I'm looking at the wrong item in the object that's being returned, for example.

As far as I know it's only this feed. I can put a workaround in the River browser for it, and am willing to. So I'm going to close this issue for now.

danmactough commented 6 years ago

@scripting I was wrong. It wasn't resolving the correct canonical URL for the first item. 😱

danmactough commented 6 years ago

Fix published as v2.2.4