danmactough / node-feedparser

Robust RSS, Atom, and RDF feed parsing in Node.js

Relative links within HTML not resolved #275

Open mimecuvalo opened 5 years ago

mimecuvalo commented 5 years ago

FeedParser version: 2.2.9
Node version: 11.0.0
Link to feed exhibiting the issue: http://feeds.kottke.org/main

Search for any of the links in the HTML content of the Kottke feed and you'll see they're all relative. The xml:base is specified on the element that contains the HTML, but it's not being applied to the HTML inside it.

I'm migrating my codebase from Python to JS, so I'm used to Python's feedparser handling this. I've hacked around it for now with a regex in my codebase, but properly walking the XML structure/AST (however you're parsing it in your codebase) is the better solution :)
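
For anyone hitting the same thing, here is a rough sketch of the kind of regex stopgap described above. It is not part of feedparser's API: `resolveRelativeLinks` is a hypothetical helper, and the base URL is assumed to be supplied by the caller (for example the xml:base value, or the site URL). It only rewrites quoted `href` and `src` attributes and relies on the WHATWG `URL` class that is global in recent Node versions.

```javascript
// Hypothetical helper (not part of feedparser): resolves quoted href/src
// attribute values in an HTML string against a caller-supplied base URL.
function resolveRelativeLinks(html, base) {
  // Matches href="..." or src='...' and captures the attribute name,
  // the quote character, and the attribute value.
  const attrPattern = /\b(href|src)\s*=\s*(["'])(.*?)\2/gi;

  return html.replace(attrPattern, (match, attr, quote, value) => {
    try {
      // new URL(relative, base) performs standard URL resolution;
      // already-absolute URLs come back unchanged apart from normalization.
      const resolved = new URL(value, base).href;
      return `${attr}=${quote}${resolved}${quote}`;
    } catch (err) {
      // Leave values that cannot be parsed as URLs untouched.
      return match;
    }
  });
}

// Usage with an assumed base URL:
const fixed = resolveRelativeLinks(
  '<a href="/tag/books">books</a> <img src="images/cover.jpg">',
  'https://kottke.org/'
);
console.log(fixed);
```

This only papers over the problem: it does not handle unquoted attributes, `srcset`, or a nested xml:base that overrides an outer one, which is why resolving against xml:base while walking the parsed tree would still be the proper fix.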