Closed GoogleCodeExporter closed 9 years ago
No, the XML file does not contain any explicit base URI that feedparser can use
to resolve the relative links. I'm under the impression that it's poor form to
use relative URIs, but as this is Sam Ruby's site I'm surprised by this -- it
makes me think I'm missing something.
At any rate, my understanding is that the XML would need to include an explicit
base URI to guarantee that the downloaded file's relative URIs get resolved
correctly. The file doesn't appear to have a base URI set, and for that reason
feedparser isn't resolving the relative URIs.
Original comment by kurtmckee
on 10 Jul 2014 at 5:01
So if I understand your comment correctly, when feedparser fetches the Sam
Ruby feed it falls back to this case:
"...the URL used to retrieve the feed itself is the default base URI for all
relative links within the feed. If the feed was retrieved via an HTTP redirect
(any HTTP 3xx status code), then the final URL of the feed is the default base
URI."
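To make the quoted fallback concrete, here is a minimal sketch of that rule using only the standard library. It is not feedparser's own code: `default_base_uri` and `resolve_link` are hypothetical helper names, and the URLs are made up for illustration.

```python
from urllib.parse import urljoin


def default_base_uri(requested_url, final_url=None):
    """Return the fallback base URI described in the quoted passage.

    final_url is the URL reached after any HTTP 3xx redirects, if known;
    otherwise the URL used to retrieve the feed is the base.
    """
    return final_url or requested_url


def resolve_link(link, requested_url, final_url=None):
    """Resolve a possibly relative link against the fallback base URI."""
    return urljoin(default_base_uri(requested_url, final_url), link)


# Hypothetical URLs, for illustration only.
print(resolve_link("/blog/2014/07/10/post", "http://example.org/feed.atom"))
print(resolve_link("2014/07/10/post",
                   "http://example.org/feed.atom",
                   final_url="http://example.com/blog/index.atom"))
```

When a feed is parsed from an already-downloaded string, there is no retrieval URL to fall back on, which is why the relative links stay unresolved in that case.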
There are, admittedly, few blogs that insist on using relative links in posts.
My hope was to fake the Content-Location header by passing it to the parse
function via the response_headers (or request_headers?) parameter.
Basically I'm in a scenario where I have a bunch of feeds already downloaded
via the wonderful Requests package, which is far more robust when it comes to
fetching web resources.
I would like to know if the response_headers param is supposed to be used that
way.
Thanks.
Original comment by and...@passiomatic.com
on 10 Jul 2014 at 6:50
It's supposed to be possible to use the `response_headers` parameter to pass in
what the requests module returns. If that doesn't work as expected please open
a ticket! =)
Original comment by kurtmckee
on 10 Jul 2014 at 10:36
Original issue reported on code.google.com by
and...@passiomatic.com
on 11 Oct 2013 at 9:21