Closed synzen closed 5 years ago
Not sure what your asking, but if fetch returns encoded (gzip) you need to decode it before piping to feedparser. When I do this using a mikeal request, this feed parses just fine. Note, this feed is not encoded.
node-fetch
doesn't auto-decode the body unless prompted by the user (which is why it's lightweight and faster from my own tests - like the browser's fetch API) - instead, I let feedparser take on the body. node-fetch
works over 99.999% of the time (it works reliably, and is lighter than other libs!) until I encountered this issue, where feedparser failed to detect an invalid feed for a feed like this (had me scratching me head for days as to what was happening).
You can see the link above has invalid content-encoding headers when you open it in chrome and you see ERR_CONTENT_DECODING_FAILED
and the explanation for it: https://stackoverflow.com/questions/14039804/error-330-neterr-content-decoding-failed. TLDR; response says its gzipped, and it's actually not - browser is confused. I assume feedparser is confused here as well, but instead of giving an error, feedparser just hangs indefinitely
Not a node-fetch user, but parses fine as a mikeal request (feed header in this case indicates that is not encoded).
I get that request
works, as you've said before, but I'm saying the problem is that feedparser isn't emitting any errors here, as I mentioned in my last sentence
Yea, just catching up to subtlety of your problem.
I just set myself up to use node-fetch. When I run it with your feed and {compress:false} option, it works.
Yes it will work in that case because then their server will send an uncompressed body, with no gzip in content-encoding
(the main problem with their feed)
but the point here is that feedparser is not emitting an error here when I really think it should (if it's possible) - the issue is basically being swallowed up by the parser. I'm not trying to find a way to work with the feed, just trying to suggest that feedparser should emit an error so
I think Feedparser is just waiting for input. I think node-fetch maybe hung, not handling the decoding correctly in your offending feeds case. If that's the case Feedparser may not be the place for the fix.
@synzen This doesn't appear to be related to feedparser.
Make the following change to your example script:
- res.body.pipe(feedparser)
+ res.body.pipe(process.stdout)
You'll still see the pathological behavior -- no output, no errors. 🤷♂
Here I used node-fetch to with
Accept-Encoding: gzip,deflate
(the default) for a feed that sends a erroneously sends back a response withcontent-encoding: gzip
when the body is not actually compressed. Unfortunately it seems that feedparser doesn't detect such a problem here. node-fetch doesn't throw any errors for this unless you callres.text()
,res.json()
(see https://github.com/bitinn/node-fetch/issues/45), and instead I immediately pipe it to feedparser rather than letting node-fetch handle itIn my example below feedparser doesn't emit error and doesn't emit end.
feedparser v2.2.9 Node v12.7.0
Feed: https://eliteesp.es/category/galnet/feed/
There is no output when running this. Not sure if feedparser is supposed to handle this case though