danmactough / node-feedparser

Robust RSS, Atom, and RDF feed parsing in Node.js

Problem with parsing "image" property #178

Closed alabeduarte closed 7 years ago

alabeduarte commented 7 years ago

Hi,

I ran into an issue while fetching images from a feed: `item.image` comes back as an empty JavaScript object.

The issue can be reproduced with the code snippet below, using feedparser v1.1.4:

var FeedParser = require('feedparser')
  , request = require('request');

var req = request('https://jovemnerd.com.br/feed-nerdcast/')
  , feedparser = new FeedParser();

req.on('response', function (res) {
  var stream = this;

  if (res.statusCode !== 200) return this.emit('error', new Error('Bad status code'));

  stream.pipe(feedparser);
});

feedparser.on('readable', function() {  
  var stream = this
    , item;

  while (item = stream.read()) {
    // Expected: the item's image object. Actual: an empty object, {}.
    console.log(item.image);
  }
});

According to the docs, I would expect the result to be something like:

{ link: "...", url: "...", title: "..." }

tschellenbach commented 7 years ago

Similar result with this feed: https://dribbble.com/shots/popular.rss All the articles have images, but the library somehow ignores them.

tschellenbach commented 7 years ago

By the way, this is a pretty amazing library. Thank you for the hard work. We're going to create an open source demo app with this library and Sails, React/Redux and getstream.io.

danmactough commented 7 years ago

@alabeduarte Sorry for the long delay in replying. You happened to post your issue while I was on vacation and I never caught up, I guess. 😊

The feed you linked to doesn't appear to have images in the items, only a feed-level image, which is available on the meta object.

feedparser.on('readable', function() {  
  var stream = this
    , item;

  while (item = stream.read()) { 
    console.log(this.meta.image); // should log the image many, many times because that feed is AWESOME and includes every podcast from its inception!
  }
});
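
Since the image can live at either level, here is a minimal sketch of a fallback lookup. `pickImage` is a hypothetical helper, not part of feedparser, and it assumes that image objects carry a `url` property when they are populated:

```javascript
// Hypothetical helper (not part of feedparser): prefer an item-level image
// and fall back to the feed-level image exposed on `meta`.
// Assumption: a populated image object has a non-empty `url` property.
function pickImage(item, meta) {
  var itemImage = item && item.image;
  if (itemImage && itemImage.url) return itemImage;

  var feedImage = meta && meta.image;
  if (feedImage && feedImage.url) return feedImage;

  // Neither the item nor the feed carries a usable image.
  return null;
}

// Illustrative data only, shaped like the objects discussed in this thread.
var sampleMeta = { image: { url: 'https://example.com/feed.png', title: 'Feed' } };
var sampleItem = { image: {} };
console.log(pickImage(sampleItem, sampleMeta));
```

Inside the `readable` handler this could be called as `pickImage(item, this.meta)`.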

danmactough commented 7 years ago

@tschellenbach Thank you for your kind feedback (and waking up this thread)!

I see the same issue in your feed as I mentioned above: no image tag on the item, only at the feed level.

In your feed's case, the description is made up of HTML that happens to include an HTML <img> tag, but we don't parse the HTML. I hope that makes sense.
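
If you need that embedded image, one option is to pull it out of the description yourself. The sketch below is only a rough heuristic: `firstImageSrc` is a hypothetical helper, not something feedparser provides, and a regular expression is no substitute for a real HTML parser when the markup is messy.

```javascript
// Hypothetical helper (not part of feedparser): return the `src` of the first
// <img> tag found in an HTML string, or null if there is none.
// This is a regex heuristic, not a full HTML parser.
function firstImageSrc(html) {
  if (typeof html !== 'string') return null;

  // Match <img ... src="..."> with double-quoted, single-quoted or unquoted values.
  // Requiring whitespace before `src` skips attributes such as `data-src`.
  var pattern = /<img\b[^>]*?\ssrc\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/i;
  var match = pattern.exec(html);
  if (!match) return null;

  return match[1] || match[2] || match[3] || null;
}

// Illustrative description string only; not taken from the feed above.
var sampleDescription = '<p><img alt="shot" src="https://example.com/shot.png"></p>';
console.log(firstImageSrc(sampleDescription));
```

In the `readable` handler this could be applied as `firstImageSrc(item.description)`.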

alabeduarte commented 7 years ago

Thanks @danmactough