danmactough / node-feedparser

Robust RSS, Atom, and RDF feed parsing in Node.js

Cannot parse XML from an RSS URL that does not end in 'xml'? #164

Closed by yao23 8 years ago

yao23 commented 8 years ago

I tried

rss_url = 'https://itunes.apple.com/us/rss/customerreviews/id=1002815646/sortBy=mostRecent/xml';

It works, but parsing fails for a URL that doesn't include 'xml'. Any ideas or a workaround?

danmactough commented 8 years ago

@yao23 feedparser doesn't care what the url is -- it can parse whatever xml you feed into it. What is an actual url you tried that didn't work?

yao23 commented 8 years ago

The URL is private. The logging inside the req.on('response') handler does run:

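req.on('response', function (res) {
   var stream = this; // inside this handler, `this` is req, the stream being piped

   if (res.statusCode !== 200) return this.emit('error', new Error('Bad status code'));
   // res.toString() only prints "[object Object]", so log the status and content type instead
   console.log(res.statusCode, res.headers['content-type']);
   console.log('piping feedparser');
   stream.pipe(feedparser);
});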

but the feedparser.on('readable') handler below is never triggered:

feedparser.on('readable', function() { 
  console.log('printing readable streaming');
  // This is where the action is!
  var stream = this
    , meta = this.meta // **NOTE** the "meta" is always available in the context of the feed parser instance
    , item;

  while (item = stream.read())  {
    console.log(JSON.stringify(item));
  }
});

Any idea?

yao23 commented 8 years ago

I added more logging and found that feedparser complains that the URL is not a feed, even though the response has status code 200, which is logged before stream.pipe(feedparser).
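
For readers following along, here is a rough sketch of how a complaint like that can be surfaced. A feedparser instance is a stream and reports problems through 'error' events, so a listener has to be attached to see them. The helper name collectErrors and the stand-in emitter are assumptions made for illustration; neither is part of feedparser.

```javascript
var EventEmitter = require('events');

// Hypothetical helper (not part of feedparser): record the message of
// every 'error' event that a parser-like emitter reports.
function collectErrors(parser) {
  var errors = [];
  parser.on('error', function (err) {
    errors.push(err.message);
  });
  return errors;
}

// Stand-in for a feedparser instance, used here only so the sketch runs
// without the real library or a network request.
var fakeParser = new EventEmitter();
var seen = collectErrors(fakeParser);

// Simulate the parser rejecting a body that is not a feed.
fakeParser.emit('error', new Error('Not a feed'));
console.log(seen);
```

With the real library, the same listener would be attached to the feedparser instance before calling stream.pipe(feedparser).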

danmactough commented 8 years ago

Status code 200 doesn't tell you whether or not the response is XML, though. I understand the URL is private -- can you post the content? Maybe a pastebin or gist?
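
A minimal sketch of what checking that could look like, assuming you have the response's Content-Type header and the first part of the body as a string. The helper name looksLikeXml and its heuristics are assumptions for illustration only. It is not part of feedparser, and a body can pass both checks without being a valid feed.

```javascript
// Hypothetical helper (not part of feedparser): a rough guess at whether
// a response is XML, based on its Content-Type and the start of its body.
function looksLikeXml(contentType, bodyStart) {
  // Drop parameters such as "; charset=utf-8", then accept types like
  // text/xml, application/xml, application/rss+xml, application/atom+xml.
  var type = String(contentType || '').split(';')[0].trim();
  var typeSaysXml = /[\/+]xml$/i.test(type);

  // Strip a byte order mark and surrounding whitespace, then expect markup
  // that is not an HTML page (for example a login or error page).
  var text = String(bodyStart || '').replace(/^\uFEFF/, '').trim();
  var bodySaysXml = text.charAt(0) === '<' &&
    !/^<!doctype html|^<html/i.test(text);

  return { typeSaysXml: typeSaysXml, bodySaysXml: bodySaysXml };
}

console.log(looksLikeXml('application/rss+xml; charset=utf-8', '<?xml version="1.0"?><rss>'));
console.log(looksLikeXml('text/html', '<!DOCTYPE html><html>'));
```

If either flag is false for a response that returned 200, the server is probably sending something other than a feed, such as an HTML login page or a JSON error.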

yao23 commented 8 years ago

I ended up getting the data another way. Thanks a lot.