danmactough / node-feedparser

Robust RSS, Atom, and RDF feed parsing in Node.js

Doesn't parse URLs containing non-English characters, only plain English URLs #250

Closed heloscream closed 6 years ago

heloscream commented 6 years ago

For example, http://feeds.bbc.co.uk/hindi/index.xml works fine, but when I try a URL that contains special characters from another language (such as http://xyz-sites/category/se%C3%A7%C3%B5es), I get a 502 error.

let pullRequest = request(req.query.q, {timeout: 10000, pool: false}, function(error, response, body) {
  if (response) {
    status = response.statusCode; // Record the response status code if a response was received
    if (status === 200) {
      statusResponse = { message: 'request success' };
    }
  } else if (error) {
    status = 400;
    statusResponse = { message: 'bad Request 400' };
  }
});
pullRequest.setMaxListeners(50);
// Some feeds do not respond without user-agent and accept headers.
pullRequest.setHeader('user-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36');
pullRequest.setHeader('accept', 'text/html,application/xhtml+xml');

let feedparser = new FeedParser();
// Define our handlers
pullRequest.on('error', done);
pullRequest.on('response', function(res) {
  let encoding = res.headers['content-encoding'] || 'identity';
  let charset = getParams(res.headers['content-type'] || '').charset;
  res = maybeDecompress(res, encoding);
  res = maybeTranslate(res, charset);
  res.pipe(feedparser);
});

feedparser.on('error', done); // this line throws the 502 error

The URL that I'm passing is valid, e.g. http://xyz-sites/category/se%C3%A7%C3%B5es. I thought the problem was the URL encoding (the % signs mentioned above), but it was not: I also tried JavaScript's decodeURI, and decodeURI(req.query.q) gave http://xyz-sites/category/seções, but I still get the 502 error. I have tried everything and am getting nowhere. Any help would be appreciated. Thanks in advance.

danmactough commented 6 years ago

@heloscream If you're getting a 502 error, that means the remote server is experiencing a major error. That isn't related to feedparser.

heloscream commented 6 years ago

Thanks for replying, but if I paste this URL into a browser, it shows the RSS feed. Also, if I print err.stack, it says "Not a feed":

Error: Not a feed
    at FeedParser.handleEnd (/home/hemant/Desktop/feeds/node_modules/feedparser/lib/feedparser/index.js:119:13)
    at emitNone (events.js:106:13)
    at SAXStream.emit (events.js:208:7)
    at SAXParser.SAXStream._parser.onend (/home/hemant/Desktop/feeds/node_modules/sax/lib/sax.js:190:10)
    at emit (/home/hemant/Desktop/feeds/node_modules/sax/lib/sax.js:624:35)
    at end (/home/hemant/Desktop/feeds/node_modules/sax/lib/sax.js:667:5)
    at SAXParser.end (/home/hemant/Desktop/feeds/node_modules/sax/lib/sax.js:154:24)
    at SAXStream.end (/home/hemant/Desktop/feeds/node_modules/sax/lib/sax.js:248:18)
    at FeedParser._flush (/home/hemant/Desktop/feeds/node_modules/feedparser/lib/feedparser/index.js:1089:17)
    at FeedParser. (/home/hemant/Desktop/feeds/node_modules/readable-stream/lib/_stream_transform.js:138:49)

danmactough commented 6 years ago

> thank for replying, but if I paste this URL into a browser, it shows RSS

If you don't share the URL, there's really nothing more I can do to help.

> and also if I print err .stack, it will say no a Feed

That's correct: The remote server isn't returning a feed, it's returning a 502 error page.
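One way to make this failure mode easier to spot is to check the HTTP status before handing the response to the parser, so that an error page is reported as a bad status rather than as "Not a feed". The following is only a rough sketch: `pipeIfOk` is a hypothetical helper, not part of feedparser or request, and the `PassThrough` streams merely stand in for a real response and a real FeedParser instance.

```javascript
const { PassThrough } = require('stream');

// Hypothetical helper (not a feedparser or request API): pipe the response
// into the parser only when the server answered 200; otherwise report the
// status through onError and discard the body.
function pipeIfOk(res, parser, onError) {
  if (res.statusCode !== 200) {
    res.resume(); // drain the error page so the stream can finish
    onError(new Error('Bad status code: ' + res.statusCode));
    return false;
  }
  res.pipe(parser);
  return true;
}

// Stand-ins for illustration only: a fake 502 response and a fake parser.
const fakeRes = new PassThrough();
fakeRes.statusCode = 502;
const fakeParser = new PassThrough();

let reported = null;
const piped = pipeIfOk(fakeRes, fakeParser, function (err) { reported = err; });
console.log(piped, reported.message);
```

In the handler from the original post, this check would sit where res.pipe(feedparser) is called, with done as the error callback.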

heloscream commented 6 years ago

Sorry, the mistake was on my end: I was fetching the queried URL the wrong way. For example:

pullRequest = request(encodeURI(req.query.q), {timeout: 10000, pool: false}, function(error, response, body) {
  // I wasn't using the encodeURI() method before for this kind of URL:
  // http://xyz-sites/category/se%C3%A7%C3%B5es
  if (response) {
    status = response.statusCode; // Record the response status code if a response was received
    if (status === 200) {
      statusResponse = { message: 'request success' };
    } else if (status === 404) {
      statusResponse = { message: 'request page not found !!!' };
    } else {
      statusResponse = { message: 'Internal server request error' };
    }
  } else if (error) {
    status = 400;
    statusResponse = { message: 'bad Request 400' };
  }
});
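To illustrate why encodeURI() helps here, the sketch below assumes that req.query.q arrives already percent-decoded (as Express's query parser normally does), so the path contains raw non-ASCII characters. It also shows a caveat: calling encodeURI() on a string that is already encoded escapes the % signs a second time. `normalizeFeedUrl` is a hypothetical helper name; it uses Node's built-in WHATWG `URL` class as one possible alternative that handles both forms, not something the thread itself used.

```javascript
// Assumed input: the decoded form of the URL from the thread.
const decoded = 'http://xyz-sites/category/seções';

// encodeURI percent-encodes the non-ASCII characters as UTF-8 bytes.
const encoded = encodeURI(decoded);
console.log(encoded); // http://xyz-sites/category/se%C3%A7%C3%B5es

// Caveat: encodeURI also escapes '%', so an already-encoded URL gets
// double-encoded and would point at a different path.
const doubleEncoded = encodeURI(encoded);
console.log(doubleEncoded); // http://xyz-sites/category/se%25C3%25A7%25C3%25B5es

// Hypothetical helper: the WHATWG URL parser encodes raw non-ASCII path
// characters but leaves existing percent-escapes alone.
function normalizeFeedUrl(input) {
  return new URL(input).href;
}

console.log(normalizeFeedUrl(decoded) === normalizeFeedUrl(encoded)); // true
```

Either approach should give the remote server the same encoded path that a browser would send.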

Now everything works smoothly. Thanks a lot for your support.