matthewmueller / x-ray

The next web scraper. See through the <html> noise.
MIT License
5.88k stars 350 forks

Unhandled error bug and non-html #103

Closed kevindeasis closed 5 years ago

kevindeasis commented 9 years ago

Requesting either of these URLs throws errors that are not caught, and they crash Node.

http://cdn.sstatic.net/security/img/apple-touch-icon@2.png?v=497726d850f9&a?v=497726d850f9&a
https://www.csie.ntu.edu.tw/~kmchao/bcc03fall/chap08.ppt

I'm guessing the errors will happen with any non-HTML link and with any link whose response has no HTML body. What's a good way to fix or handle these errors?

lathropd commented 5 years ago

I’m sorry nobody ever got back to you. Yes, you’d need a selector that avoids requesting a non-HTML page.
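
If a selector alone can't rule those links out, another option is to filter URLs before handing them to the scraper. The following is only a rough sketch of that idea, not something from x-ray itself: `looksLikeHtmlUrl`, `isHtmlContentType`, and the `NON_HTML_EXTENSIONS` list are hypothetical names invented for illustration, and the extension list is an assumption that would need extending for real crawls.

```javascript
// Hypothetical helpers for illustration; none of this is part of x-ray's API.

// Assumed list of extensions that usually indicate a non-HTML resource.
const NON_HTML_EXTENSIONS = new Set([
  '.png', '.jpg', '.jpeg', '.gif', '.pdf', '.ppt', '.pptx', '.zip',
]);

// Returns false when the URL cannot be parsed or its path ends in a
// known non-HTML extension. The query string is ignored.
function looksLikeHtmlUrl(rawUrl) {
  let pathname;
  try {
    pathname = new URL(rawUrl).pathname.toLowerCase();
  } catch (err) {
    return false; // not a parseable absolute URL
  }
  const lastSegment = pathname.slice(pathname.lastIndexOf('/') + 1);
  const dot = lastSegment.lastIndexOf('.');
  if (dot === -1) return true; // no extension, so assume it may be a page
  return !NON_HTML_EXTENSIONS.has(lastSegment.slice(dot));
}

// Checks a Content-Type header value (for example from a HEAD request)
// and accepts only HTML or XHTML media types.
function isHtmlContentType(contentType) {
  if (typeof contentType !== 'string') return false;
  const mediaType = contentType.split(';')[0].trim().toLowerCase();
  return mediaType === 'text/html' || mediaType === 'application/xhtml+xml';
}
```

Both URLs from the original report would be rejected by `looksLikeHtmlUrl`, since their paths end in `.png` and `.ppt`. An extension check is only a heuristic, so for URLs without a telling extension you could additionally inspect the response's `Content-Type` header with `isHtmlContentType` before scraping.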