laurengarcia / url-metadata

NPM module: Request a url and scrape the metadata from its HTML using Node.js or the browser.
https://www.npmjs.com/package/url-metadata
MIT License
166 stars, 44 forks

Certain URLs cause Maximum call stack size exceeded #36

Closed mauronr closed 1 year ago

mauronr commented 3 years ago

Filing a new issue as I can't reopen #6

Some URLs cause a RangeError: Maximum call stack size exceeded. An example URL is https://lnkd.in/gVeYnv7

The error message is below (the stack trace is truncated, showing only the snippet below):

/<redacted>/node_modules/domutils/lib/querying.js:83
function findAll(test, elems){
                ^

RangeError: Maximum call stack size exceeded
    at findAll (/<redacted>/node_modules/domutils/lib/querying.js:83:17)
    at findAll (/<redacted>/node_modules/domutils/lib/querying.js:90:27)
    at findAll (/<redacted>/node_modules/domutils/lib/querying.js:90:27)
    at findAll (/<redacted>/node_modules/domutils/lib/querying.js:90:27)
    at findAll (/<redacted>/node_modules/domutils/lib/querying.js:90:27)
    at findAll (/<redacted>/node_modules/domutils/lib/querying.js:90:27)
    at findAll (/<redacted>/node_modules/domutils/lib/querying.js:90:27)
    at findAll (/<redacted>/node_modules/domutils/lib/querying.js:90:27)
    at findAll (/<redacted>/node_modules/domutils/lib/querying.js:90:27)
    at findAll (/<redacted>/node_modules/domutils/lib/querying.js:90:27)

chienwen commented 3 years ago

This is because the URL does not point to a text file. To avoid that, I changed the code a bit so that it processes the response only when the content-type response header matches /^text\//, for example text/html.

https://github.com/chienwen/url-metadata/commit/f3df2087d8343db88b70266f04cff26e5e40d38f
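
For readers who want to see the idea in isolation, here is a minimal sketch of that kind of content-type check. It is not the code from the commit above: the helper names `isTextContentType` and `looksLikeText` are hypothetical, the case-insensitive flag is an addition to the pattern quoted above, and the pre-check assumes the global `fetch` that ships with Node.js 18+.

```javascript
// Hypothetical helper, not part of url-metadata: returns true only when the
// Content-Type header value starts with "text/" (for example "text/html").
function isTextContentType(contentType) {
  if (typeof contentType !== 'string') {
    return false; // missing header: treat as non-text
  }
  // Same pattern as quoted in the comment, plus the "i" flag because media
  // types are case-insensitive.
  return /^text\//i.test(contentType.trim());
}

// Hypothetical pre-check: ask only for the headers and inspect Content-Type
// before handing the URL to a scraper. Assumes the global fetch in Node.js 18+.
async function looksLikeText(url) {
  const response = await fetch(url, { method: 'HEAD', redirect: 'follow' });
  return isTextContentType(response.headers.get('content-type'));
}
```

A caller could skip scraping whenever `looksLikeText(url)` resolves to `false`. Note that this pattern also rejects types such as `application/xhtml+xml`, so it may be stricter than some use cases want, and some servers may not answer `HEAD` requests the same way they answer `GET`.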

Trunksome commented 3 years ago

I just experienced the same problem. Although I have wrapped the call to "urlMetadata" in a try-catch, it shut down the whole Node.js app. Is there anything I can do today to prevent this from happening?

mauronr commented 3 years ago

This is because the URL does not point to a text file. To avoid that, I changed the code a bit so that it processes the response only when the content-type response header matches /^text\//, for example text/html.

chienwen@f3df208

@chienwen Do you plan to submit the fix from your forked repo back in here?

chienwen commented 3 years ago

@mauronr yes, this fix is included in PR #37

mauronr commented 3 years ago

Great! I'm sorry I missed checking the PRs before asking. And thanks for looking into it.

bieblebrox commented 3 years ago

Any update on when this PR will be merged? I'm experiencing the same issue, and it's kind of blocking.

laurengarcia commented 1 year ago

Custom headers are now available as an option in the new version 3.0.1: https://www.npmjs.com/package/url-metadata