YCKang closed this issue 2 years ago
Hello. I added a content-type check in open-graph-scraper@4.10.0. Let me know if that fixes your problem.
Code -> https://github.com/jshemas/openGraphScraper/blob/master/lib/request.js#L21-L23
Thank you, it fixes the problem. And the "downloadLimit" feature is also awesome!
Hi, @jshemas.
Currently, this module checks the URL using utils.isThisANonHTMLUrl(options.url) before requesting it. But some URLs are file links whose extension is not in the invalidImageTypes array, or that have no extension at all. In those cases the module may cause high CPU usage because it parses a non-HTML link (which is actually a file). In some rare cases the content-type may be missing from the response header (see #45; 'https://www.namecheap.com/' now sends a content-type in its response header), but I think misjudging a non-HTML link happens more often than an HTML link having no content-type. Maybe you should add the check back?
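To illustrate the gap described above, here is a minimal sketch of an extension-based check. It is not the module's actual utils.isThisANonHTMLUrl implementation: the function name looksLikeNonHtmlUrl, the blockedExtensions list, and the example.com URLs are all hypothetical, chosen only to show how a file link with an unlisted extension or no extension slips through.

```javascript
// Hypothetical list for illustration; the module's real invalidImageTypes array may differ.
const blockedExtensions = ['.png', '.jpg', '.gif', '.pdf', '.zip'];

// Hypothetical stand-in for an extension-based check done before the request.
function looksLikeNonHtmlUrl(url) {
  const { pathname } = new URL(url);
  const lastSegment = pathname.split('/').pop();
  const dot = lastSegment.lastIndexOf('.');
  // No extension at all: the URL alone cannot tell us what will be served.
  if (dot === -1) return false;
  return blockedExtensions.includes(lastSegment.slice(dot).toLowerCase());
}

console.log(looksLikeNonHtmlUrl('https://example.com/photo.png'));      // true
console.log(looksLikeNonHtmlUrl('https://example.com/download/12345')); // false, even if it serves a file
console.log(looksLikeNonHtmlUrl('https://example.com/archive.7z'));     // false, extension not in the list
```

The last two URLs pass the check, so the response body would still be handed to the HTML parser unless the content-type is also inspected.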
P.S. I found another tool that only accepts 'text/html' and 'application/xhtml+xml': https://github.com/niallkennedy/open-graph-protocol-tools/blob/ac1f238f52088be9fb220df0dd9ef3b2fb452b82/open-graph-protocol.php#L548
thx.
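For reference, a content-type allow-list like the one suggested in this comment could be sketched as follows. This is only an illustration under stated assumptions, not the code that shipped in open-graph-scraper: isHtmlContentType and the allowMissing parameter are hypothetical names, and treating a missing header as acceptable by default is one possible way to handle the rare case from #45.

```javascript
// The two media types named in the comment above.
const allowedTypes = ['text/html', 'application/xhtml+xml'];

// Hypothetical helper: decide from the Content-Type header value whether
// the response body should be parsed as HTML.
function isHtmlContentType(headerValue, allowMissing = true) {
  // Some servers omit the header; let the caller decide how strict to be.
  if (!headerValue) return allowMissing;
  // Drop parameters such as "; charset=utf-8" and normalise case.
  const mediaType = headerValue.split(';')[0].trim().toLowerCase();
  return allowedTypes.includes(mediaType);
}

console.log(isHtmlContentType('text/html; charset=utf-8')); // true
console.log(isHtmlContentType('application/pdf'));          // false
console.log(isHtmlContentType(undefined));                  // true (header missing, allowed by default)
console.log(isHtmlContentType(undefined, false));           // false (strict mode)
```

Running this check on the response headers before reading the body would skip parsing for file downloads regardless of what the URL looks like.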