nodeca / probe-image-size

Get image size without full download. Supported image types: JPG, GIF, PNG, WebP, BMP, TIFF, SVG, PSD, ICO.
MIT License

How can I probe these images? #73

Closed verlok closed 2 years ago

verlok commented 2 years ago

I'm building this open source tool based on your library. But sometimes it fails to probe images, e.g. when I try to probe these two:

If I try with:

curl -A 'test' https://www.dunhill.com/product_image/45634378WS/f/w508_bffffff.jpg

I get an HTML response back:

<HTML><HEAD>
<TITLE>Access Denied</TITLE>
</HEAD><BODY>
<H1>Access Denied</H1>

You don't have permission to access "http&#58;&#47;&#47;www&#46;dunhill&#46;com&#47;product&#95;image&#47;45634378WS&#47;f&#47;w508&#95;bffffff&#46;jpg" on this server.<P>
Reference&#32;&#35;18&#46;8b0c1502&#46;1650662288&#46;5bba5dc
</BODY>
</HTML>
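
When debugging a case like this, it can help to check whether the bytes coming back are an image at all or an HTML error page such as the one above. The following is only a minimal sketch: `sniffBody` is a hypothetical helper (not part of probe-image-size), it recognises just a few signatures, and it assumes the start of the response body is already available as a Node.js `Buffer`.

```javascript
// Hypothetical helper, not part of probe-image-size.
// Classifies the first bytes of a response body by well-known signatures.
function sniffBody(buf) {
  // JPEG files start with the bytes FF D8 FF.
  if (buf.length >= 3 && buf[0] === 0xff && buf[1] === 0xd8 && buf[2] === 0xff) {
    return "jpeg";
  }

  // PNG files start with the 8-byte signature 89 50 4E 47 0D 0A 1A 0A.
  const pngSignature = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
  if (buf.length >= 8 && buf.subarray(0, 8).equals(pngSignature)) {
    return "png";
  }

  // GIF files start with the ASCII text "GIF87a" or "GIF89a".
  const head = buf.subarray(0, 6).toString("latin1");
  if (head === "GIF87a" || head === "GIF89a") {
    return "gif";
  }

  // An HTML error page starts with markup once leading whitespace is skipped.
  const text = buf.subarray(0, 64).toString("latin1").trimStart().toLowerCase();
  if (text.startsWith("<html") || text.startsWith("<!doctype html")) {
    return "html";
  }

  return "unknown";
}

// The "Access Denied" page shown above is detected as HTML, not as an image.
console.log(sniffBody(Buffer.from("<HTML><HEAD>\n<TITLE>Access Denied</TITLE>"))); // prints "html"
```

A result of `"html"` suggests the server rejected the request before serving the image, so the problem lies in the request rather than in the image itself.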

Any suggestion on how I could cheat and get the image response?

Thank you in advance for your help

verlok commented 2 years ago

Additional info

I tried with

curl -A 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36 Edg/100.0.1185.50' https://www.dunhill.com/product_image/45634378WS/f/w508_bffffff.jpg

And I got the expected binary output!

But if I do the following, the probe fails:

const naturalWidth = await probe(imgUrl, {
  response_timeout: 100,
  follow_max: 5, // follow up to five redirects,
  rejectUnauthorized: true, // verify SSL certificate
  user_agent:
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36 Edg/100.0.1185.50",
})
  .then((res) => {
    console.log("...succeeded");
    return res.width;
  })
  .catch((err) => {
    console.log("...failed");
    console.error(err);
    return 0;
  });

What am I doing wrong and how can I make this work with your library?

Thanks

verlok commented 2 years ago

I found the solution. This particular server requires additional headers.

From my experiments, the minimum code required to make the above work is:

const imgIntrinsicWidth = await probe(imgUrl, {
  headers: {
    accept: "image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8",
    "accept-encoding": "gzip, deflate, br",
    referer: "",
    "accept-language": "en-GB,en;q=0.9,en-US;q=0.8,it;q=0.7",
    "user-agent":
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36 Edg/100.0.1185.50",
  },
})
  .then((res) => {
    console.log("...succeeded");
    return res.width;
  })
  .catch((err) => {
    console.log("...failed");
    console.error(err);
    return 0;
  });
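
If several images need the same treatment, the headers above could be kept in one place and reused. This is only a sketch under assumptions: `BROWSER_HEADERS`, `buildHeaders` and `probeWidth` are hypothetical names, not part of probe-image-size, and the probe function is passed in as an argument so the helper can be tried with a stub and without network access.

```javascript
// Browser-like defaults, copied from the headers that worked above.
const BROWSER_HEADERS = {
  accept: "image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8",
  "accept-encoding": "gzip, deflate, br",
  referer: "",
  "accept-language": "en-GB,en;q=0.9,en-US;q=0.8,it;q=0.7",
  "user-agent":
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36 Edg/100.0.1185.50",
};

// Hypothetical helper: merges per-call overrides into the defaults.
// Header names are lower-cased so "Referer" and "referer" map to the same key.
function buildHeaders(overrides = {}) {
  const merged = { ...BROWSER_HEADERS };
  for (const [name, value] of Object.entries(overrides)) {
    merged[name.toLowerCase()] = value;
  }
  return merged;
}

// Hypothetical wrapper: probeFn is expected to behave like probe-image-size's
// probe(url, options), i.e. return a promise resolving to an object with `width`.
// Returns 0 on failure, mirroring the snippet above.
async function probeWidth(probeFn, imgUrl, overrides = {}) {
  try {
    const res = await probeFn(imgUrl, { headers: buildHeaders(overrides) });
    return res.width;
  } catch (err) {
    console.error(err);
    return 0;
  }
}
```

With the real library this would presumably be called as `probeWidth(require("probe-image-size"), imgUrl)`. Which headers a given server actually checks is not known in advance, so the set above may need trimming or extending per site.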

puzrin commented 2 years ago

Let's close this, since there is no goal to cheat with headers by default.