serpapi / public-roadmap

Public Roadmap for SerpApi, LLC (https://serpapi.com)

[Google Search] - Images Search | Images_Results Original Field Does Not Always Contain Direct Link to Image #684

Open evaddas opened 1 year ago

evaddas commented 1 year ago

When running a Google search with the search type set to "isch - Google Images", the original field in the images_results list does not always contain a direct link to the image.

In some cases, the original field links to a page with the image embedded in an <img> tag.

This makes it impossible to download the image using the link.

See the example for a search with keywords "superhero girl cartoon tv show" via the Playground.

{
  "position": 4,
  "thumbnail": "https://serpapi.com/searches/63ffb4f09b647283d1e44e9e/images/a80330a94e2a854e4ff90bed76ad01226eeadffc833effb0d96742e74101a42c.jpeg",
  "source": "DC Super Hero Girls Wikia - Fandom",
  "title": "Season 2 (TV series) | DC Super Hero Girls Wikia | Fandom",
  "link": "https://dcsuperherogirls.fandom.com/wiki/Season_2_%28TV_series%29",
  "original": "https://static.wikia.nocookie.net/dc-superherogirls/images/0/08/DC_Super_Hero_Girls_Season_2.jpg/revision/latest?cb=20210814092121",
  "original_width": 2000,
  "original_height": 1500,
  "is_product": false
}

ilyazub commented 1 year ago

Hi @evaddas, the image in the original field from your example is downloadable. Please share the ID of a search that returned unexpected original links.

The other images are also valid for me. How can I reproduce this bug?

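# Fetch the results, extract every "original" URL, and download up to 10 in parallel.
# -I {} replaces the deprecated GNU xargs -i flag.
curl -s --compressed "https://serpapi.com/search?q=superhero+girl+cartoon+tv+show&tbm=isch&api_key=$SERPAPI_KEY" \
  | jq -cr '.images_results[].original' \
  | xargs -I {} -P10 wget -P /tmp/download_images_$(date +%s)/ {}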

ls /tmp/download_images_1678195371/
# Omitted... Lists 100 images.
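
To narrow down which original links point to an HTML page rather than an image file, one option is to look at the Content-Type header each URL returns. The sketch below is only an illustration, not part of SerpApi: the function names is_image_content_type and classify_original_url are hypothetical, and it assumes curl is installed. It uses a HEAD request, which some hosts answer differently from GET, so treat the result as a hint.

```shell
#!/bin/sh
# Succeeds (exit status 0) when a Content-Type value names an image media type.
is_image_content_type() {
  # Media types are case-insensitive, so lowercase the value before matching.
  ct=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$ct" in
    image/*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Hypothetical helper: prints "image <url>" or "not-image <url>" for one URL.
# -I sends a HEAD request, -L follows redirects, and -w prints the
# Content-Type of the final response.
classify_original_url() {
  ct=$(curl -sIL -o /dev/null -w '%{content_type}' "$1")
  if is_image_content_type "$ct"; then
    echo "image $1"
  else
    echo "not-image $1"
  fi
}
```

Assuming the API response was saved to a file such as response.json (a hypothetical name), the URLs could be checked with: jq -r '.images_results[].original' response.json | while read -r url; do classify_original_url "$url"; done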

P.S. I've formatted the JSON snippet in your post.