arrrlo / Google-Images-Search

[PYTHON] Search for image using Google Custom Search API and resize & crop afterwards
MIT License

Not getting correct results #170

Open securitypedant opened 1 year ago

securitypedant commented 1 year ago

I just installed this module and it's exactly what I need. However, the results it returns don't match the results from a regular Google search.

I am doing a search for "wayfair red sofa". If I do this search on Google.com, I get a very long list of red sofas.

https://www.google.com/search?q=wayfair%20red%20sofa&tbm=isch&hl=en&tbs&sa=X&ved=0CAEQpwVqFwoTCJjUttDZ2v0CFQAAAAAdAAAAABAD&biw=1477&bih=924

If I use the CX code from the Google Developer console...

And if I embed this in my page and search for "wayfair red sofa", I also get a list of red sofas.

However, when I call your API with the following parameters...

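from google_images_search import GoogleImagesSearch

# Placeholder credentials; substitute your own API key and CX.
gis = GoogleImagesSearch('your_dev_api_key', 'your_project_cx')

# '##' marks the single-choice parameters, as in the library's README template.
search_params = {
    'q': 'wayfair red sofa',
    'num': 10,
    'fileType': 'jpg|gif|png',
    'rights': 'cc_publicdomain|cc_attribute|cc_sharealike|cc_noncommercial|cc_nonderived',
    'safe': 'safeUndefined', ##
    'imgType': 'photo', ##
    'imgSize': 'medium', ##
    'imgDominantColor': 'imgDominantColorUndefined', ##
    'imgColorType': 'color' ##
}

# this will only search for images:
gis.search(search_params=search_params)
results = gis.results()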

I don't get any red sofas. It's really odd. Surely your call should return pretty much the same results as typing the query directly into Google.com.

Am I doing something wrong?
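
One thing worth checking is whether the optional filters are narrowing the result set. The `rights` value limits results to images carrying the listed licenses, and `imgType`, `imgSize`, and `imgColorType` restrict them further; a plain Google.com search applies none of these. Below is a minimal sketch of stripping those filters for a comparison run. The helper name `relax_search_params` and the choice of which keys count as filters are assumptions made for illustration, not part of the library.

```python
# Hypothetical helper, not part of Google-Images-Search.
# Keys treated as optional result filters (an assumption for this sketch).
FILTER_KEYS = {
    'fileType', 'rights', 'safe', 'imgType',
    'imgSize', 'imgDominantColor', 'imgColorType',
}


def relax_search_params(params):
    """Return a copy of params without the optional filter keys."""
    return {key: value for key, value in params.items() if key not in FILTER_KEYS}


search_params = {
    'q': 'wayfair red sofa',
    'num': 10,
    'fileType': 'jpg|gif|png',
    'rights': 'cc_publicdomain|cc_attribute|cc_sharealike|cc_noncommercial|cc_nonderived',
    'safe': 'safeUndefined',
    'imgType': 'photo',
    'imgSize': 'medium',
    'imgDominantColor': 'imgDominantColorUndefined',
    'imgColorType': 'color',
}

# Only the query and the result count survive.
relaxed = relax_search_params(search_params)
print(relaxed)
```

Passing `relaxed` to `gis.search(search_params=relaxed)` and comparing the output with the filtered run would show whether the filters, rather than the query itself, explain the missing sofas.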

william-davies commented 1 year ago

I also noticed a discrepancy between Google Images in my browser and results from this library. The quality of the library results tends to be worse.

    search_params = {
        "q": "manzana",
        "num": 5,  # Number of images to retrieve (in this case, 5)
        "safe": "off",  # Set the safe search level (optional)
    }
    gis.search(search_params=search_params)

I get these images:

    [
        'https://upload.wikimedia.org/wikipedia/commons/thumb/0/02/Manzana.svg/2048px-Manzana.svg.png',
        'https://manzanaproductsco.com/wp-content/uploads/2017/09/apple-pouches-slide1-rev.png',
        'https://images.heb.com/is/image/HEBGrocery/000849586-1',
        'https://cdn.shopify.com/s/files/1/0608/4424/5227/products/5120201831.jpg?v=1637191548',
        'https://manzanaproductsco.com/wp-content/uploads/2018/07/manzana-cluster.png',
    ]

4/5 are of apple JUICE. The results on Google Images seem much better.

Thank you! Would appreciate any tips on how to get better/more accurate results.

lludwig commented 9 months ago

This doesn't appear to have anything to do with the library. It comes from Google's own Search API, whose results differ from what end users see on Google.com.
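
To test that explanation, one could query the Custom Search JSON API directly, bypassing the library, and compare the two result sets. The sketch below only builds the request URL. The helper name `build_image_search_url` and the placeholder credentials are assumptions for illustration.

```python
from urllib.parse import urlencode

# Endpoint of Google's Custom Search JSON API.
API_ENDPOINT = 'https://www.googleapis.com/customsearch/v1'


def build_image_search_url(api_key, cx, query, num=10):
    """Build a Custom Search JSON API URL for an image search."""
    params = {
        'key': api_key,
        'cx': cx,
        'q': query,
        'searchType': 'image',  # restrict the search to images
        'num': num,
    }
    return f'{API_ENDPOINT}?{urlencode(params)}'


# Placeholder credentials; substitute your own API key and CX.
url = build_image_search_url('your_dev_api_key', 'your_project_cx', 'wayfair red sofa')
print(url)
```

Fetching that URL with real credentials returns JSON whose `items` entries each carry a `link` to an image. If those links match what the library returns, the difference from Google.com originates in the API rather than in this module.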