serpapi / public-roadmap

Public Roadmap for SerpApi, LLC (https://serpapi.com)
[New Bing Search API] Bing Image Visual search #399

Open andypple83 opened 1 year ago

andypple83 commented 1 year ago

Let's support Bing image search. We can search by keyword or by image file/URL.

hartator commented 1 year ago

Adding old Canny.io link for reference: https://forum.serpapi.com/feature-requests/p/add-bing-image-search-api

ilyazub commented 1 year ago

Reverse image search results are obtained through the /images/api/custom/knowledge API endpoint.

Related content tab

Bash script

curl 'https://www.bing.com/images/api/custom/knowledge?q=&rshighlight=true&textDecorations=true&internalFeatures=share&FORM=SBIHMP&skey=aeCUosWoYCPQ7_1wAr_HO_O746BHv_fxH8UAUBIvp1k&safeSearch=Moderate&mkt=en-ww&setLang=en-us&IG=0EE09CCA53BE4ACAA95BD9B092C45CF3&IID=idpins&SFX=1' \
  -s \
  -H 'authority: www.bing.com' \
  -H 'accept: */*' \
  -H 'accept-language: en-US,en;q=0.8' \
  -H 'content-type: multipart/form-data; boundary=----WebKitFormBoundaryybS6WM5BOLApSZSk' \
  -H 'cookie: SUID=M; MUID=1FF2510B90A86CE407E0438891A86D52; MUIDB=1FF2510B90A86CE407E0438891A86D52; _EDGE_S=F=1&SID=3701616DA2BD6F32240D73EEA3BD6E8E; _EDGE_V=1; SRCHD=AF=SBIHMP; SRCHUID=V=2&GUID=963F1DC6DA524E3C8E2B419534E8DD4F&dmnchg=1; SRCHUSR=DOB=20221223; _SS=SID=3701616DA2BD6F32240D73EEA3BD6E8E; MMCASM=ID=FF993A37A0AC4FA4AD7DAD03BC6E8D52; SRCHHPGUSR=SRCHLANG=en&BRW=HTP&BRH=M&CW=976&CH=730&SCW=976&SCH=730&DPR=1.3&UTC=120&DM=0' \
  -H 'origin: https://www.bing.com' \
  -H 'pragma: no-cache' \
  -H 'referer: https://www.bing.com/images/search?view=detailV2&insightstoken=bcid_RLKVsIV2BwkFXg*ccid_spWwhXYH&form=SBIHMP&iss=SBIUPLOADGET&sbisrc=ImgPicker&idpbck=1&sbifsz=927+x+524+%c2%b7+25.15+kB+%c2%b7+png&sbifnm=serpapi-serpbear.png&thw=927&thh=524&ptime=223&dlen=34344&expw=798&exph=451&selectedindex=0&id=-1051855017&ccid=spWwhXYH&vt=2&sim=11' \
  -H 'sec-fetch-dest: empty' \
  -H 'sec-fetch-mode: cors' \
  -H 'sec-fetch-site: same-origin' \
  -H 'sec-gpc: 1' \
  -H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36' \
  --data-raw $'------WebKitFormBoundaryybS6WM5BOLApSZSk\r\nContent-Disposition: form-data; name="knowledgeRequest"\r\n\r\n{"imageInfo":{"imageInsightsToken":"bcid_RLKVsIV2BwkFXg*ccid_spWwhXYH","source":"Gallery"},"knowledgeRequest":{"invokedSkills":["ImageById","BestRepresentativeQuery","Offline","ObjectDetection","OCR","EntityLinkingFace","EntityLinkingDog","EntityLinkingAnimal","EntityLinkingPlant","EntityLinkingLandmark","EntityLinkingFood","EntityLinkingBook","SimilarImages","RelatedSearches","PagesIncluding","TextAds","ProductAds","SponsoredAds","Annotation","Recipes","Travel"],"invokedSkillsRequestData":{"adsRequest":{"textRequest":{"mainlineAdsMaxCount":2}}},"index":1}}\r\n------WebKitFormBoundaryybS6WM5BOLApSZSk--\r\n' \
  --compressed | jq -cr '.tags[] | select(.displayName == "") | .actions[] | select(.actionType == "VisualSearch") | .data.value[] | { name, cDNContentUrl }'

Surely, not all of these HTTP headers are necessary.
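For reference, the multipart body passed to --data-raw in the command above can be assembled programmatically. This is a minimal sketch, assuming the endpoint accepts the same single knowledgeRequest form field as in the captured request; the function name build_knowledge_request is hypothetical, the skill list is copied from the curl command, and whether every skill is required is not known.

```python
import json
import uuid
from typing import Optional, Tuple

# Skill names copied from the captured curl command; the endpoint is
# undocumented, so which of them are actually required is an open question.
INVOKED_SKILLS = [
    "ImageById", "BestRepresentativeQuery", "Offline", "ObjectDetection", "OCR",
    "EntityLinkingFace", "EntityLinkingDog", "EntityLinkingAnimal",
    "EntityLinkingPlant", "EntityLinkingLandmark", "EntityLinkingFood",
    "EntityLinkingBook", "SimilarImages", "RelatedSearches", "PagesIncluding",
    "TextAds", "ProductAds", "SponsoredAds", "Annotation", "Recipes", "Travel",
]


def build_knowledge_request(insights_token: str,
                            boundary: Optional[str] = None) -> Tuple[str, bytes]:
    """Return (content_type_header, body) for the knowledgeRequest form field.

    Hypothetical helper: it mirrors the captured request, nothing more.
    """
    if boundary is None:
        boundary = "----FormBoundary" + uuid.uuid4().hex
    payload = {
        "imageInfo": {"imageInsightsToken": insights_token, "source": "Gallery"},
        "knowledgeRequest": {
            "invokedSkills": INVOKED_SKILLS,
            "invokedSkillsRequestData": {
                "adsRequest": {"textRequest": {"mainlineAdsMaxCount": 2}}
            },
            "index": 1,
        },
    }
    payload_json = json.dumps(payload, separators=(",", ":"))
    # Each multipart delimiter is "--" followed by the boundary from the header.
    body = (
        "--" + boundary + "\r\n"
        'Content-Disposition: form-data; name="knowledgeRequest"\r\n\r\n'
        + payload_json + "\r\n"
        "--" + boundary + "--\r\n"
    )
    return "multipart/form-data; boundary=" + boundary, body.encode("utf-8")
```

The insights token (bcid_...*ccid_... in the captured request) comes from the insightstoken parameter of the referer URL; how to obtain one for a new image is outside this sketch.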

Output

{"name":"What is Lead Management | Lead Management Process | Freshsales","cDNContentUrl":"https://th.bing.com/th/id/R.b9f567b5396fa177a4f4dc31be4ee7df?rik=QfNkfgDMOLdGug&pid=ImgRaw&r=0"}
{"name":"Sandglaz","cDNContentUrl":"https://th.bing.com/th/id/R.0b9cd275326b080d90611c41eed0ba53?rik=ZNQETsfqL%2bnJ%2bA&pid=ImgRaw&r=0"}
{"name":"How to Use Live Chat Auto Started Chats | onWebChat","cDNContentUrl":"https://th.bing.com/th/id/R.9c0f994e00e6a557d2bff1766f395960?rik=tYN%2f%2fgBqV8F1NA&pid=ImgRaw&r=0"}

# Omitted ...
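The jq filter above could be mirrored in an adapter roughly as follows. This is only a sketch, assuming the response has the shape implied by the filter (tags, actions, data.value); the function name related_images is hypothetical.

```python
from typing import Any, Dict, List


def related_images(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect name and cDNContentUrl from VisualSearch actions.

    Mirrors the jq filter: only tags whose displayName is the empty
    string are considered, and only actions of type "VisualSearch".
    """
    results = []
    for tag in response.get("tags", []):
        if tag.get("displayName") != "":
            continue
        for action in tag.get("actions", []):
            if action.get("actionType") != "VisualSearch":
                continue
            for item in action.get("data", {}).get("value", []):
                results.append({
                    "name": item.get("name"),
                    "cDNContentUrl": item.get("cDNContentUrl"),
                })
    return results
```

Using .get with defaults means a missing key yields an empty result instead of an exception, which seems preferable for an undocumented endpoint whose shape may change.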

MIT-licensed open-source parser

https://sourcegraph.com/github.com/d4n3436/Fergun@88a5b915f7b09c61a52655d605a77f9139088841/-/blob/src/Apis/Bing/BingVisualSearch.cs?L77-89

Text tab

Response schema on Azure documentation: https://github.com/MicrosoftDocs/azure-docs/blob/65798f88a769256202438ed9f956d5ecd48c918a/articles/cognitive-services/bing-visual-search/concepts/sending-queries.md#text-recognition

Bash script

curl 'https://www.bing.com/images/api/custom/knowledge?q=&rshighlight=true&textDecorations=true&internalFeatures=share&FORM=SBIHMP&skey=aeCUosWoYCPQ7_1wAr_HO_O746BHv_fxH8UAUBIvp1k&safeSearch=Moderate&mkt=en-ww&setLang=en-us&IG=0EE09CCA53BE4ACAA95BD9B092C45CF3&IID=idpins&SFX=1' \
  -s \
  -H 'authority: www.bing.com' \
  -H 'accept: */*' \
  -H 'accept-language: en-US,en;q=0.8' \
  -H 'content-type: multipart/form-data; boundary=----WebKitFormBoundaryybS6WM5BOLApSZSk' \
  -H 'cookie: SUID=M; MUID=1FF2510B90A86CE407E0438891A86D52; MUIDB=1FF2510B90A86CE407E0438891A86D52; _EDGE_S=F=1&SID=3701616DA2BD6F32240D73EEA3BD6E8E; _EDGE_V=1; SRCHD=AF=SBIHMP; SRCHUID=V=2&GUID=963F1DC6DA524E3C8E2B419534E8DD4F&dmnchg=1; SRCHUSR=DOB=20221223; _SS=SID=3701616DA2BD6F32240D73EEA3BD6E8E; MMCASM=ID=FF993A37A0AC4FA4AD7DAD03BC6E8D52; SRCHHPGUSR=SRCHLANG=en&BRW=HTP&BRH=M&CW=976&CH=730&SCW=976&SCH=730&DPR=1.3&UTC=120&DM=0' \
  -H 'origin: https://www.bing.com' \
  -H 'pragma: no-cache' \
  -H 'referer: https://www.bing.com/images/search?view=detailV2&insightstoken=bcid_RLKVsIV2BwkFXg*ccid_spWwhXYH&form=SBIHMP&iss=SBIUPLOADGET&sbisrc=ImgPicker&idpbck=1&sbifsz=927+x+524+%c2%b7+25.15+kB+%c2%b7+png&sbifnm=serpapi-serpbear.png&thw=927&thh=524&ptime=223&dlen=34344&expw=798&exph=451&selectedindex=0&id=-1051855017&ccid=spWwhXYH&vt=2&sim=11' \
  -H 'sec-fetch-dest: empty' \
  -H 'sec-fetch-mode: cors' \
  -H 'sec-fetch-site: same-origin' \
  -H 'sec-gpc: 1' \
  -H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36' \
  --data-raw $'------WebKitFormBoundaryybS6WM5BOLApSZSk\r\nContent-Disposition: form-data; name="knowledgeRequest"\r\n\r\n{"imageInfo":{"imageInsightsToken":"bcid_RLKVsIV2BwkFXg*ccid_spWwhXYH","source":"Gallery"},"knowledgeRequest":{"invokedSkills":["ImageById","BestRepresentativeQuery","Offline","ObjectDetection","OCR","EntityLinkingFace","EntityLinkingDog","EntityLinkingAnimal","EntityLinkingPlant","EntityLinkingLandmark","EntityLinkingFood","EntityLinkingBook","SimilarImages","RelatedSearches","PagesIncluding","TextAds","ProductAds","SponsoredAds","Annotation","Recipes","Travel"],"invokedSkillsRequestData":{"adsRequest":{"textRequest":{"mainlineAdsMaxCount":2}}},"index":1}}\r\n------WebKitFormBoundaryybS6WM5BOLApSZSk--\r\n' \
  --compressed | jq -cr '.tags[] | select(.displayName == "##TextRecognition") | .actions[] | select(.actionType == "TextRecognition") | .data.regions[].lines[] | { text }'

Output

{"text":"README. md"}
{"text":"SerpApi"}
{"text":"SerpBear"}
{"text":"SerpBea"}
{"text":"et notice Position Tracking App. It"}
{"text":"positions in Google and get notified of their positions."}
{"text":"Documentation"}

andypple83 commented 1 year ago

Thanks a lot, @ilyazub. I looked into this API too, but I cannot find the JSON containing the Related content - Images.

ilyazub commented 1 year ago

@andypple83 Glad it helps!

> I can not find the JSON containing the Related content - Images.

Here's my process:

  1. Ctrl+F in the Network tab of the browser dev tools.

  2. Go to the Preview tab of the JSON response.

  3. Expand the JS object recursively (my Brave Browser doesn't search in collapsed JSON :confused:).

  4. Ctrl+F the target string.

  5. Copy the property path.

  6. Navigate up and down in the JS object (with arrow keys) to learn its structure and create an adapter.

  7. Copy as cURL and transform the response with jq to check my assumption.
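Steps 3 to 5 above (expand, search, copy property path) can also be done on a saved JSON response with a few lines of code. This is a rough sketch, not part of the process described above; the function name find_paths and the "$" root notation are arbitrary choices.

```python
from typing import Any, Iterator, Tuple


def find_paths(node: Any, needle: str, path: str = "$") -> Iterator[Tuple[str, str]]:
    """Yield (property_path, value) for every string value containing needle."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_paths(value, needle, path + "." + str(key))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_paths(value, needle, path + "[" + str(index) + "]")
    elif isinstance(node, str) and needle in node:
        yield path, node
```

For example, after loading the response with json.load, list(find_paths(data, "Freshsales")) lists every path whose value mentions the target string, which shows which keys the jq filter or adapter has to traverse.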


Tbh, Ctrl+Shift+F in the browser dev tools no longer searches across all responses.

I initially proxied the browser's network connections via mitmproxy, then filtered response bodies with ~bs "TEXT_FROM_THE_HTML_ELEMENT_I_AM_LOOKING_FOR".

Start mitmproxy with a view filter:

$ mitmproxy --view-filter '~bs "Freshsales"'

Start a Chromium-based browser with the target URL and the following flags and parameters:

$ brave-browser 'https://www.bing.com/images/search?view=detailV2&insightstoken=bcid_RLKVsIV2BwkFXg*ccid_spWwhXYH&form=SBIHMP&iss=SBIUPLOADGET&sbisrc=ImgPicker&idpbck=1&sbifsz=927+x+524+%c2%b7+25.15+kB+%c2%b7+png&sbifnm=serpapi-serpbear.png&thw=927&thh=524&ptime=223&dlen=34344&expw=798&exph=451&selectedindex=0&id=-1051855017&ccid=spWwhXYH&vt=2&sim=11' --proxy-server='http://127.0.0.1:8080'  --temp-profile -incognito --user-data-dir="`mktemp -d`" --no-first-run --ignore-certificate-errors --allow-insecure-localhost

mitmproxy will display the matched requests
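The same body filter could be scripted as a small mitmproxy addon instead of an interactive view filter. This is a sketch under assumptions: the file name find_body.py and the matched_urls list are made up for illustration, and it relies on mitmproxy's response event hook, flow.response.get_text() and flow.request.pretty_url.

```python
# find_body.py (hypothetical file name)
# Possible invocation: mitmdump -s find_body.py

NEEDLE = "Freshsales"  # text taken from the HTML element being looked for

# URLs of responses whose body contained NEEDLE, in arrival order.
matched_urls = []


def response(flow):
    """mitmproxy event hook, called once for each completed HTTP response."""
    # strict=False avoids raising on bodies that fail to decode cleanly.
    text = flow.response.get_text(strict=False) or ""
    if NEEDLE in text:
        matched_urls.append(flow.request.pretty_url)
        print(flow.request.pretty_url)
```

Printing only the URL keeps the output short enough to spot the endpoint that serves the target string.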


PS. It's fun to ask ChatGPT about developer tools.

hilmanski commented 5 months ago

A high-volume customer requested this feature.

Intercom

martin-serpapi commented 4 months ago

Another high-volume customer requested this:

Intercom

hilmanski commented 1 month ago

A user requested this feature (search via image URL).

Intercom