DanRoscigno / Recipes


Indexing #21

Closed DanRoscigno closed 3 months ago

DanRoscigno commented 3 months ago

Adds a Golang crawler using colly and the Algolia API client for Go.

This is a WIP.

I suspect that some mandatory fields (docusaurus_tag, maybe?) are missing from my index.
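One way to test that suspicion is to check each record for the attributes the search widget requests. The sketch below is a minimal, standard-library-only illustration: `lookup` and `missingFields` are hypothetical helpers, the sample record is made up, and the list of required attributes is an assumption taken from the `attributesToRetrieve` parameter of the search request plus the suspected `docusaurus_tag`. It is not a confirmed list of what Docusaurus needs.

```go
package main

import (
	"fmt"
	"strings"
)

// lookup walks a dotted path such as "hierarchy.lvl1" through nested maps.
// It reports whether the final key exists, even if its value is nil.
func lookup(record map[string]any, path string) (any, bool) {
	var current any = record
	for _, key := range strings.Split(path, ".") {
		m, ok := current.(map[string]any)
		if !ok {
			return nil, false
		}
		current, ok = m[key]
		if !ok {
			return nil, false
		}
	}
	return current, true
}

// missingFields returns the paths in required that are absent from record,
// in the same order as they appear in required.
func missingFields(record map[string]any, required []string) []string {
	var missing []string
	for _, path := range required {
		if _, ok := lookup(record, path); !ok {
			missing = append(missing, path)
		}
	}
	return missing
}

func main() {
	// Hypothetical record shaped like the attributes the search request asks for.
	record := map[string]any{
		"hierarchy": map[string]any{"lvl0": "Recipes", "lvl1": "Vinegar"},
		"content":   "example content",
		"type":      "content",
		"url":       "https://example.com/vinegar",
	}

	// Assumed list: requested attributes plus the suspected docusaurus_tag.
	required := []string{
		"hierarchy.lvl0", "hierarchy.lvl1",
		"content", "type", "url",
		"docusaurus_tag",
	}

	fmt.Println("missing:", missingFields(record, required))
}
```

Running the same check over a record from the Docker-based crawler's index and one from the colly-populated index would show which attributes only one of them carries.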

The curl command below was copied from Chrome Developer Tools while searching from the Docusaurus site. It returns results when run against the index populated by the Docker-based crawler, and it also returns results when run against the index populated with colly.

This suggests that I need to look at the Docusaurus search widget, since it does not display results while I type. The search is for vin (vinegar):

curl 'https://r7frlk17be-dsn.algolia.net/1/indexes/*/queries?x-algolia-agent=Algolia%20for%20JavaScript%20(4.23.3)%3B%20Browser%20(lite)%3B%20docsearch%20(3.6.0)%3B%20docsearch-react%20(3.6.0)%3B%20docusaurus%20(3.3.2)&x-algolia-api-key=f502d25c73799a0d014a044bd65945d2&x-algolia-application-id=R7FRLK17BE' \
  -H 'Accept: */*' \
  -H 'Accept-Language: en-US,en;q=0.9' \
  -H 'Connection: keep-alive' \
  -H 'Origin: https://danroscigno.github.io' \
  -H 'Referer: https://danroscigno.github.io/' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Site: cross-site' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36' \
  -H 'content-type: application/x-www-form-urlencoded' \
  -H 'sec-ch-ua: "Not/A)Brand";v="8", "Chromium";v="126", "Google Chrome";v="126"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "macOS"' \
  --data-raw '{"requests":[{"query":"vin","indexName":"recipes_crawled_golang","params":"attributesToRetrieve=%5B%22hierarchy.lvl0%22%2C%22hierarchy.lvl1%22%2C%22hierarchy.lvl2%22%2C%22hierarchy.lvl3%22%2C%22hierarchy.lvl4%22%2C%22hierarchy.lvl5%22%2C%22hierarchy.lvl6%22%2C%22content%22%2C%22type%22%2C%22url%22%5D&attributesToSnippet=%5B%22hierarchy.lvl1%3A10%22%2C%22hierarchy.lvl2%3A10%22%2C%22hierarchy.lvl3%3A10%22%2C%22hierarchy.lvl4%3A10%22%2C%22hierarchy.lvl5%3A10%22%2C%22hierarchy.lvl6%3A10%22%2C%22content%3A10%22%5D&snippetEllipsisText=%E2%80%A6&highlightPreTag=%3Cmark%3E&highlightPostTag=%3C%2Fmark%3E&hitsPerPage=1000&clickAnalytics=false&facetFilters=%5B%5D"}]}'
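
The `params` value in the request body is URL-encoded and hard to read. The sketch below decodes it with the Go standard library to list which attributes the widget asks for and which facet filters it sends. `decodeListParam` is a hypothetical helper, and the `params` string in `main` is a shortened copy of the one above. It assumes each decoded value is a flat JSON array of strings, which holds for this request but not for nested `facetFilters`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/url"
)

// decodeListParam parses a URL-encoded params string and decodes the value
// stored under key as a JSON array of strings.
// Assumption: the array is flat; nested facetFilters would need []any instead.
func decodeListParam(params, key string) ([]string, error) {
	values, err := url.ParseQuery(params)
	if err != nil {
		return nil, err
	}
	var list []string
	if err := json.Unmarshal([]byte(values.Get(key)), &list); err != nil {
		return nil, fmt.Errorf("decode %s: %w", key, err)
	}
	return list, nil
}

func main() {
	// Shortened copy of the params string from the curl request.
	params := "attributesToRetrieve=%5B%22hierarchy.lvl0%22%2C%22content%22%2C%22type%22%2C%22url%22%5D" +
		"&hitsPerPage=1000&facetFilters=%5B%5D"

	attrs, err := decodeListParam(params, "attributesToRetrieve")
	if err != nil {
		log.Fatal(err)
	}
	filters, err := decodeListParam(params, "facetFilters")
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println("attributes requested:", attrs)
	fmt.Println("facet filter count:", len(filters))
}
```

In the request above, `facetFilters` decodes to an empty list, so this particular query does not filter on `docusaurus_tag` or any other facet.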