treygrainger / ai-powered-search

The codebase for the book "AI-Powered Search" (Manning Publications, 2024)
https://aipoweredsearch.com

Problems in `ch07/2.semantic-search.ipynb` #126

Closed: alexott closed this issue 8 months ago

alexott commented 9 months ago

When I run `get_category_and_term_vector_solr_response("kimchi")`, I get:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[11], line 1
----> 1 get_category_and_term_vector_solr_response("kimchi")

Cell In[6], line 16, in get_category_and_term_vector_solr_response(keyword)
      1 def get_category_and_term_vector_solr_response(keyword):
      2     query = {
      3         "params": { "fore": keyword, "back": "*:*", "df": "text_t" },
      4         "query": "*:*", "limit": 0,
   (...)
     13                         "type" : "terms", "field" : "doc_type", "limit": 1, "sort": { "r2": "desc" },
     14                         "facet" : { "r2" : "relatedness($fore,$back)"  }}}}}}
---> 16     response = run_search(query)
     17     return json.loads(response)

Cell In[8], line 12, in run_search(text)
     11 def run_search(text):
---> 12     q = urllib.parse.quote(text)
     13     qf, defType = "text_t", "lucene"
     15     return requests.get(SOLR_URL + "/reviews/select?q=" + q + "&qf=" + qf + "&defType=" + defType).text

File /opt/conda/lib/python3.10/urllib/parse.py:869, in quote(string, safe, encoding, errors)
    867     if errors is not None:
    868         raise TypeError("quote() doesn't support 'errors' for bytes")
--> 869 return quote_from_bytes(string, safe)

File /opt/conda/lib/python3.10/urllib/parse.py:894, in quote_from_bytes(bs, safe)
    889 """Like quote(), but accepts a bytes object rather than a str, and does
    890 not perform string-to-bytes encoding.  It always returns an ASCII string.
    891 quote_from_bytes(b'abc def\x3f') -> 'abc%20def%3f'
    892 """
    893 if not isinstance(bs, (bytes, bytearray)):
--> 894     raise TypeError("quote_from_bytes() expected bytes")
    895 if not bs:
    896     return ''

TypeError: quote_from_bytes() expected bytes

Also, `run_search` is defined later in the notebook than this function, not before it.
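The traceback suggests that `run_search` passes the JSON request dict straight to `urllib.parse.quote`, which only accepts `str` or `bytes`. A minimal reproduction, independent of Solr and of the notebook (the `query` dict here is an abbreviated stand-in, not the notebook's full request):

```python
import urllib.parse

# Abbreviated stand-in for the request dict built in the notebook.
query = {"query": "*:*", "limit": 0}

try:
    # quote() takes str or bytes; anything else falls through to
    # quote_from_bytes(), which rejects it.
    urllib.parse.quote(query)
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)
```

Passing a plain keyword string such as `"kimchi"` to `urllib.parse.quote` works, which is consistent with `run_search` having been written for keyword queries rather than JSON requests.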

treygrainger commented 9 months ago

I'll be working on ch7 thoroughly next week and will review this then.

treygrainger commented 8 months ago

OK, I fixed this by refactoring the previous `run_search` and `post_search` methods, now renamed `keyword_search` and `structured_search`, respectively. The webserver's internal calls were using what is now `structured_search`, which expects JSON, but the notebook was calling what is now `keyword_search`, which expects a keyword string and therefore failed when it was given the JSON request.

I'll push this in an upcoming commit that refactors the chapter 7 codebase.
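For readers following along, the split described above might look roughly like the sketch below. It is not the code from the repository: the helper names `build_keyword_request` and `build_structured_request`, their signatures, and the `SOLR_URL` value are assumptions for illustration, and the helpers only build the request pieces instead of sending them, so the sketch runs without a Solr instance.

```python
import json
import urllib.parse

# Assumed value for illustration; the notebook defines its own SOLR_URL.
SOLR_URL = "http://localhost:8983/solr"


def build_keyword_request(text, collection="reviews"):
    """Hypothetical helper: build the GET URL for a plain keyword query."""
    if not isinstance(text, str):
        raise TypeError("keyword queries must be strings; "
                        "use the structured helper for JSON requests")
    # urlencode() escapes each parameter value for use in a query string.
    params = urllib.parse.urlencode(
        {"q": text, "qf": "text_t", "defType": "lucene"})
    return f"{SOLR_URL}/{collection}/select?{params}"


def build_structured_request(request, collection="reviews"):
    """Hypothetical helper: build the URL and JSON body for a JSON request."""
    if not isinstance(request, dict):
        raise TypeError("structured queries must be dicts")
    # The dict is serialized to JSON and would be sent as a POST body.
    return f"{SOLR_URL}/{collection}/select", json.dumps(request)


keyword_url = build_keyword_request("kimchi")
structured_url, structured_body = build_structured_request(
    {"query": "*:*", "limit": 0})
```

The explicit type checks are a design choice of this sketch: they turn the confusing `quote_from_bytes() expected bytes` failure into an error that names the right helper to call.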

treygrainger commented 8 months ago

Resolved