plone / guillotina

Python AsyncIO data API to manage billions of resources

Searching by "SearchableText" is broken #1151

Closed frapell closed 2 years ago

frapell commented 2 years ago

I've been trying to figure out how to fix this, but I can't quite get there. Searching by an attribute, like the title, works just fine:

$ curl -i -X GET "" --user root:root
HTTP/1.1 200 OK
date: Tue, 12 Oct 2021 20:37:59 GMT
server: uvicorn
Content-Type: application/json
Access-Control-Allow-Credentials: true
Access-Control-Expose-Headers: *
Server: Guillotina/6.3.16.dev0
Transfer-Encoding: chunked


However, searching by SearchableText fails:

$ curl -i -X GET "" --user root:root
HTTP/1.1 500 Internal Server Error
date: Tue, 12 Oct 2021 20:38:07 GMT
server: uvicorn
Content-Type: application/json
Transfer-Encoding: chunked

With the following Traceback

Traceback (most recent call last):
  File "/trabajo/plone/guillotina/guillotina/middlewares/", line 30, in __call__
    resp = await self.next_app(scope, receive, _send)
  File "/trabajo/plone/guillotina/guillotina/", line 167, in __call__
    return await self.request_handler(request)
  File "/trabajo/plone/guillotina/guillotina/", line 190, in request_handler
    resp = await route.handler(request)
  File "/trabajo/plone/guillotina/guillotina/", line 265, in handler
    view_result = await self.view()
  File "/trabajo/plone/guillotina/guillotina/api/", line 191, in _call_validate
    return await self._call_original()
  File "/trabajo/plone/guillotina/guillotina/api/", line 197, in _call_original_func
    return await self.__original__(self.context, self.request)
  File "/trabajo/plone/guillotina/guillotina/api/", line 98, in search_get
    return await, dict(request.query))
  File "/trabajo/plone/guillotina/guillotina/contrib/catalog/pg/", line 291, in search
    parsed_query = parse_query(context, query, self)
  File "/trabajo/plone/guillotina/guillotina/catalog/", line 100, in parse_query
    return parser(query)
  File "/trabajo/plone/guillotina/guillotina/contrib/catalog/pg/", line 149, in __call__
    result = self.process_queried_field(field, value)
  File "/trabajo/plone/guillotina/guillotina/contrib/catalog/pg/", line 51, in process_queried_field
    return self.process_compound_field(field, value, " OR ")
  File "/trabajo/plone/guillotina/guillotina/contrib/catalog/pg/", line 31, in process_compound_field
    parsed_value = urllib.parse.parse_qsl(urllib.parse.unquote(value))
  File "/usr/lib/python3.8/urllib/", line 637, in unquote
AttributeError: 'dict' object has no attribute 'split'
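
The last frame can be reproduced outside guillotina. The following is a minimal sketch that assumes only the standard library; `try_unquote` is a hypothetical helper name used for illustration. It shows that `urllib.parse.unquote` rejects a dict before any SQL is built. The exact exception type and message may vary between Python versions, so both `AttributeError` and `TypeError` are caught.

```python
import urllib.parse


def try_unquote(value):
    """Return the unquoted string, or the exception raised for non-string input."""
    try:
        return urllib.parse.unquote(value)
    except (AttributeError, TypeError) as exc:
        return exc


# Value shape seen in the debugger for a SearchableText query.
result = try_unquote({"title__in": "News"})
print(type(result).__name__, result)

# A plain string, as produced by an ordinary query parameter, is handled fine.
print(try_unquote("title%3DNews"))
```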

I believe this is related to this PR. However, running guillotina from before that change also fails, just with a different traceback:

Traceback (most recent call last):
  File "/trabajo/plone/guillotina/guillotina/middlewares/", line 30, in __call__
    resp = await self.next_app(scope, receive, _send)
  File "/trabajo/plone/guillotina/guillotina/", line 167, in __call__
    return await self.request_handler(request)
  File "/trabajo/plone/guillotina/guillotina/", line 190, in request_handler
    resp = await route.handler(request)
  File "/trabajo/plone/guillotina/guillotina/", line 265, in handler
    view_result = await self.view()
  File "/trabajo/plone/guillotina/guillotina/api/", line 191, in _call_validate
    return await self._call_original()
  File "/trabajo/plone/guillotina/guillotina/api/", line 197, in _call_original_func
    return await self.__original__(self.context, self.request)
  File "/trabajo/plone/guillotina/guillotina/api/", line 98, in search_get
    return await, dict(request.query))
  File "/trabajo/plone/guillotina/guillotina/contrib/catalog/pg/", line 284, in search
    parsed_query = parse_query(context, query, self)
  File "/trabajo/plone/guillotina/guillotina/catalog/", line 100, in parse_query
    return parser(query)
  File "/trabajo/plone/guillotina/guillotina/contrib/catalog/pg/", line 141, in __call__
    sql, values, select, field = result
ValueError: not enough values to unpack (expected 4, got 3)

If I place a breakpoint in I see

(Pdb) field,value
('title__in', 'News')

That is what I get when searching by title. When searching by SearchableText, however, I see

(Pdb) field,value
('searchabletext__or', {'title__in': 'News'})

When using this looks like this

(Pdb) field,value
('searchabletext__or', {'text__in': 'News', 'title__in': 'News', 'url__in': 'News'})

And value being a dict happens in

As you might have guessed, I am trying to fix the Search when using Volto + guillotina-volto, which uses SearchableText.

Any advice on where to fix this? I believe improving the process_compound_field method should be the way to go, what do you think?

frapell commented 2 years ago

After trying to understand the changes in PR #1132, I don't think that's how __or is intended to be used. According to the test, the query is expected to be /db/guillotina/@search?__or=title%3DFirst item%26title%3DSecond item. However, I think it should be /db/guillotina/@search?title__or=First item,Second item. @bloodbare, what do you think?
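
To make the difference between the two query shapes concrete, here is a small standard-library sketch. The first part mirrors the `parse_qsl(unquote(...))` call visible in the traceback. The second part, including the `split_values` name, is a hypothetical illustration of the comma-separated alternative and not something guillotina is shown to implement in this thread.

```python
import urllib.parse

# Shape used by the existing test: the whole "field=value" list is URL-encoded
# into a single __or parameter.
encoded = "title%3DFirst item%26title%3DSecond item"
pairs = urllib.parse.parse_qsl(urllib.parse.unquote(encoded))
print(pairs)


def split_values(field, raw):
    """Hypothetical: expand 'a,b' for one field into (field, value) pairs."""
    return [(field, item) for item in raw.split(",")]


# Alternative shape: title__or=First item,Second item
print(split_values("title", "First item,Second item"))
```

Both shapes end up as the same list of (field, value) pairs, so the choice is mostly about which URL form is easier for clients to build.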

masipcat commented 2 years ago

Any advice on where to fix this? I believe improving the process_compound_field method should be the way to go, what do you think?

I think this is the right place. Can you check whether replacing L31-33 with this code solves the problem?

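if isinstance(value, dict):
    # Already parsed upstream (e.g. SearchableText expanded into several fields)
    parsed_value = value.items()
else:
    # Raw query-string value: decode it and split it into (field, value) pairs
    parsed_value = urllib.parse.parse_qsl(urllib.parse.unquote(value))
    if not isinstance(parsed_value, list):
        return None

frapell commented 2 years ago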

@masipcat Yup, that works like a charm!
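
For reference, the dict-versus-string handling discussed above can be sketched as a standalone function. This is not the actual `process_compound_field` method: `compound_pairs` is a hypothetical name, and returning None for other types is an assumption made for the example.

```python
import urllib.parse


def compound_pairs(value):
    """Hypothetical helper: normalize a compound-field value into (field, value) pairs.

    Returns None when the value is neither a dict nor a string.
    """
    if isinstance(value, dict):
        # Already-parsed form, e.g. SearchableText expanded into several fields.
        return list(value.items())
    if isinstance(value, str):
        # URL-encoded form, e.g. "title__in%3DNews%26url__in%3DNews".
        return urllib.parse.parse_qsl(urllib.parse.unquote(value))
    return None


print(compound_pairs({"title__in": "News"}))
print(compound_pairs("title__in%3DNews%26url__in%3DNews"))
```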

masipcat commented 2 years ago

@masipcat Yup, that works like a charm!

Nice :) Do you want to open a PR with these changes and, if you don't mind, add a test for this case?

frapell commented 2 years ago

@masipcat Sure... How can I test against postgres? Should I edit the TESTING_SETTINGS manually?

masipcat commented 2 years ago

Just set the env var DATABASE=postgres:

$ DATABASE=postgres pytest guillotina/tests

masipcat commented 2 years ago

You need to have Docker installed and running. A pytest fixture will create the postgres container automatically.