qdrant / qdrant-client

Python client for Qdrant vector search engine
https://qdrant.tech
Apache License 2.0
673 stars, 106 forks

Unable to use metadata filter values containing spaces #631

Status: Open. ms130 opened this issue 1 month ago

ms130 commented 1 month ago

I'm using the Qdrant document store with Haystack and would like to apply a metadata filter inside a query pipeline. This works fine when metadata values don't contain any whitespace, but it breaks when they do: with spaces, parse_json_path() raises ValueError: Invalid path.

In my case, metadata values with spaces occur because the category labels I have predicted for documents are sometimes more than a single word.

It appears parse_json_path() only accepts JSON paths in dot notation, not bracket notation. I believe that if it were made flexible enough to accept bracket notation as well, a path with keys containing spaces would no longer raise an error.

This change would be very useful and would make Qdrant more versatile in handling varied metadata values as filters.

joein commented 1 month ago

Hi @ms130, indeed, this functionality is not currently supported in qdrant.

Btw, it seems like you're using qdrant in local mode. Local mode mimics the behaviour of the server (e.g. qdrant in docker). We'll add it to the local mode as soon as we add it to the server.

@generall @xzfc

xzfc commented 1 month ago

Quoted field names are supported, e.g. to match {"a b": {"c d": "x"}}, use this:

models.FieldCondition(key='"a b"."c d"', match=models.MatchValue(value="x"))
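
For keys built at runtime, the quoting shown above could be wrapped in a small helper. This is only a sketch: `quote_key_path` is a hypothetical name, not part of qdrant-client, and the rule it uses (quote any segment that is not purely letters, digits, or underscores) is an assumption rather than the documented grammar. It does not handle array notation or keys that themselves contain double quotes.

```python
import re

# Hypothetical helper, not part of qdrant-client.
# Assumption: a segment made only of letters, digits, and underscores can be
# used unquoted; anything else (e.g. a key with spaces) is wrapped in quotes.
_PLAIN_KEY = re.compile(r"^[A-Za-z0-9_]+$")


def quote_key_path(segments):
    """Join nested payload keys into a dotted path, quoting where needed."""
    parts = []
    for segment in segments:
        if not segment:
            raise ValueError("empty key segment")
        if '"' in segment:
            # Escaping of embedded quotes is not covered in this thread.
            raise ValueError(f"key contains a double quote: {segment!r}")
        if _PLAIN_KEY.match(segment):
            parts.append(segment)
        else:
            parts.append(f'"{segment}"')
    return ".".join(parts)


# Payload shape from the example above: {"a b": {"c d": "x"}}
key = quote_key_path(["a b", "c d"])
print(key)
```

The resulting string can then be passed as the `key` argument of `models.FieldCondition`, exactly as in the one-liner above.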

ms130 commented 1 month ago

@xzfc thanks, it works if I use that formatting. I'm not sure whether it's already mentioned in the docs, but it would be helpful to document this as the way to use field names containing spaces.