ibm-cloud-docs / discovery-data

ä, ü, ö not supported when filtering a discovery query #24

Open LordHeImchen opened 5 months ago

LordHeImchen commented 5 months ago

Hi everyone, I am currently trying to filter a natural language query to Watson Discovery by metadata values. Everything works fine, except when the value I am filtering for contains ä, ü, or ö. There are probably other characters that fail as well.

So in Python, my function looks like this:

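from ibm_watson.discovery_v2 import QueryLargePassages

# Assumes `discovery` is an already-authenticated DiscoveryV2 client and that
# DISCOVERY_PROJECT and DISCOVERY_COLLECTION_ID are defined elsewhere.
def query_discovery(tender, query):
    queryresult = discovery.query(
        collection_ids=[DISCOVERY_COLLECTION_ID],
        project_id=DISCOVERY_PROJECT,
        return_=["passage_text"],
        # Exact match (::) on the metadata field "tender"
        filter=f'metadata.tender::{tender}',
        natural_language_query=query,
        # Return at most one passage per matching document
        passages=QueryLargePassages(max_per_document=1, per_document=True),
    ).get_result()
    return queryresult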

When I run any query with tender="Frankfurt", it works. However, tender="München" always returns zero results. I also tried a curl command and hit the same issue. Additionally, I tried the Unicode escape for ü, providing the value tender="M\u00fcnchen", but this throws an error because of the \.
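To help narrow this down, here is a minimal sketch for inspecting the filter value before it is sent. It is a debugging aid, not a confirmed fix. In a normal (non-raw) Python string literal, "M\u00fcnchen" is the same string as "München", so a backslash error presumably means the escape reached the service literally, for example from curl. The sketch also assumes, without confirmation, that the stored metadata might use a different Unicode normalization form (NFD, where ü is u plus a combining diaeresis) than the typed value (NFC). The helper name filter_variants is hypothetical.

```python
import unicodedata
from urllib.parse import quote


def filter_variants(value):
    """Return the NFC and NFD forms of a value, plus a percent-encoded NFC form.

    Hypothetical helper for debugging only: each variant can be tried in the
    filter (or in a curl URL) to see which one, if any, matches.
    """
    nfc = unicodedata.normalize("NFC", value)  # ü as a single code point
    nfd = unicodedata.normalize("NFD", value)  # u followed by a combining diaeresis
    return {
        "nfc": nfc,
        "nfd": nfd,
        # UTF-8 percent-encoding, as needed in a URL query string
        "nfc_percent_encoded": quote(nfc, safe=""),
    }


# The escape is resolved by Python itself, so both literals are the same string.
same_string = "M\u00fcnchen" == "München"

variants = filter_variants("München")
for name, text in variants.items():
    # ascii() makes the otherwise invisible NFC/NFD difference visible
    print(name, ascii(text), len(text))
```

If the NFD variant matches while the NFC one does not, the mismatch would come from how the metadata was ingested rather than from the query itself.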

FYI:

Is there any way to resolve this?