quickwit-oss / tantivy-py

Python bindings for Tantivy
MIT License
273 stars 63 forks source link

tantivy-py throws syntax error if query has `:` #70

Closed saurav-chakravorty closed 1 year ago

saurav-chakravorty commented 1 year ago

Steps to replicate:

import tantivy

# Declaring our schema.
schema_builder = tantivy.SchemaBuilder()
schema_builder.add_text_field("title", stored=True)
schema_builder.add_text_field("body", stored=True)
schema_builder.add_integer_field("doc_id",stored=True)
schema = schema_builder.build()

# Creating our index (in memory)
index = tantivy.Index(schema)
writer = index.writer()
writer.add_document(tantivy.Document(
    doc_id=1,
    title=["The Old Man and the Sea"],
    body=["""He was an old man who fished alone in a skiff in the Gulf Stream and he had gone eighty-four days now without taking a fish."""],
))
# ... and committing
writer.commit()

# Reload the index to ensure it points to the last commit.
index.reload()
searcher = index.searcher()

query = index.parse_query("fish days:", ["title", "body"])
(best_score, best_doc_address) = searcher.search(query, 3).hits[0]
best_doc = searcher.doc(best_doc_address)
assert best_doc["title"] == ["The Old Man and the Sea"]
print(best_doc)

Output:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File <command-3495552511963062>:25
     22 index.reload()
     23 searcher = index.searcher()
---> 25 query = index.parse_query("fish days:", ["title", "body"])
     26 (best_score, best_doc_address) = searcher.search(query, 3).hits[0]
     27 best_doc = searcher.doc(best_doc_address)
PSeitz commented 1 year ago

: is a reserved keyword to search in fields. days: means search in field days. You can quote it to escape it

cjrh commented 1 year ago

@saurav-chakravorty Are you ok for us to close this issue?

saurav-chakravorty commented 1 year ago

yes absolutely. I am closing it.