llmware-ai / llmware

Unified framework for building enterprise RAG pipelines with small, specialized models
https://llmware-ai.github.io/llmware/
Apache License 2.0
4.48k stars 821 forks

array out of bounds error in retrieval #453

Open chair300 opened 6 months ago

chair300 commented 6 months ago

When making a RAG request with a semantic query, I hit the following stack trace. I am able to reproduce this.

  File "/llmware/llmware/retrieval.py", line 670, in semantic_query
    results_dict = self._cursor_to_qr(query, qr_raw, result_count=result_count)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/llmware/llmware/retrieval.py", line 578, in _cursor_to_qr
    matches_found = self.locate_query_match(query, raw_qr["text"])
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/llmware/llmware/retrieval.py", line 1375, in locate_query_match
    if core_text[x].lower() == key_term[0].lower():
IndexError: string index out of range

MacOS commented 6 months ago

Can you please post a self-contained reproducible example so I can take a look at it?

chair300 commented 6 months ago

I am able to reproduce the issue if I have an extra space or a double space in the query text.
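
To illustrate why an extra space could lead to this IndexError, here is a minimal sketch. It is not llmware's actual code: `locate_matches_naive` and `locate_matches_safe` are hypothetical stand-ins, and the assumption that the query is tokenized with `split(" ")` is a guess based only on the failing line in the stack trace (`key_term[0]`). Under that assumption, a double space produces an empty token, and indexing its first character fails.

```python
def locate_matches_naive(query: str, core_text: str) -> list[int]:
    """Hypothetical stand-in for the failing loop (assumed, not llmware's code)."""
    matches = []
    # Assumption: tokens come from split(" "), so "a  b" -> ["a", "", "b"].
    for key_term in query.split(" "):
        for x in range(len(core_text)):
            # key_term[0] raises IndexError when key_term is the empty string.
            if core_text[x].lower() == key_term[0].lower():
                if core_text[x:x + len(key_term)].lower() == key_term.lower():
                    matches.append(x)
    return matches


def locate_matches_safe(query: str, core_text: str) -> list[int]:
    """Same loop, but tokenized so that empty terms never appear."""
    matches = []
    # str.split() with no argument collapses runs of whitespace and
    # drops leading/trailing whitespace, so no token is empty.
    for key_term in query.split():
        for x in range(len(core_text)):
            if core_text[x].lower() == key_term[0].lower():
                if core_text[x:x + len(key_term)].lower() == key_term.lower():
                    matches.append(x)
    return matches


if __name__ == "__main__":
    text = "Total revenue grew"
    try:
        locate_matches_naive("total  revenue", text)  # note the double space
    except IndexError as err:
        print("naive version failed:", err)
    print("safe version:", locate_matches_safe("total  revenue", text))
```

If the real tokenization differs, the same idea still applies: skip or filter out empty terms before indexing into them.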

ucekmez commented 6 months ago

Please see https://github.com/llmware-ai/llmware/pull/470/commits/4f6a7b5402b5a1bd3751292d0bcba79340da0e23, which fixes the mentioned bug (part of PR https://github.com/llmware-ai/llmware/pull/470).

chair300 commented 6 months ago

This does not fix the bug. I have already submitted the fix in PR #471.