Open reuschling opened 7 months ago
@reuschling I will introduce ignoreMissing and throw an error only if ignoreMissing is false and none of the fields in the context_field_list is present. How does that sound?
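To make the proposed semantics concrete, here is a minimal Python sketch of the per-hit check. It is only an illustration under stated assumptions, not the plugin's actual Java code: the function name `collect_context` and the `ignore_missing` parameter are hypothetical stand-ins for whatever the processor ends up using.

```python
from typing import Any, Dict, List


def collect_context(
    hit: Dict[str, Any],
    context_field_list: List[str],
    ignore_missing: bool = False,
) -> List[str]:
    """Gather the context fields that are present in one search hit.

    Hypothetical helper: fails only when ignore_missing is False AND
    none of the listed fields exists in the hit's _source.
    """
    source = hit.get("_source", {})
    # Keep the configured field order; skip fields this hit does not have.
    found = [str(source[field]) for field in context_field_list if field in source]
    if not found and not ignore_missing:
        raise ValueError(
            f"None of the context fields {context_field_list} "
            f"found in search hit {hit.get('_id')}"
        )
    return found


# A hit that has only 'body' no longer fails when 'title' and
# 'description' are also listed.
hit = {"_index": "exampleIndex", "_id": "docId_0", "_source": {"body": "someText"}}
print(collect_context(hit, ["body", "title", "description"]))  # ['someText']
```

With this rule a crawled document that carries only 'body' still contributes context, and a hit with none of the listed fields raises an error unless `ignore_missing` is set.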
This sounds like a great solution, thanks a lot.
In my index, most documents have the field 'body', and some also have 'title' and 'description'. Because the data is crawled, we cannot guarantee that every document has valid data for each field. Nevertheless, it would be nice if e.g. 'description' were considered for generating the answer whenever a document has one.
Currently, every field specified in the "context_field_list" of the RAG pipeline must exist in every search hit. Otherwise I get the error: [ERROR][o.o.s.q.g.GenerativeQAResponseProcessor] [port-4106] Context description not found in search hit { "_index" : "exampleIndex", "_id" : "docId_0", "_score" : 0.7, "_source" : { "body" : " ....someText" ....
I know I could add empty fields to my documents, but one of the key concepts in OpenSearch/Lucene is that not all documents must follow the same 'data schema'. The same holds for search, where only documents with matching fields are returned.
So, for the sake of consistency and robustness, please allow fields in "context_field_list" that do not have to appear in every result document.