Closed: mnvsk97 closed this 1 month ago
Need clarity on two things:
1. Need to update chunk metadata with document metadata, if any, for all the parsers.
2. How does caching work for parsers with the same extension but different configs, given that the caching key is the file extension?
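To make the caching question concrete, here is a minimal sketch of the concern, not the repository's actual cache. All names (`extension_only_key`, `extension_and_config_key`, `get_parser`) and the dict standing in for a parser instance are hypothetical. It shows how a key built from the extension alone returns the first cached parser even when a different config is requested, and one possible alternative that folds a hash of the config into the key.

```python
import hashlib
import json
from typing import Any, Callable, Dict


def extension_only_key(extension: str, config: Dict[str, Any]) -> str:
    # Key ignores the config entirely, so all ".pdf" parsers share one slot.
    return extension.lower()


def extension_and_config_key(extension: str, config: Dict[str, Any]) -> str:
    # Serialize the config deterministically so equal configs hash equally.
    payload = json.dumps(config, sort_keys=True, default=str)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
    return f"{extension.lower()}:{digest}"


def get_parser(
    cache: Dict[str, Dict[str, Any]],
    extension: str,
    config: Dict[str, Any],
    key_fn: Callable[[str, Dict[str, Any]], str],
) -> Dict[str, Any]:
    # The returned dict is a stand-in for a real parser instance.
    key = key_fn(extension, config)
    if key not in cache:
        cache[key] = {"extension": extension, "config": dict(config)}
    return cache[key]


# Extension-only key: the second request silently reuses the first config.
ext_cache: Dict[str, Dict[str, Any]] = {}
first = get_parser(ext_cache, ".pdf", {"chunk_size": 500}, extension_only_key)
second = get_parser(ext_cache, ".pdf", {"chunk_size": 1000}, extension_only_key)

# Extension-plus-config key: each distinct config gets its own entry.
cfg_cache: Dict[str, Dict[str, Any]] = {}
small = get_parser(cfg_cache, ".pdf", {"chunk_size": 500}, extension_and_config_key)
large = get_parser(cfg_cache, ".pdf", {"chunk_size": 1000}, extension_and_config_key)
```

Whether a collision like the first case can actually happen depends on how the real cache is populated, which this sketch does not model.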
Not sure about this, need an example.
@mnvsk97 - if you check the `get_chunks` function of any parser, it takes `file_path` and `metadata` (let's call this `doc_metadata` for the sake of explanation) as input arguments. So when an individual document chunk generates its own metadata, we should also add the `doc_metadata` to the chunk metadata.
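As the example requested above, a minimal sketch of the merge might look like the following. It uses plain dicts in place of the project's chunk objects, and `attach_doc_metadata` is a hypothetical helper, not a function from the repository. One assumption worth flagging: on a key collision, the chunk's own value is kept over the document-level value.

```python
from typing import Any, Dict, List, Optional


def attach_doc_metadata(
    chunks: List[Dict[str, Any]],
    doc_metadata: Optional[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Return copies of the chunks with doc_metadata merged into each chunk's metadata."""
    doc_metadata = doc_metadata or {}
    merged = []
    for chunk in chunks:
        chunk_metadata = chunk.get("metadata") or {}
        # Document-level keys go first so chunk-level keys win on collision.
        merged.append({**chunk, "metadata": {**doc_metadata, **chunk_metadata}})
    return merged


# Hypothetical input: two chunks with their own metadata, one document-level dict.
chunks = [
    {"page_content": "first chunk", "metadata": {"page_number": 1}},
    {"page_content": "second chunk", "metadata": {"page_number": 2, "source": "chunk-level"}},
]
doc_metadata = {"source": "doc-level", "author": "example"}

result = attach_doc_metadata(chunks, doc_metadata)
```

The helper builds new dicts instead of mutating its inputs, so the same `doc_metadata` can be reused safely across documents.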
Another thing that is missing: QueryController's `required_metadata` list should be updated to send pre-signed URLs in the response.
https://github.com/mnvsk97/cognita/blob/16c20fb006e9065cc435439d77a44bd58e3981a6/backend/modules/query_controllers/base.py#L24
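To illustrate why the list matters, here is a rough sketch that assumes `required_metadata` acts as an allow-list of metadata keys returned to the client. That behavior, the key names (`source`, `page_number`, `signed_url`), and the `filter_metadata` helper are all assumptions for illustration; the linked `base.py` is the authority on what the controller actually does.

```python
from typing import Any, Dict, List

# Hypothetical allow-lists; the real list lives in the linked base.py.
REQUIRED_METADATA_BEFORE: List[str] = ["source", "page_number"]
REQUIRED_METADATA_AFTER: List[str] = ["source", "page_number", "signed_url"]


def filter_metadata(metadata: Dict[str, Any], required: List[str]) -> Dict[str, Any]:
    # Keep only allow-listed keys that are present in the chunk metadata.
    return {key: metadata[key] for key in required if key in metadata}


# Hypothetical chunk metadata as stored alongside the chunk.
chunk_metadata = {
    "source": "report.pdf",
    "page_number": 3,
    "signed_url": "https://example.com/report.pdf?signature=abc",
    "internal_hash": "deadbeef",
}

before = filter_metadata(chunk_metadata, REQUIRED_METADATA_BEFORE)
after = filter_metadata(chunk_metadata, REQUIRED_METADATA_AFTER)
```

Under this assumption, a pre-signed URL stored in the chunk metadata never reaches the response until its key is added to the list.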
Resolved all the above-mentioned points in the latest commits.
read signed URL of a data dir and local file path_signed_url in the Q/A API response to enable UI to reference and provide links to the source files.
`MultiModalParser` `__init__`
`langchain-openai` from `0.1.20` to `0.1.25` to fix issues introduced in https://github.com/truefoundry/cognita/pull/371
`openai` and `orjson`