kirk-marple opened this issue 1 year ago
Thanks for the feedback. We are actively working on this feature but have no ETA at this time.
Excellent. We are also interested in this feature.
This seems like a really key problem to solve: how many real-world documents are small enough to include directly in a prompt? Carving up a document (on ingestion or indexing) so that just the relevant portions can be found and retrieved into a prompt seems like a blocking requirement. I wonder how others are solving this problem with Cognitive Search.
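For illustration, here is a minimal sketch of what ingestion-time chunking can look like (plain Python, word-window based; the window and overlap sizes are placeholders, not tuned values):

```python
# Minimal word-window chunker: split a long document into overlapping
# pieces small enough to embed or drop into a prompt.
def chunk_text(text: str, max_words: int = 400, overlap: int = 50) -> list[str]:
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + max_words]
        if window:
            chunks.append(" ".join(window))
        if start + max_words >= len(words):
            break
    return chunks

# Example: a 10,000-word transcript becomes ~29 overlapping chunks.
# chunks = chunk_text(open("transcript.txt").read())
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk.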
Related to this, I found two fields in an index created by the Custom Answering service. My plan was to use embeddings and build a similar QnA service, as suggested in many Microsoft slides, but after seeing those fields I wonder whether this pattern is already implemented in the Custom Answering service.
Any news? This seems like quite a common use case.
For our use case, we are ingesting long documents and audio transcripts. The amount of text we're starting with exceeds the 8K-token limit of the Ada embedding model.
So we need to create multiple embeddings from each piece of content.
Since we can only store one vector per search document, I had to come up with a hacky workaround: store n search documents per piece of content (one parent search document plus n child search documents, where n is the number of chunks). A sketch of the pattern is below.
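Roughly, the workaround looks like this, assuming the azure-search-documents SDK. The index name (`content-chunks`), field names (`parent_id`, `content_vector`, etc.), and the `chunk_text()`/`embed()` helpers are all hypothetical:

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

def chunk_text(text: str) -> list[str]:
    ...  # split into <=8K-token pieces, e.g. the chunker sketched earlier in this thread

def embed(text: str) -> list[float]:
    ...  # hypothetical helper: call the Ada embedding model on one chunk

client = SearchClient(
    endpoint="https://<service>.search.windows.net",
    index_name="content-chunks",
    credential=AzureKeyCredential("<api-key>"),
)

def index_content(content_id: str, text: str) -> None:
    # One parent document carries the content-level metadata...
    docs = [{"id": content_id, "doc_type": "parent"}]
    # ...and each chunk becomes a child document with its own vector,
    # pointing back at the parent so hits can be grouped per content.
    for i, chunk in enumerate(chunk_text(text)):
        docs.append({
            "id": f"{content_id}-chunk-{i}",
            "doc_type": "child",
            "parent_id": content_id,
            "content": chunk,
            "content_vector": embed(chunk),
        })
    client.upload_documents(documents=docs)
```

At query time, vector hits on child documents get de-duplicated by `parent_id` to recover the original content, which is the part that feels hacky.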
If the Cog Search index could support a collection of complex types, each of which included a vector, it would make this scenario much cleaner for these use cases.
Currently, it errors with "Only a top-level field of the index can be a vector field."
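For reference, this is the schema shape in question, sketched with the azure-search-documents SDK models (vector parameter names have shifted across preview/GA releases, so treat them as illustrative):

```python
from azure.search.documents.indexes.models import (
    ComplexField, SearchField, SearchFieldDataType, SimpleField
)

# Rejected today: a vector field nested inside a Collection(ComplexType)
# triggers "Only a top-level field of the index can be a vector field."
chunks_field = ComplexField(
    name="chunks",
    collection=True,
    fields=[
        SimpleField(name="text", type=SearchFieldDataType.String),
        SearchField(
            name="vector",
            type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
            vector_search_dimensions=1536,            # e.g. text-embedding-ada-002
            vector_search_profile_name="my-profile",  # hypothetical profile name
        ),
    ],
)

# Supported today: a single top-level vector field per search document,
# which is what forces the parent/child workaround above.
vector_field = SearchField(
    name="content_vector",
    type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
    vector_search_dimensions=1536,
    vector_search_profile_name="my-profile",
)
```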