Closed by Gautam-Rajeev 8 months ago
Hi @GautamR-Samagra, is this still open to the community? I would like to work on this.
Thanks
@AbhishekRP2002 Not open to the community yet; I picked this up myself. I will let you know if there is anywhere I need help from the community.
Highlights of trying to improve retrieval
Improve embeddings: Create a setup that, for a given set of chunks and question-answer pairs, compares embedding retriever and reranker combinations against each other. Also include OpenAI's v3 text embeddings in the comparison.
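A rough sketch of what such a comparison setup could look like, to make the idea concrete. Everything here is an assumption for illustration: the `Retriever` and `Reranker` signatures, the function names, and the choice of recall@k and MRR as metrics are not from the actual codebase. The two toy retrievers rank by token overlap and only stand in for real embedding models; a real model (for example OpenAI's v3 text embeddings) would be wrapped behind the same `Retriever` signature.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interfaces: a retriever ranks all chunk indices best-first,
# a reranker reorders a shortlist of candidate indices.
Retriever = Callable[[str, Sequence[str]], List[int]]
Reranker = Callable[[str, Sequence[str], List[int]], List[int]]


def _tokens(text: str) -> set:
    return set(text.lower().split())


def _jaccard(a: str, b: str) -> float:
    ta, tb = _tokens(a), _tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def overlap_retriever(question: str, chunks: Sequence[str]) -> List[int]:
    """Toy stand-in for an embedding model: rank by shared-token count."""
    q = _tokens(question)
    return sorted(range(len(chunks)), key=lambda i: -len(q & _tokens(chunks[i])))


def jaccard_retriever(question: str, chunks: Sequence[str]) -> List[int]:
    """Second toy stand-in: rank by Jaccard similarity of token sets."""
    return sorted(range(len(chunks)), key=lambda i: -_jaccard(question, chunks[i]))


def no_rerank(question: str, chunks: Sequence[str], candidates: List[int]) -> List[int]:
    """Baseline reranker: keep the retriever's order."""
    return list(candidates)


def jaccard_rerank(question: str, chunks: Sequence[str], candidates: List[int]) -> List[int]:
    """Toy reranker: reorder the shortlist by Jaccard similarity."""
    return sorted(candidates, key=lambda i: -_jaccard(question, chunks[i]))


def evaluate(retriever: Retriever, reranker: Reranker, chunks: Sequence[str],
             qa_pairs: Sequence[Tuple[str, int]], k: int = 3, pool: int = 10) -> Dict[str, float]:
    """Compute recall@k and MRR; each QA pair is (question, gold chunk index)."""
    hits, reciprocal_ranks = 0, 0.0
    for question, gold in qa_pairs:
        shortlist = retriever(question, chunks)[:pool]
        ranked = reranker(question, chunks, shortlist)
        if gold in ranked[:k]:
            hits += 1
        if gold in ranked:
            reciprocal_ranks += 1.0 / (ranked.index(gold) + 1)
    n = len(qa_pairs)
    return {"recall_at_k": hits / n, "mrr": reciprocal_ranks / n}


def compare(retrievers: Dict[str, Retriever], rerankers: Dict[str, Reranker],
            chunks: Sequence[str], qa_pairs: Sequence[Tuple[str, int]], k: int = 3) -> List[dict]:
    """Evaluate every retriever-reranker combination, best MRR first."""
    rows = []
    for (r_name, r), (rr_name, rr) in product(retrievers.items(), rerankers.items()):
        metrics = evaluate(r, rr, chunks, qa_pairs, k=k)
        rows.append({"retriever": r_name, "reranker": rr_name, **metrics})
    return sorted(rows, key=lambda row: -row["mrr"])
```

Calling `compare` with a dict of named retrievers and a dict of named rerankers returns one row per combination, so adding a new embedding model to the comparison only means adding one more entry to the retrievers dict.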
Fine-tune embeddings: We already have chunks and questions. Currently we create the embeddings dataset simply by treating the first chunk from which a question is answered as the correct chunk to retrieve (score 1) and every other chunk as incorrect (score 0). We need to set this up so that chunks with more subtle differences are created and scored more naturally, which gives a better setup for fine-tuning embeddings. We also need to set up code for quick fine-tuning and for testing the fine-tuned embeddings.
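To illustrate the difference between the current binary labels and a more graded scheme, here is a minimal sketch. The graded heuristic is purely an assumption for illustration and not the approach decided on in this issue: it scores each non-gold chunk by its token-level Jaccard similarity to the gold chunk, capped at a hypothetical `max_negative` value, so near-duplicate chunks are not labelled as hard zeros. The `Example` type and both function names are made up for this sketch.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Example(NamedTuple):
    question: str
    chunk: str
    score: float


def _jaccard(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def binary_examples(chunks: Sequence[str],
                    qa_pairs: Sequence[Tuple[str, int]]) -> List[Example]:
    """Current scheme: the gold chunk scores 1.0, every other chunk 0.0."""
    return [
        Example(question, chunk, 1.0 if i == gold else 0.0)
        for question, gold in qa_pairs
        for i, chunk in enumerate(chunks)
    ]


def graded_examples(chunks: Sequence[str],
                    qa_pairs: Sequence[Tuple[str, int]],
                    max_negative: float = 0.8) -> List[Example]:
    """Assumed alternative: non-gold chunks get a soft score based on
    lexical similarity to the gold chunk, capped below the gold score."""
    examples = []
    for question, gold in qa_pairs:
        for i, chunk in enumerate(chunks):
            if i == gold:
                score = 1.0
            else:
                score = min(max_negative, _jaccard(chunk, chunks[gold]))
            examples.append(Example(question, chunk, score))
    return examples
```

Float scores like these could feed a regression-style similarity loss during fine-tuning, whereas the binary scheme treats a near-identical chunk the same as an unrelated one. A lexical measure is only a placeholder here; scores from a stronger model or from human annotation would likely be more natural.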
Some earlier work done on simpler retrieval testing here