FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs
MIT License

Some things about llama-index evaluation #805

Open aagq opened 6 months ago

aagq commented 6 months ago

Could you tell me which dataset you used in your llama-index evaluation, and how you ran it? Thanks.

545999961 commented 6 months ago
  1. The dataset is the paper "Llama 2: Open Foundation and Fine-Tuned Chat Models".
  2. To run the evaluation, you need to modify the "SentenceTransformerRerank" class in llama_index so that it uses the "LLMReranker" class from FlagEmbedding instead; see the sketch below.
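For illustration, here is a minimal sketch (not the repository's own evaluation code) of swapping llama_index's SentenceTransformerRerank for a postprocessor backed by FlagEmbedding. It assumes FlagEmbedding's FlagReranker/compute_score API and llama_index's BaseNodePostprocessor interface; the module paths and the reranker model name may differ across library versions, and the LLMReranker mentioned above could be plugged in the same way.

```python
# Sketch: replace SentenceTransformerRerank with a FlagEmbedding-backed reranker.
# Assumptions: FlagEmbedding's FlagReranker.compute_score API and llama_index's
# BaseNodePostprocessor interface (module paths vary by llama_index version).
from typing import List, Optional

from FlagEmbedding import FlagReranker
from llama_index.core.postprocessor.types import BaseNodePostprocessor
from llama_index.core.schema import NodeWithScore, QueryBundle


class FlagEmbeddingRerank(BaseNodePostprocessor):
    """Rerank retrieved nodes with a FlagEmbedding reranker."""

    top_n: int = 5
    model_name: str = "BAAI/bge-reranker-large"

    def _postprocess_nodes(
        self,
        nodes: List[NodeWithScore],
        query_bundle: Optional[QueryBundle] = None,
    ) -> List[NodeWithScore]:
        if query_bundle is None or not nodes:
            return nodes
        # In practice the reranker should be loaded once and cached; it is
        # created here only to keep the sketch self-contained.
        reranker = FlagReranker(self.model_name, use_fp16=True)
        pairs = [[query_bundle.query_str, n.node.get_content()] for n in nodes]
        scores = reranker.compute_score(pairs)
        # compute_score returns a single float when there is only one pair.
        if not isinstance(scores, list):
            scores = [scores]
        for node, score in zip(nodes, scores):
            node.score = float(score)
        # Keep only the top_n highest-scoring nodes.
        return sorted(nodes, key=lambda n: n.score, reverse=True)[: self.top_n]
```

A postprocessor like this can then be passed to a query engine via `node_postprocessors` in place of SentenceTransformerRerank.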
aagq commented 6 months ago

Another question: do you use an Anthropic LLM to generate the Question-Context pairs?

545999961 commented 6 months ago

We don't use an Anthropic LLM to generate the Question-Context pairs.

aagq commented 6 months ago

So, how do you generate Question-Context Pairs?

545999961 commented 6 months ago

We use GPT-3.5 to generate Question-Context Pairs.
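As a rough illustration, such pairs can be generated with GPT-3.5 through llama_index's generate_question_context_pairs helper, as sketched below; the file name, chunk size, and number of questions per chunk are placeholders rather than the settings actually used here.

```python
# Illustrative sketch of generating Question-Context pairs with GPT-3.5 via
# llama_index's generate_question_context_pairs helper. "llama2.pdf", the chunk
# size, and num_questions_per_chunk are placeholder values.
from llama_index.core import SimpleDirectoryReader
from llama_index.core.evaluation import generate_question_context_pairs
from llama_index.core.node_parser import SentenceSplitter
from llama_index.llms.openai import OpenAI

# Load the paper and split it into nodes (chunks).
documents = SimpleDirectoryReader(input_files=["llama2.pdf"]).load_data()
nodes = SentenceSplitter(chunk_size=512).get_nodes_from_documents(documents)

# Ask GPT-3.5 to write questions answerable from each chunk.
llm = OpenAI(model="gpt-3.5-turbo")
qa_dataset = generate_question_context_pairs(nodes, llm=llm, num_questions_per_chunk=2)
qa_dataset.save_json("llama2_qa_pairs.json")
```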

aagq commented 6 months ago

How many Question-Context pairs did you generate?

545999961 commented 6 months ago

Following llama_index, we use the pages from the start of the paper up to page 36 for the experiment, which excludes the table of contents, references, and appendix.
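One way to apply that page restriction is sketched below; it assumes SimpleDirectoryReader yields one Document per PDF page, and the file name is a placeholder.

```python
# Hypothetical sketch: keep only the first 36 pages of the Llama 2 paper.
# SimpleDirectoryReader typically yields one Document per PDF page, so a
# simple slice keeps pages 1-36; "llama2.pdf" is a placeholder file name.
from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader(input_files=["llama2.pdf"]).load_data()
first_36_pages = documents[:36]
print(f"Kept {len(first_36_pages)} of {len(documents)} pages")
```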