OpenPecha / rag_prep_tool

MIT License

RAG0008: Question and Answers generation (2) #11

Closed: tenzin3 closed this issue 1 week ago

tenzin3 commented 2 weeks ago

Description

For each chapter of one book, we generate 100 question-and-answer pairs using the system instructions and prompts below:

Question Generation

You are a devoted and curious Buddhist reading a chapter of the book 'The Art of Happiness at Work' by 
His Holiness the 14th Dalai Lama.
Given the provided context, your task is to generate 100 interesting and factually correct questions that require a deep 
understanding of the topics in the provided context.

Follow the guidelines:
- Only generate questions that can be answered using the provided context.
- Keep the questions factual.
- Do not write from the perspective of the author.

Context:
{context}

Answer Generation

You are a devoted and curious Buddhist reading a chapter of the book 'The Art of Happiness at Work' by 
His Holiness the 14th Dalai Lama.
Using the context as principal source of information answer the below question and include the source references used to generate the response.

Follow the guidelines:
- Only use the information included in the context to generate the answer.
- Use an appropriate and domain specific style and vocabulary to answer the question.

Question: {question}
Context: {context}
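
A minimal sketch of how the two templates above could be filled in, assuming plain Python string formatting. The issue does not say how the prompts are assembled, so the names `QUESTION_TEMPLATE`, `ANSWER_TEMPLATE`, `build_question_prompt` and `build_answer_prompt` are hypothetical; only the prompt wording comes from the text above (line breaks inside sentences are joined).

```python
# Prompt wording copied from the issue; the Python wrapping is an assumption.
QUESTION_TEMPLATE = """You are a devoted and curious Buddhist reading a chapter of the book 'The Art of Happiness at Work' by His Holiness the 14th Dalai Lama.
Given the provided context, your task is to generate 100 interesting and factually correct questions that require a deep understanding of the topics in the provided context.

Follow the guidelines:
- Only generate questions that can be answered using the provided context.
- Keep the questions factual.
- Do not write from the perspective of the author.

Context:
{context}
"""

ANSWER_TEMPLATE = """You are a devoted and curious Buddhist reading a chapter of the book 'The Art of Happiness at Work' by His Holiness the 14th Dalai Lama.
Using the context as principal source of information answer the below question and include the source references used to generate the response.

Follow the guidelines:
- Only use the information included in the context to generate the answer.
- Use an appropriate and domain specific style and vocabulary to answer the question.

Question: {question}
Context: {context}
"""


def build_question_prompt(context: str) -> str:
    """Fill the question-generation template with one chapter's text."""
    return QUESTION_TEMPLATE.format(context=context)


def build_answer_prompt(question: str, context: str) -> str:
    """Fill the answer-generation template for a single question."""
    return ANSWER_TEMPLATE.format(question=question, context=context)
```

`str.format` only interprets braces in the template itself, so chapter text that happens to contain `{` or `}` is inserted unchanged.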

Both question and answer generation use gpt-4o and gpt-4-turbo. The resulting question-and-answer pairs are used to fine-tune the embedding model and to evaluate the RAG pipeline.
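
The issue does not describe how the 100 questions are pulled out of the model's reply. One possible approach, sketched here under the assumption that the model returns one question per line (optionally numbered or bulleted), is a small parser; `parse_questions` is a hypothetical helper, not part of the tool.

```python
import re

# Leading list markers such as "1.", "2)", "-" or "*" (assumed reply format).
_LIST_MARKER = re.compile(r"^\s*(?:\d+[.)]|[-*])\s*")


def parse_questions(raw: str) -> list[str]:
    """Extract unique questions from a raw model reply, keeping their order."""
    questions: list[str] = []
    seen: set[str] = set()
    for line in raw.splitlines():
        text = _LIST_MARKER.sub("", line).strip()
        # Skip blank lines and any preamble that is not phrased as a question.
        if not text.endswith("?"):
            continue
        # Drop exact duplicates so each question is answered only once.
        if text in seen:
            continue
        seen.add(text)
        questions.append(text)
    return questions
```

If the model is instead asked to reply in JSON, this step would be replaced by `json.loads` on the reply.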

Expected Output

The output will be a JSON file containing a list of objects, each with a question, its answer, and the sources used, as shown below.

[
   {"question": "", "answer": "", "sources": ["", ...]},
   {"question": "", "answer": "", "sources": ["", ...]},
   ...
]
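
A quick check of a generated file against the shape above might look like the following sketch. It assumes every item must carry a non-empty `question` and `answer` string and a `sources` list of strings; the issue only shows the key names, so the stricter rules and the `validate_pairs` helper are assumptions.

```python
import json

REQUIRED_KEYS = {"question", "answer", "sources"}


def validate_pairs(pairs) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    if not isinstance(pairs, list):
        return ["top-level value must be a list"]
    errors: list[str] = []
    for i, item in enumerate(pairs):
        if not isinstance(item, dict):
            errors.append(f"item {i}: expected an object")
            continue
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            errors.append(f"item {i}: missing keys {sorted(missing)}")
            continue
        for key in ("question", "answer"):
            value = item[key]
            if not isinstance(value, str) or not value.strip():
                errors.append(f"item {i}: '{key}' must be a non-empty string")
        sources = item["sources"]
        if not isinstance(sources, list) or not all(isinstance(s, str) for s in sources):
            errors.append(f"item {i}: 'sources' must be a list of strings")
    return errors


def dump_pairs(pairs) -> str:
    """Serialize validated pairs; ensure_ascii=False keeps non-ASCII text readable."""
    problems = validate_pairs(pairs)
    if problems:
        raise ValueError("; ".join(problems))
    return json.dumps(pairs, ensure_ascii=False, indent=2)
```

Running `validate_pairs(json.load(f))` on an output file before fine-tuning or evaluation would catch malformed items early.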