SuryaKrishna02 / maya-dataset-creation

This repository contains the dataset-creation code for training Maya: Multilingual Aya Model.
MIT License

Evaluation script added #9

Closed by pilot-j 2 months ago

pilot-j commented 2 months ago

Evaluation Script: Prompt Quality Assessment

Use eval_script.py to evaluate the quality of responses based on custom prompts.

Arguments (as used in the example command below):

  * --base_path: directory where the evaluation reports are written
  * --eval_csv: path to the evaluation dataset CSV
  * --prompt_file: path to the file containing the prompts

Prompt Class Description

The Prompt class is designed to encapsulate and structure the information needed for generating and evaluating language model prompts. It consists of three key attributes: translate_to (the target language), preamble (the instruction), and message (the input text).

Example Usage:

Given a prompt like:

prompt = Prompt(translate_to="Hindi", preamble="Translate", message="Input")

This creates a prompt instructing that the input text "Input" be translated into Hindi. The Prompt class structures this data so it can be used effectively within the evaluation script.
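The description above can be sketched as a small dataclass. This is a minimal illustration, not the actual class from eval_script.py; the `render` method and its exact formatting are assumptions for demonstration.

```python
from dataclasses import dataclass


@dataclass
class Prompt:
    """Bundles the three fields used to build a translation prompt.

    Attribute names follow the example above; the real class in
    eval_script.py may carry additional logic.
    """
    translate_to: str  # target language, e.g. "Hindi"
    preamble: str      # instruction prefix, e.g. "Translate"
    message: str       # the input text to translate

    def render(self) -> str:
        # Hypothetical formatting showing how the pieces might combine.
        return f"{self.preamble} to {self.translate_to}: {self.message}"


prompt = Prompt(translate_to="Hindi", preamble="Translate", message="Input")
```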

Prompt Format: The prompt file contains a list of prompts; each prompt is a list of three strings (translate_to, preamble, and message) that is wrapped into a Prompt object. Example prompt file content:

[['Hindi', 'Translate', 'Input'], ['Spanish', 'Translate to Spanish', 'Input Text']]
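Since the example file content is a Python-style literal, it can be parsed with the standard library. The helper below is a hypothetical sketch (eval_script.py's actual parsing may differ); it returns each entry as a (translate_to, preamble, message) tuple.

```python
import ast


def load_prompts(path):
    """Read a prompt file like the example above and return its
    (translate_to, preamble, message) triples.

    Hypothetical helper: assumes the file holds a single Python-style
    list-of-lists literal, as shown in the example content.
    """
    with open(path, encoding="utf-8") as f:
        entries = ast.literal_eval(f.read())
    return [tuple(entry) for entry in entries]
```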

Output: Generates evaluation reports named as:

prompt_<prompt_no>_<language>_eval_report.csv

Example: prompt_0_Hindi_eval_report.csv
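The naming scheme above can be expressed as a one-line helper. This is a hypothetical sketch; eval_script.py may construct the filename inline rather than through a function like this.

```python
def report_name(prompt_no: int, language: str) -> str:
    """Build an evaluation-report filename following the pattern
    prompt_<prompt_no>_<language>_eval_report.csv described above."""
    return f"prompt_{prompt_no}_{language}_eval_report.csv"


# Matches the example in the text:
# report_name(0, "Hindi") -> "prompt_0_Hindi_eval_report.csv"
```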

Usage:

python3 eval_script.py --base_path ./output_reports --eval_csv ./evaluation_dataset.csv --prompt_file ./prompts.txt
pilot-j commented 2 months ago

Great work. The following minor changes need to be made:

  1. Update the requirements.txt file.
  2. Remove unused import statements.
  3. Remove unnecessary arguments.
  4. If possible, annotate the arguments with their datatypes, as in translation/verification.py.

The necessary changes have been made. Please review and let me know.