CliDyn / climsight

Prototype of a system that provides local climate information
BSD 3-Clause "New" or "Revised" License

Evaluation #112

Closed kuivi closed 1 month ago

kuivi commented 1 month ago

It’s time to introduce evaluation to Climsight. Here are some initial changes:

•   Added verbose prints in the terminal version.
•   Made some small adjustments to the list of packages.

Finally, the evaluation process has been set up. The main file is evaluation.py.

The idea is simple:

1.  Read Q&A pairs from the IPCC and a self-made Q&A set based on the GERICS report.
2.  Ask Climsight these questions.
3.  Evaluate the answers with an LLM by comparing them with the “original” answers.
4.  Parse the evaluation results from the LLM.
5.  Save the report to the evaluation/ directory as YAML and TXT files.
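
For readers who want a feel for the flow, the five steps above can be sketched roughly as follows. This is an illustration only, not the code in evaluation.py: the function names (`parse_score`, `evaluate`, `save_report`), the prompt wording, the 1-5 score format, and the report file names are all assumptions. The Climsight call and the judging LLM are passed in as plain callables so the sketch does not depend on any particular API, and the YAML is written by hand (using JSON-quoted scalars, which YAML accepts) to avoid an extra dependency.

```python
import json
import re
from pathlib import Path
from typing import Callable, Optional

# Hypothetical judge prompt; the real evaluation.py may phrase this differently.
PROMPT = (
    "Compare the candidate answer with the reference answer.\n"
    "Question: {question}\nReference: {reference}\nCandidate: {candidate}\n"
    "Reply with a line 'Score: <1-5>' followed by a short justification."
)


def parse_score(reply: str) -> Optional[int]:
    """Step 4: pull the integer after 'Score:' out of the judge reply."""
    match = re.search(r"score\s*:\s*(\d+)", reply, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None


def evaluate(qa_pairs: list, ask: Callable[[str], str],
             judge: Callable[[str], str]) -> list:
    """Steps 2-4: ask each question, have the judge compare, parse the score.

    qa_pairs is a list of {"question": ..., "answer": ...} dicts (step 1),
    ask stands in for Climsight, judge stands in for the evaluating LLM.
    """
    results = []
    for item in qa_pairs:
        candidate = ask(item["question"])
        reply = judge(PROMPT.format(question=item["question"],
                                    reference=item["answer"],
                                    candidate=candidate))
        results.append({
            "question": item["question"],
            "reference": item["answer"],
            "candidate": candidate,
            "score": parse_score(reply),
            "judge_reply": reply,
        })
    return results


def save_report(results: list, out_dir: str = "evaluation"):
    """Step 5: write the results as a YAML file and a plain-text file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # JSON-encoded scalars are valid YAML, so each value is dumped with json.
    yaml_lines = []
    for result in results:
        for i, (key, value) in enumerate(result.items()):
            prefix = "- " if i == 0 else "  "
            yaml_lines.append(f"{prefix}{key}: {json.dumps(value)}")
    yaml_path = out / "report.yaml"
    yaml_path.write_text("\n".join(yaml_lines) + "\n", encoding="utf-8")

    txt_lines = [f"[{r['score']}] {r['question']}" for r in results]
    txt_path = out / "report.txt"
    txt_path.write_text("\n".join(txt_lines) + "\n", encoding="utf-8")
    return yaml_path, txt_path
```

With real backends, `ask` would wrap a Climsight query and `judge` would wrap a call to the evaluating LLM; for a dry run, both can be replaced by simple lambdas that return canned strings.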