Gapminder / gapminder-ai


Make it possible to mimic how surveys are conducted with humans #13

Closed semio closed 1 year ago

semio commented 1 year ago

the goal:

In this PR, I made the following changes:

Prompt Variation

Session Result

Added a Session Result DataFrame to AiEvalData (and to the AI eval spreadsheet), which holds the raw results from all sessions. Some notable columns:

(This DataFrame will grow large if we have many surveys and model configs, so it might be good to store it somewhere else. I suggest that when it becomes too large, we export it to a Google Drive folder and clear the contents of the Sessions sheet. In the Google Drive folder, we name the files session.log.1.csv, session.log.2.csv, etc., just like log file management in Linux.)
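The numbered-file scheme suggested above could be sketched roughly as follows. This is only an illustration, not code from the PR: it writes to a local folder rather than Google Drive, the function names `next_log_path` and `archive_sessions` are hypothetical, and it assumes that higher numbers mean newer files (logrotate on Linux instead shifts older files to higher numbers).

```python
import csv
import re
from pathlib import Path

# Matches the file names proposed in the PR: session.log.1.csv, session.log.2.csv, ...
LOG_PATTERN = re.compile(r"^session\.log\.(\d+)\.csv$")


def next_log_path(folder: Path) -> Path:
    """Return the next unused session.log.N.csv path in `folder`."""
    numbers = [
        int(m.group(1))
        for p in folder.iterdir()
        if (m := LOG_PATTERN.match(p.name))
    ]
    # Assumption: numbering grows with time, so the newest file has the largest N.
    return folder / f"session.log.{max(numbers, default=0) + 1}.csv"


def archive_sessions(rows: list[dict], folder: Path) -> Path:
    """Write session rows to the next numbered log file and return its path."""
    folder.mkdir(parents=True, exist_ok=True)
    target = next_log_path(folder)
    fieldnames = list(rows[0].keys()) if rows else []
    with target.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return target
```

After a successful export, the Sessions sheet would be cleared as a separate step; that part depends on the spreadsheet API in use and is left out here.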

Model Configurations

Helpers and Notebook

semio commented 1 year ago

I wonder if this replaces some of the previous code? We should still have the option to run evaluations without memory, as per the original pilot plans.

Yes, some of the previous code in helpers.py has been updated, mainly the procedure for sending questions to LLMs. It now checks whether memory is enabled in the model config and adds memory to the LLMChain if it is. So yes, we can run evaluations with or without memory.

You can toggle memory in the "Model Configurations" sheet of the AI Evaluation spreadsheet.
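To illustrate the toggle described above, here is a minimal sketch in plain Python. It is not the code in helpers.py and does not use the LangChain API. `ModelConfig`, `Session`, and `ask_llm` are hypothetical stand-ins, and the real column name for the memory flag in the sheet may differ.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ModelConfig:
    # Hypothetical fields; the actual "Model Configurations" columns may differ.
    model_id: str
    memory: bool = False


@dataclass
class Session:
    config: ModelConfig
    ask_llm: Callable[[str], str]  # stand-in for the real LLM call
    history: list[tuple[str, str]] = field(default_factory=list)

    def ask(self, question: str) -> str:
        """Send one question, prepending earlier Q/A pairs only if memory is on."""
        if self.config.memory and self.history:
            past = "\n".join(f"Q: {q}\nA: {a}" for q, a in self.history)
            prompt = f"{past}\nQ: {question}"
        else:
            prompt = f"Q: {question}"
        answer = self.ask_llm(prompt)
        if self.config.memory:
            # Remember this exchange so later questions in the session can see it.
            self.history.append((question, answer))
        return answer
```

With `memory=False`, every question is sent on its own, which corresponds to the original pilot setup. With `memory=True`, later questions in the same session carry the earlier exchanges, which is closer to how a human respondent works through a survey.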