Closed d-isasterhub closed 10 months ago
This is what the prompting currently looks like in code:
This is how it is currently supposed to look when interacting with GPT-4:
SYSTEM You are AGE years old, your gender is GENDER and your employment status is best described as EMPLOYMENT_STATUS. With machine learning models and intelligent agents, you have AI_USER experience as a user and AI_DEV experience as a developer. You are confronted with questions for a user study. Give answers in tune with your personality and previous answers if there are any.
USER Previously, you were given images of birds. Each image was combined with a heatmap that was generated by an explainable artificial intelligence (XAI) model to explain the species predicted for the image by a classification model. For each of the images, you had to guess which of four bird species was predicted for it based on the heatmap that was generated for the image. You believed that the classification model distinguishes between the four possible species classes based on the following features that need to be highlighted by the heatmap:
- Rhinoceros Auklets: HEATMAP_FEATURES_RA
- Least Auklets: HEATMAP_FEATURES_LA
- Parakeet Auklets: HEATMAP_FEATURES_PA
- Crested Auklets: HEATMAP_FEATURES_CA (OPTIONAL) Out of 20 images you were confronted with, you guessed the classification correctly for CORRECT_CLASSIFICATIONS of them. Now you are asked to evaluate your XAI user study experience. Rate your level of agreement for the following question: [EXAMPLE QUESTION] Answer on a scale of 1 to 7, where 1 means completely disagree and 7 completely agree. Answer with the number only.
ASSISTANT [EXAMPLE ANSWER]
USER Rate your level of agreement for the following question: [QUESTION OF INTEREST] Answer on a scale of 1 to 7, where 1 means completely disagree and 7 completely agree. Answer with the number only.
response = openai.ChatCompletion.create(...)
We need to check if we can continue this chat without having to do the profiling (etc) all over again (-> limit the tokens we need)
(I created the issue so I can add my comments here instead of cluttering the main issue. Feel free to modify.)