d-isasterhub / IDP-simulated-user-exp

split bird species and agreement results #42

Closed d-isasterhub closed 9 months ago

d-isasterhub commented 10 months ago

Reason

Currently, we have three ways of generating the answers for the bird species classification. For agreement, we will also have to try multiple prompting strategies.

Related

The files can be split as suggested in #35.

However, I am not entirely happy with this solution: the settings used for the individual classification/agreement questions would only be visible through a README, which could become quite confusing.

bowl-of-rais commented 10 months ago

I have the following proposal:

This could look something like:

v1_profile_no_reason/
    - agreement/
        - blind_protocol.txt                    # blind is supposed to mean "without knowing the number of correct classifications"
        - blind_results.csv
        - with_correct_protocol.txt
        - with_correct_results.csv
    - birds/
        - protocol.txt
        - results.csv
v2_profile_reason_deprecated/
    - ...
v3_profile_reason/
    - ...
v4_heatmap_first/
    - ... 

I tried to keep the file names short to make it less cluttered. We can also make longer names that reiterate the info in the folder titles, e.g.:

v1_profile_no_reason/
    - agreement/
        - v1_agree_blind_protocol.txt                    # blind is supposed to mean "without knowing the number of correct classifications"
        - v1_agree_blind_results.csv
        - v1_agree_with_correct_protocol.txt
        - v1_agree_with_correct_results.csv
    - birds/
        - v1_birds_protocol.txt
        - v1_birds_results.csv
v2_profile_reason_deprecated/
    - ...
v3_profile_reason/
    - ...
v4_heatmap_first/
    - ...

d-isasterhub commented 10 months ago

I like the structure. Two thoughts:

  1. We should rename v1/v2_interview then, so it does not become confusing
  2. Maybe we can use blind_protocol.txt -> no_accuracy_protocol.txt and with_correct_protocol.txt -> with_accuracy_protocol.txt instead (even if it is not exactly accuracy but "x out of 20")

d-isasterhub commented 10 months ago

1: profiling + no reasoning
2: no profiling + no reasoning
3: profiling + reasoning
4: no profiling + reasoning
5: profiling + heatmap
6: no profiling + heatmap

bowl-of-rais commented 10 months ago

  1. We should rename v1/v2_interview then, so it does not become confusing

True. Maybe let's do old_interview and interview?

  2. Maybe we can use blind_protocol.txt -> no_accuracy_protocol.txt and with_correct_protocol.txt -> with_accuracy_protocol.txt instead (even if it is not exactly accuracy but "x out of 20")

Thought about that too but wasn't sure :'D but if you think so too, then sure :>

bowl-of-rais commented 10 months ago

Update, because agreement is independent:

v1_profile_no_reason/
    - protocol.txt
    - results.csv
v2_no_profile_no_reason/
    - ...
v3_profile_reason/
    - ...
v4_no_profile_reason/
    - ... 
v5_profile_heatmap/
    - ...
v6_no_profile_heatmap/
    - ...
agreement/
    - with_accuracy_protocol.txt
    - with_accuracy_results.csv
    - without_accuracy_protocol.txt
    - without_accuracy_results.csv

d-isasterhub commented 10 months ago

The Auklets approve

bowl-of-rais commented 10 months ago

Update:

no_reason/
    - profile/
        - protocol.txt
        - results.csv
    - no_profile/
        - protocol.txt
        - results.csv 
reason_profile_first/
    - ...
reason_heatmap_first/
    - ...
agreement/
    - with_accuracy_protocol.txt
    - with_accuracy_results.csv
    - without_accuracy_protocol.txt
    - without_accuracy_results.csv

I would adapt the command-line arguments to just have a --profiling and a --reason option instead of variation numbers.
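
A rough sketch of what the two options could look like with argparse. Everything here is an assumption for illustration: the function names (`build_parser`, `describe`, `no_reason_dir`) are made up, both options are modelled as plain on/off flags, and only the `no_reason/` folders are mapped, because the contents of the `reason_*` folders are not spelled out above.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI: two flags replace the numeric variation argument."""
    parser = argparse.ArgumentParser(description="Simulated user experiment (sketch)")
    # Assumption: both options are simple switches that default to "off".
    parser.add_argument("--profiling", action="store_true",
                        help="include the user profile in the prompt")
    parser.add_argument("--reason", action="store_true",
                        help="ask for reasoning along with the answer")
    return parser


def describe(profiling: bool, reason: bool) -> str:
    """Label a run using the wording from the variation list in this thread."""
    profile_part = "profiling" if profiling else "no profiling"
    reason_part = "reasoning" if reason else "no reasoning"
    return f"{profile_part} + {reason_part}"


def no_reason_dir(profiling: bool) -> Path:
    """Result folder for runs without reasoning, following the layout above."""
    return Path("no_reason") / ("profile" if profiling else "no_profile")


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["--profiling"])
print(describe(args.profiling, args.reason))  # profiling + no reasoning
if not args.reason:
    print((no_reason_dir(args.profiling) / "results.csv").as_posix())
```

With flags like these, the settings of a run can be read directly from the command that produced it, rather than looked up from a variation number in a README.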