msclar opened 1 week ago
Hi Melanie! Thank you for reaching out!
As described in Section 4.2 of our work, we manually labelled 50 random samples from each dataset to obtain the statefulness values. We made a small applet to facilitate the labelling process, and I have just recorded a quick demo for you! Screencast from 2024-11-15 15-32-20.webm
The applet is located in this directory: https://github.com/Flecart/complexity-tom-dwm/tree/main/statefulness/app.
You should run `python3 server.py` and connect to `localhost:8000` to see the interface shown in the video.
Then, to create a state, highlight a sentence or part of it; to remove a state, click the highlighted text.
I strongly suggest serializing the data for the applet into the schema described at this line; the applet might not work if the input JSON doesn't have that format (I have not tested this scenario). Having the prompt, question, and answer is all you need to create the labelled data!
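As a minimal sketch of the serialization step (the field names and the output path here are illustrative assumptions; the authoritative format is the schema linked above):

```python
import json

# Illustrative only: the exact field names and file location must match
# the schema linked above; adjust them to the applet's expectations.
samples = [
    {
        "prompt": "Sally puts the ball in the basket and leaves the room.",
        "question": "Where will Sally look for the ball?",
        "answer": "basket",
    },
]

with open("data.json", "w") as f:  # hypothetical output path
    json.dump(samples, f, indent=2)
```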
I have looked at the parameters we used in our work: we used $\tau = 0.2$ for every dataset, so that's the suggested parameter choice.
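Roughly, the computation shown further below amounts to

$$\text{complexity} = \text{stateful} + \tau \cdot \text{stateless}$$

for each sample.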
And `bash script/gpt-3.5.sh` is the script for the accuracy results of the prompting method we proposed, not for the complexity metric!
For the complexity metric we ran the script at: https://github.com/Flecart/complexity-tom-dwm/blob/main/statefulness/copy_state_data.py.
This will print out the stateful and stateless values for each sample in the data.
Then, in the report we did something similar to the following:

```python
import numpy as np

# paste the output for the stateful values here
tomi = np.array([1, 1, 1, 4, 3, 1, 5, 5, 1, 3, 3, 1, 5, 4, 4, 1, 1, 4, 4, 1, 2, 6, 1, 2, 1, 1, 3, 5, 3, 1, 5, 6, 4, 1, 1, 5, 3, 5, 1, 1, 1, 5, 1, 1, 1, 3, 1, 3, 4, 3], dtype=float)

# paste the stateless values here, weighted by tau
tau = 0.2
tomi += tau * np.array([8, 5, 7, 5, 1, 3, 4, 2, 3, 6, 3, 2, 7, 2, 3, 1, 4, 5, 3, 3, 6, 6, 1, 8, 6, 7, 6, 6, 3, 2, 7, 2, 0, 4, 7, 4, 2, 5, 2, 5, 6, 5, 1, 5, 8, 5, 5, 7, 4, 4])
```
Then, we used boxplots to plot the results.
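As a sketch of that last step (a minimal matplotlib example with placeholder values, not the exact plotting code we used):

```python
import numpy as np
import matplotlib.pyplot as plt

# `tomi` is the combined per-sample complexity from the snippet above;
# placeholder values are used here so the sketch runs standalone.
tomi = np.array([1.0 + 0.2 * 8, 1.0 + 0.2 * 5, 4.0 + 0.2 * 5, 3.0 + 0.2 * 1])

# one box per dataset; pass additional arrays to compare datasets
plt.boxplot([tomi], labels=["ToMi"])
plt.ylabel("complexity")
plt.show()
```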
If you need further assistance, feel free to reach out!
Thank you for the great work and for releasing the code!
If we wanted to compute the complexity for a new dataset, what would be the steps to do so?
I see that `data/<dataset>/splits.json` already has the `num_states` and `num_highlights`. For the dataset I'm interested in, I solely have the prompt, question, & answer. Once I populate this file correctly, what parameter choices would be best to report? Would it be correct to say that after making these modifications, `bash script/gpt-3.5.sh` should yield the results I need, or am I missing anything? Thanks in advance!