empiricaly / empirica

Open source project to tackle the problem of long development cycles required to produce software to conduct multi-participant and real-time human experiments online.
https://empirica.ly/
Apache License 2.0

Add exporting entire state history to csv (not just final values) #569

Open WhiteJP opened 1 month ago

WhiteJP commented 1 month ago

Is there an existing issue for this?

Is your feature request related to a problem?

It is related to the (now fixed) problems with the length of fields in the export (https://github.com/empiricaly/empirica/issues/565).

Describe the solution you'd like

Maybe an option in empirica export that lets you export the value of attributes over time from the tajriba.json. This would result in a CSV in which each row shows the state of an attribute at time t.

This definitely isn't anything urgent, as there are reasonable workarounds, but it would be a cool feature :)

Describe alternatives you've considered

  1. Append updates together into one attribute. This works, but you still have to parse the attribute afterwards to reconstruct the state at a given timepoint.

  2. Set an attribute each time the state changes, with different keys, e.g., "state1", "state2".

Teachability, Documentation, Adoption, Migration Strategy

No response

Code of Conduct

npaton commented 1 month ago

I've been wanting to do this, but I'm not sure how. Here are some example outputs.

Repeat all attributes if one attribute changes. Noisy.

id,attribute1,attribute1LastChangedAt,attribute2,attribute2LastChangedAt
01J2VWZ4ASWXBXRHF6QMYXZ6CY,foo,2020-01-01T00:00:00Z,bar,2020-01-01T00:00:00Z
01J2VWZ4ASWXBXRHF6QMYXZ6CY,baz,2020-01-02T00:00:00Z,bar,2020-01-01T00:00:00Z
01J2VWZ4ASWXBXRHF6QMYXZ6CY,baz,2020-01-02T00:00:00Z,bam,2020-01-03T00:00:00Z
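To make the "repeat everything" layout concrete, here is a rough Python sketch of how such rows could be derived from a change log. The `CHANGES` list and the `snapshot_rows` helper are hypothetical names for illustration, not part of Empirica, and the sketch assumes all timestamps are UTC strings in the same ISO 8601 format so they sort correctly as text.

```python
from itertools import groupby

# Hypothetical change log: (id, attribute name, value, changed-at timestamp).
CHANGES = [
    ("01J2VWZ4ASWXBXRHF6QMYXZ6CY", "attribute1", "foo", "2020-01-01T00:00:00Z"),
    ("01J2VWZ4ASWXBXRHF6QMYXZ6CY", "attribute2", "bar", "2020-01-01T00:00:00Z"),
    ("01J2VWZ4ASWXBXRHF6QMYXZ6CY", "attribute1", "baz", "2020-01-02T00:00:00Z"),
    ("01J2VWZ4ASWXBXRHF6QMYXZ6CY", "attribute2", "bam", "2020-01-03T00:00:00Z"),
]


def snapshot_rows(changes):
    """Emit one full snapshot row per (id, timestamp) at which anything changed."""
    names = sorted({name for _, name, _, _ in changes})
    header = ["id"] + [col for n in names for col in (n, n + "LastChangedAt")]
    rows = [header]
    # Assumption: same-format UTC timestamps, so string order is time order.
    ordered = sorted(changes, key=lambda c: (c[0], c[3]))
    state = {}  # id -> {attribute name: (value, changed_at)}
    for (scope_id, _), group in groupby(ordered, key=lambda c: (c[0], c[3])):
        current = state.setdefault(scope_id, {})
        for _, name, value, changed_at in group:
            current[name] = (value, changed_at)
        row = [scope_id]
        for n in names:
            value, changed_at = current.get(n, ("", ""))
            row += [value, changed_at]
        rows.append(row)
    return rows


for row in snapshot_rows(CHANGES):
    print(",".join(row))
```

Every row carries every attribute, which is why the output grows quickly when there are many attributes and frequent changes.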

Only repeat the attribute that changed.

id,attribute1,attribute1LastChangedAt,attribute2,attribute2LastChangedAt
01J2VWZ4ASWXBXRHF6QMYXZ6CY,foo,2020-01-01T00:00:00Z,bar,2020-01-01T00:00:00Z
01J2VWZ4ASWXBXRHF6QMYXZ6CY,baz,2020-01-02T00:00:00Z,,
01J2VWZ4ASWXBXRHF6QMYXZ6CY,,,bam,2020-01-03T00:00:00Z
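With the sparse layout, whoever analyses the file has to carry values forward to get the full state back. A minimal Python sketch of that step, using only the standard `csv` module; `forward_fill` is a hypothetical helper name, and the sketch assumes an empty cell always means "unchanged".

```python
import csv
import io

SPARSE_CSV = """\
id,attribute1,attribute1LastChangedAt,attribute2,attribute2LastChangedAt
01J2VWZ4ASWXBXRHF6QMYXZ6CY,foo,2020-01-01T00:00:00Z,bar,2020-01-01T00:00:00Z
01J2VWZ4ASWXBXRHF6QMYXZ6CY,baz,2020-01-02T00:00:00Z,,
01J2VWZ4ASWXBXRHF6QMYXZ6CY,,,bam,2020-01-03T00:00:00Z
"""


def forward_fill(csv_text):
    """Fill empty cells with the last known value for the same id."""
    reader = csv.DictReader(io.StringIO(csv_text))
    last_seen = {}  # id -> {column: last non-empty cell}
    filled = []
    for row in reader:
        known = last_seen.setdefault(row["id"], {})
        for column, cell in row.items():
            # Assumption: "" means "no change", never "set to empty string".
            if cell != "":
                known[column] = cell
        filled.append({column: known.get(column, "") for column in reader.fieldnames})
    return filled


for row in forward_fill(SPARSE_CSV):
    print(row)
```

One caveat with this layout: an attribute that was genuinely set to an empty string cannot be told apart from an attribute that did not change.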

Flatten the data. It's a little less noisy. Still easy to parse.

id,name,value,lastChangedAt
01J2VWZ4ASWXBXRHF6QMYXZ6CY,attribute1,foo,2020-01-01T00:00:00Z
01J2VWZ4ASWXBXRHF6QMYXZ6CY,attribute2,bar,2020-01-01T00:00:00Z
01J2VWZ4ASWXBXRHF6QMYXZ6CY,attribute1,baz,2020-01-02T00:00:00Z
01J2VWZ4ASWXBXRHF6QMYXZ6CY,attribute2,bam,2020-01-03T00:00:00Z

With the flatten method, we would keep exporting the current final-state files as they are, and add a new file for each type containing all the changes. I think this might be the best compromise.
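To illustrate why the flattened file is still easy to parse, here is a small Python sketch that rebuilds the state of one id at a given time t from it. `state_at` is a hypothetical helper, not an existing Empirica function, and it assumes all timestamps are UTC strings in the same ISO 8601 format so they can be compared as text.

```python
import csv
import io

FLAT_CSV = """\
id,name,value,lastChangedAt
01J2VWZ4ASWXBXRHF6QMYXZ6CY,attribute1,foo,2020-01-01T00:00:00Z
01J2VWZ4ASWXBXRHF6QMYXZ6CY,attribute2,bar,2020-01-01T00:00:00Z
01J2VWZ4ASWXBXRHF6QMYXZ6CY,attribute1,baz,2020-01-02T00:00:00Z
01J2VWZ4ASWXBXRHF6QMYXZ6CY,attribute2,bam,2020-01-03T00:00:00Z
"""


def state_at(csv_text, scope_id, timestamp):
    """Return {attribute name: value} for scope_id as of timestamp (inclusive)."""
    rows = sorted(
        csv.DictReader(io.StringIO(csv_text)),
        key=lambda r: r["lastChangedAt"],  # assumption: text order == time order
    )
    state = {}
    for row in rows:
        if row["id"] == scope_id and row["lastChangedAt"] <= timestamp:
            # Later changes overwrite earlier ones for the same attribute.
            state[row["name"]] = row["value"]
    return state


print(state_at(FLAT_CSV, "01J2VWZ4ASWXBXRHF6QMYXZ6CY", "2020-01-02T12:00:00Z"))
# → {'attribute1': 'baz', 'attribute2': 'bar'}
```

Because each row is a single change, the header stays fixed no matter how many attributes exist, and either of the wide layouts can be derived from this file when needed.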

I'm open to suggestions!