Generalize dropdown choice template view

michael-franke commented 5 years ago

As far as I can see, babeViews.dropdownChoice only allows two alternatives, specified like this:

{
    question: "What's the weather like?",
    option1: 'shiny',
    option2: 'rainbow'
}

This should ideally be generalized to allow for arbitrary options, e.g., to allow input like this:

{
    question: "What's the weather like?",
    options: ['shiny', 'rainbow', 'cloudy', 'typically Amsterdam']
}

We could then add an optional field shuffle_options, which defaults to false and which, if true, randomly shuffles the choice options for display in the dropdown menu.
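
A minimal sketch of how the generalized input could be normalized. The helper names `shuffled` and `normalizeDropdownConfig` are hypothetical and not part of the existing babeViews API; the sketch only assumes the `options` and `shuffle_options` fields proposed above and keeps accepting the old `option1`/`option2` format.

```javascript
// Hypothetical helpers, not part of the existing babeViews API.

// Fisher-Yates shuffle on a copy, so the caller's array is left untouched.
function shuffled(items, random = Math.random) {
    const copy = items.slice();
    for (let i = copy.length - 1; i > 0; i--) {
        const j = Math.floor(random() * (i + 1));
        [copy[i], copy[j]] = [copy[j], copy[i]];
    }
    return copy;
}

// Accepts either the old two-option format or the proposed `options` array.
function normalizeDropdownConfig(config) {
    const options = Array.isArray(config.options)
        ? config.options
        : [config.option1, config.option2].filter((o) => o !== undefined);
    return {
        question: config.question,
        // shuffle_options is treated as false unless it is explicitly true
        options: config.shuffle_options === true ? shuffled(options) : options.slice()
    };
}
```

With `shuffle_options` omitted, the options are displayed in the order given.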

JannisBush commented 5 years ago

This would be possible, but do we want to save the options as option1: 'shiny', option2: 'rainbow', optionN: 'whatever', or do we want to save the options as an array?

The second one could be problematic, and we currently do not "allow" such submissions at the back end (see here).

The first one could be implemented like this

x-ji commented 5 years ago

The reason for not allowing arrays in submissions is CSV formatting, i.e. how to format an array in the CSV output so that it is easy to comprehend and also easy to analyze with (R) scripts.

One possible way is to output it like entry1||entry2||entry 3, i.e. use || as the separator, and then the script can segment the array by looking for ||. However, if we do this, we'll have to make sure that || doesn't occur in any of the array entries, otherwise errors would ensue.
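
A minimal sketch of the `||` approach, assuming a hypothetical helper `joinForCsv` (not an existing backend function). Instead of only searching each entry for `||`, it checks that splitting the joined cell gives back the original entries, which also catches an entry that merely ends or starts with `|`.

```javascript
// Hypothetical helper: joins array entries into one CSV cell and refuses
// input that could not be recovered by splitting on the separator again.
function joinForCsv(entries, separator = "||") {
    const strings = entries.map(String);
    const joined = strings.join(separator);
    const recovered = joined.split(separator);
    const roundTrips =
        recovered.length === strings.length &&
        recovered.every((s, i) => s === strings[i]);
    if (!roundTrips) {
        // Note: an empty array is rejected too, because "" splits back into [""].
        throw new Error(`Cannot join entries unambiguously with "${separator}"`);
    }
    return joined;
}
```

For example, `joinForCsv(["entry1", "entry2", "entry 3"])` produces the cell shown above, while an entry such as `"a||b"` makes the call throw.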

Another way is to just always require the frontend to not submit arrays and instead submit each option separately, as Jannis mentioned above.

JannisBush commented 5 years ago

I just tested what actually happens when you use an array at the moment.

For this, I inserted test_array : [1,2,3,"a","b"] in the trial data of an experiment.

Result downloaded from the last page of the debug mode:

"trial_type","trial_number","key_pressed","correctness","pause","RT","key1","key2","q","p","target_object","target_position","condition","focalColor","focalShape","focalNumber","elemSize","total","start_with","otherShape","otherColor","sort","test_array","startDate","startTime","age","gender","education","languages","comments","endTime","timeSpent","experiment_id",
"main","1","q","correct","2185","330","q","p","circle","square","circle","right","incongruent","blue","circle","1","100","2","other","square","white","split_grid","1,2,3,a,b","Sun Apr 14 2019 09:59:33 GMT+0200 (Central European Summer Time)","1555228773704","","","","","","1555228786455","0.21251666666666666","INSERT_A_NUMBER",

Important part: "test_array" = "1,2,3,a,b"

Result downloaded from the server app:

submission_id,otherShape,startDate,target_object,correctness,total,experiment_id,trial_type,q,focalShape,key2,target_position,p,test_array,pause,key_pressed,condition,endTime,otherColor,age,comments,sort,key1,education,languages,start_with,RT,gender,trial_number,startTime,elemSize,focalColor,timeSpent,focalNumber
321,circle,Sun Apr 14 2019 10:01:35 GMT+0200 (Central European Summer Time),square,correct,2,63,practice,square,square,p,left,circle,1|2|3|a|b,1535,q,congruent,1555228907222,white,,,split_grid,q,,,focal,330,,1,1555228895329,100,blue,0.19821666666666668,1

Important part: test_array = 1|2|3|a|b

R: Reading in the debug result with readr::read_csv generates test_array : chr "1,2,3,a,b" and transforming it with strsplit(dat, split=",") gives : chr [1:5] "1" "2" "3" "a" "b".

Reading in the server result with readr::read_csv generates test_array : chr "1|2|3|a|b" and transforming it with strsplit(dat, split="[|]") gives : chr [1:5] "1" "2" "3" "a" "b"

Now, I changed the array to test_array : [1,2,3,"a","b","c|ha,c"].

Debug: "1,2,3,a,b,c|ha,c"

Server: 1|2|3|a|b|c|ha,c

Both will fail to produce the original array.

I guess we could allow arrays and write a warning somewhere not to use , or | in array entries?

michael-franke commented 5 years ago

Thanks for the info and the testing! We had and probably still have array output in SPR-task reaction times. I like that the backend /can/ handle arrays. We should keep this as it is.

We could use “|” as the default separator. If we want to be fancy, we could implement a system that should in principle always work: the array separator is n+1 times “|”, where n is the maximal number of uninterrupted occurrences of “|” in the array-as-string representations of all arrays in the column (e.g., for “a big|||brown fox, ...” it would be n=3, so the separator would be “||||”). It will take a bit of fiddling to get the right n, but this should be possible (even if potentially ugly).
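
A minimal sketch of the n+1 idea, assuming hypothetical helpers `longestPipeRun` and `serializeColumn` (neither exists in the backend). One caveat built into the sketch: an entry that starts or ends with “|” stays ambiguous next to a separator made only of “|”, so the code verifies the round trip and throws instead of writing a cell that cannot be split back.

```javascript
// Hypothetical helpers illustrating the "n+1 pipes" separator.

// n: the longest uninterrupted run of "|" in any entry of any array in the column.
function longestPipeRun(arrays) {
    let longest = 0;
    for (const entries of arrays) {
        for (const entry of entries) {
            const runs = String(entry).match(/\|+/g) || [];
            for (const run of runs) {
                longest = Math.max(longest, run.length);
            }
        }
    }
    return longest;
}

// Serializes every array of one column with a shared separator of n+1 pipes.
function serializeColumn(arrays) {
    const separator = "|".repeat(longestPipeRun(arrays) + 1);
    const cells = arrays.map((entries) => {
        const strings = entries.map(String);
        const cell = strings.join(separator);
        const recovered = cell.split(separator);
        const roundTrips =
            recovered.length === strings.length &&
            recovered.every((s, i) => s === strings[i]);
        if (!roundTrips) {
            // Happens for entries with leading/trailing "|" and for empty arrays.
            throw new Error("Array cannot be serialized unambiguously");
        }
        return cell;
    });
    return { separator, cells };
}
```

The separator has to be stored or inferred together with the column, since an analysis script needs it to split the cells again.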

It seems that the behavior of the CSV download from debug mode is not quite right, however. The only way to get that right would be to do the massaging in the front end. Ideally, the CSV download from debug mode would always look exactly like the CSV download from the backend. (But this is not very, very important.)

Still, when it comes to the question of how to represent choice options in generalized versions of forced-choice tasks etc., I think that it is much more user-friendly and time-efficient to massage the data into the “wide format” within _babe. Otherwise, the user (and we) will be doing a lot of the same data wrangling steps during early data analysis. Seen in this way, since there is hardly ever likely to be a situation where the user prefers a representation like “options: 'shiny|rainbow|whatever'”, it would be prudent to do the massaging within _babe.
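
A minimal sketch of such a wide-format step, assuming a hypothetical helper `spreadArrayField` that is not part of _babe. The field name and the column stem are passed in explicitly, so nothing is guessed from the key.

```javascript
// Hypothetical helper: replaces one array-valued field of a trial record
// with numbered columns (stem1, stem2, ...), leaving all other fields as they are.
function spreadArrayField(trial, field, stem) {
    const { [field]: values = [], ...rest } = trial;
    values.forEach((value, i) => {
        rest[`${stem}${i + 1}`] = value;
    });
    return rest;
}

// Example: options -> option1, option2, option3, option4
const wideTrial = spreadArrayField(
    {
        question: "What's the weather like?",
        options: ["shiny", "rainbow", "cloudy", "typically Amsterdam"]
    },
    "options",
    "option"
);
```

Trials with different numbers of options would produce different sets of columns, so the CSV export would have to leave the missing cells empty.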

x-ji commented 5 years ago

Yeah, I think the problem with having a warning not to use , or | is that some of the outputs are generated not by the experiment designer but by the participant. What if they write several sentences containing ",", for example? The system would then not be foolproof.

The idea of using n + 1 times separator is interesting. I'll try to implement that on the backend.

JannisBush commented 5 years ago

The places where the participant can input sentences should be handled as Strings and not as Arrays anyway. Therefore, there shouldn't be a problem, because we only touch Arrays, or am I mistaken?

x-ji commented 5 years ago

There were cases in previous experiments where there are e.g. three sentences to be filled in on the same page, resulting in the response being recorded in three parts. Of course if there's only one sentence there will only be one string. Not sure if such cases are still possible in the new frontend templates.

x-ji commented 5 years ago

I think currently what wouldn't work is when an entry is a JS object, i.e. k-v pairs. This is what happened with the error that Jannis encountered a while ago. If an entry is just a plain array, the contents of the array would be output in the CSV file, separated with |. I have clarified it in the README.

michael-franke commented 5 years ago

We do have the case where ‘canvas’ is part of the data passed to a view, and that is flattened out and fully reproduced in the output data (I believe). Given that we use canvas info in this way, it may be that users want to pass objects to their custom views. To prepare for that case, we could flatten and output also objects as values of data to be stored in the backend. The basics for doing this should already be there.
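
A minimal sketch of such a flattening step, assuming a hypothetical helper `flattenObject`; it is not the existing canvas-flattening code. The `parent_child` naming of the resulting keys is also an assumption. Arrays are left untouched so that they can still be written out with the separator.

```javascript
// Hypothetical helper: flattens nested plain objects into a single level.
// Nested keys are joined with "_" (an assumed naming scheme); arrays and
// primitive values are copied over unchanged.
function flattenObject(value, prefix = "", out = {}) {
    for (const [key, inner] of Object.entries(value)) {
        const path = prefix ? `${prefix}_${key}` : key;
        const isPlainObject =
            inner !== null && typeof inner === "object" && !Array.isArray(inner);
        if (isPlainObject) {
            flattenObject(inner, path, out);
        } else {
            out[path] = inner;
        }
    }
    return out;
}
```

Running this as a check before submission would keep k-v pairs out of the data that reaches the backend.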

x-ji commented 5 years ago

Indeed the submission of the canvas data as an object was what triggered Jannis to encounter the bug in the first place. We discussed the issue here: https://github.com/babe-project/BABE/issues/72#issuecomment-467968099 and Jannis flattened the object in the frontend.

Should we allow the user to include objects in their submission but flatten the object? Then should we do it in the frontend, as a check before the submission, or should the backend try to do it?

michael-franke commented 5 years ago

Ah, I see. I have no opinion on where to do this, front or back. If we already flatten the canvas-info in the front-end, it might be handy to reuse this and do it all in the front end.

magpie-ea / magpie-modules

Generalize dropdown choice template view #54