LucjanJanowski / translator-to-suJSON

Read subjective experiment data to suJSON format.
MIT License

Define rules that spreadsheets must follow #5

Open Qub3k opened 5 years ago

Qub3k commented 5 years ago

We will never be able to parse every possible spreadsheet. Thus, we would like to define rules stemming from existing good practices (see Tidy Data by H. Wickham). Following those rules should guarantee that our tools are able to parse your spreadsheet.

As a starting point, @slhck has provided an example spreadsheet. It shows which guidelines the ITU P.1203 team is using.

There is also a working document which is going to be the main deliverable of this issue. If you have any comments, please feel free to put them directly in the document or leave them here.

slhck commented 5 years ago

I should add that this example sheet is also not really tidy, since it uses one column per subject. But it's a start! (I would imagine that any CSV output from a suJSON → CSV conversion would at least be tidy. But that's another issue.)

Note that I have already provided a parser for this kind of data format for SUREAL: https://github.com/Netflix/sureal/blob/master/resource/util/mos_to_sureal.py

You may copy some code from there.
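To illustrate the tidy-data point above, here is a minimal sketch of reshaping a one-column-per-subject sheet into one row per observation with pandas. The column names (`stimulus`, `subject_1`, `subject_2`), the score values, and the helper `to_tidy` are made up for illustration; they are not taken from the example spreadsheet or from the SUREAL parser.

```python
import pandas as pd

# Hypothetical wide-format scores: one row per stimulus, one column per subject.
wide = pd.DataFrame(
    {
        "stimulus": ["video_a.mp4", "video_b.mp4"],
        "subject_1": [4, 2],
        "subject_2": [5, 3],
    }
)


def to_tidy(df, id_col="stimulus"):
    """Reshape one-column-per-subject scores into one row per (stimulus, subject)."""
    # Every column other than `id_col` is treated as a subject column.
    return df.melt(id_vars=[id_col], var_name="subject", value_name="score")


tidy = to_tidy(wide)
print(tidy)
```

In the tidy layout each row is a single score, so adding a subject adds rows rather than columns.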

Qub3k commented 5 years ago

I added a link to the working document, which is going to be the main deliverable of this issue. Please feel free to add your comments.

slhck commented 5 years ago

I added some suggestions to the working document. In principle, there is no difference between importing XLS and CSV other than the import function used with pandas, and that for XLS you may have additional options to set, such as worksheet names.
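As a rough sketch of what that could look like, the loader below picks the pandas reader from the file extension. The function name `load_scores` and its `sheet_name` parameter are hypothetical and not part of the translator; `pd.read_csv` and `pd.read_excel` are the actual pandas functions, and reading Excel files additionally requires an engine such as openpyxl to be installed.

```python
from pathlib import Path

import pandas as pd


def load_scores(path, sheet_name=0):
    """Load a spreadsheet into a DataFrame, choosing the reader by extension.

    `load_scores` is a hypothetical helper for this sketch.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in {".xls", ".xlsx"}:
        # A workbook can hold several worksheets, so the caller selects one
        # by name or by zero-based index (the default 0 is the first sheet).
        return pd.read_excel(path, sheet_name=sheet_name)
    raise ValueError(f"Unsupported spreadsheet extension: {suffix!r}")
```

Either branch returns a DataFrame, so any rule checking on the spreadsheet would not need to know which file type it came from.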