Open Qub3k opened 5 years ago
I should add that this example sheet is not really tidy either, since it uses one column per subject. But it's a start! (I would expect any CSV produced by a suJSON → CSV conversion to at least be tidy. But that's another issue.)
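To illustrate the difference between a one-column-per-subject layout and a tidy one, here is a minimal sketch using pandas. The column names (`stimulus_id`, `subject_1`, `subject_2`, `subject_id`, `score`) and the scores are made up for illustration; they are not taken from the example sheet.

```python
import pandas as pd

# Hypothetical "wide" sheet: one row per stimulus, one column per subject.
wide = pd.DataFrame({
    "stimulus_id": ["s1", "s2"],
    "subject_1": [4, 2],
    "subject_2": [5, 3],
})

# Tidy layout: one row per (stimulus, subject) observation.
tidy = wide.melt(
    id_vars="stimulus_id",   # columns that identify the stimulus
    var_name="subject_id",   # former column headers become values here
    value_name="score",      # the cell contents
)

print(tidy)
```

In the tidy form, every row is a single rating, so adding a subject adds rows rather than columns.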
Note that I have already provided a parser for this kind of data format for SUREAL: https://github.com/Netflix/sureal/blob/master/resource/util/mos_to_sureal.py
You may copy some code from there.
I added a link to the working document, which is going to be the main deliverable of this issue. Please feel free to add your comments.
I added some suggestions to the working document. In principle, there is no difference between importing XLS and CSV other than the pandas import function used, and that for XLS you may have additional options to set, such as worksheet names.
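As a rough sketch of that point, the loader below dispatches on the file extension and only differs in which pandas function it calls. `load_sheet` is a hypothetical helper name, not part of any existing tool here; `pd.read_csv` and `pd.read_excel` (with its `sheet_name` parameter) are the actual pandas functions. Reading Excel files additionally requires an engine such as openpyxl to be installed.

```python
from pathlib import Path

import pandas as pd


def load_sheet(path, sheet_name=0):
    """Hypothetical helper: load a CSV or Excel file into a DataFrame."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in {".xls", ".xlsx"}:
        # sheet_name selects the worksheet by index or by name.
        return pd.read_excel(path, sheet_name=sheet_name)
    raise ValueError(f"Unsupported file type: {suffix}")
```

Usage would look like `load_sheet("scores.csv")` or `load_sheet("scores.xlsx", sheet_name="Session 1")`, where both file names and the worksheet name are placeholders.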
We will never be able to parse every possible spreadsheet. Instead, we would like to define rules derived from existing good practices (see Tidy Data by H. Wickham). Following those rules should guarantee that our tools are able to parse your spreadsheet.
As a starting point, @slhck has provided an example spreadsheet. It shows the guidelines the ITU P.1203 team is using.
There is also a working document which is going to be the main deliverable of this issue. If you have any comments, please feel free to put them directly in the document or leave them here.