Open esurface opened 11 months ago
@esurface - are you certain this is in fact the current behavior? I don't think my experience has reflected this. (The same varname showing up in multiple columns.)
Also wanted to confirm - the illustrative JSON for the second block still says "formVersion": "v1", while the illustrative CSV output says formVersion is v2. Is that a typo, or are you suggesting that there be some manner of auto-incrementing happening? (I'm assuming the former - I don't think an 'automagical' auto-increment would be the ideal way to go.)
In re: "breaking the order of variables into CSVs" - this is already somewhat broken, in that late-added variables get appended to the end of the CSV column list rather than actually being inserted alongside their neighbors in the instrument proper.
For instance, if I've generated data for an instrument having SectionA.item1-SectionA.item10, SectionB.item1-SectionB.item10, and SectionC.item1-SectionC.item10 in that order, and then I add the variable SectionA.item11, that new variable will wind up as the 31st item in the column list rather than the 11th. (Ignoring all the metadata columns for the purposes of this example.)
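To make the append behavior concrete, here is a minimal Python sketch. It assumes the column list is built by recording each variable name the first time it is seen across submissions; that is my guess at the mechanism, not Tangerine's actual code, and build_headers is a hypothetical helper.

```python
def build_headers(submissions):
    """Collect column names in first-seen order across all submissions.

    Assumption: this mimics a generator that appends unseen variables
    to the end of the column list. It is not Tangerine's real code.
    """
    headers = []
    seen = set()
    for submission in submissions:
        for name in submission:
            if name not in seen:
                seen.add(name)
                headers.append(name)
    return headers


sections = ["SectionA", "SectionB", "SectionC"]

# Data generated before the change: 3 sections x 10 items, in form order.
original = {f"{s}.item{i}": "" for s in sections for i in range(1, 11)}

# Data generated after SectionA.item11 was added right after SectionA.item10.
later = {}
for s in sections:
    for i in range(1, 11):
        later[f"{s}.item{i}"] = ""
    if s == "SectionA":
        later["SectionA.item11"] = ""

headers = build_headers([original, later])
position = headers.index("SectionA.item11") + 1  # 1-based column position
print(position)
```

Under that assumption the new variable lands in position 31, after every SectionC item, even though it sits 11th in the form itself.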
If you want to fix that, that would be cool. But the current reality doesn't seem to match what you're describing under bullet 2.
Issue
If a change is made to a form that moves a variable to a different section, then the CSV output will show that variable in two columns. This is not necessarily a bug, but Tangerine should handle this scenario.
CSV outputs are designed to show variables by section so they follow the data dictionary. Changes to the structure of the sections and variables in different versions of the form will change the order of the headers in the CSV outputs.
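As a rough illustration of how one variable can end up in two columns, here is a Python sketch. It assumes headers are formed as section.variable and are collected across every form version that has data; the form layouts, the names (Intro, Household, age) and the section_headers helper are all hypothetical, and this is not how Tangerine is actually implemented.

```python
def section_headers(form_versions):
    """Build CSV headers section by section across all form versions.

    Assumption: each header is "<section>.<variable>", so the same
    variable under a different section yields a distinct column.
    """
    headers = []
    for form in form_versions:
        for section, variables in form.items():
            for variable in variables:
                column = f"{section}.{variable}"
                if column not in headers:
                    headers.append(column)
    return headers


# Hypothetical form: "age" moves from Intro (v1) to Household (v2).
form_v1 = {"Intro": ["consent", "age"], "Household": ["size"]}
form_v2 = {"Intro": ["consent"], "Household": ["size", "age"]}

headers = section_headers([form_v1, form_v2])
age_columns = [h for h in headers if h.endswith(".age")]
print(headers)
print(age_columns)
```

In this sketch age shows up as both Intro.age and Household.age, and each row would fill only the column matching the form version it was collected with.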
Example
Form version one has the variable held in the section Sought
Form version two has the variable held in the section Crime
The CSV output for this form will be:
Considerations
Solutions to the issue will need to consider how to implicitly infer a form version from the csv-reporting metadata
Solutions will also need to consider the impact on the ordering of sections and variables in the outputs
MySQL outputs do not have this issue since duplicate variables are not allowed
Possible Solutions