Closed jstone-dev closed 4 months ago
If we want to continue to allow comments in imported CSV files, we should clearly document the file format. This would be helpful anyway, but it's really necessary since comments aren't part of any CSV standard, though some parsers support them.
Seems to me like we don't need comment support in scores and counts files, but maybe @afrubin has a better sense of how useful it is. Removing the # reserved character from the CSV parser seems like the simplest solution here to me, unless there is some utility I am missing.
We used to use the comments in the exported files as a header to include the license information, access date, and other details about the score set record. I think this was useful and was much more important back when we had a diverse mix of data licenses rather than most things being CC0. I'm fine with removing it. We can always re-add it later if compelling use cases emerge.
The CSV formats for exporting and importing are inconsistent. In exported files, the accession column contains unquoted # characters. During import, # starts a comment, so everything from the # in each accession to the end of the line is discarded, and each row appears to have no values in any other column. Validation then fails at the step that checks for the presence of at least one HGVS column.
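A minimal sketch of the failure described above, using only the Python standard library. The accession format, the column names (`hgvs_nt`, `hgvs_pro`, `score`), and the helpers `parse` and `has_hgvs_values` are hypothetical stand-ins for illustration, not the project's actual parser or validation code. It only shows how treating # as a comment character truncates rows of an exported-style file, and how the rows survive once # is no longer reserved.

```python
import csv

# Hypothetical exported-style file: accessions contain unquoted '#' characters.
EXPORTED = (
    "accession,hgvs_nt,hgvs_pro,score\n"
    "urn:mavedb:00000001-a-1#1,c.1A>G,p.Met1Val,0.5\n"
    "urn:mavedb:00000001-a-1#2,c.2T>C,p.Met1Thr,1.2\n"
)


def parse(text, comment=None):
    """Parse CSV text into a list of dicts.

    If `comment` is set, everything from that character to the end of each
    line is dropped first, mimicking a parser with comment support.
    """
    lines = text.splitlines()
    if comment is not None:
        lines = [line.split(comment, 1)[0] for line in lines]
    # DictReader fills missing trailing fields with None.
    return list(csv.DictReader(lines))


def has_hgvs_values(rows, hgvs_columns=("hgvs_nt", "hgvs_pro")):
    """Assumed stand-in for the check that at least one HGVS column has data."""
    return any(row.get(col) for row in rows for col in hgvs_columns)


with_comments = parse(EXPORTED, comment="#")   # '#' reserved: rows are truncated
without_comments = parse(EXPORTED)             # '#' is an ordinary character

print(with_comments[0])
print(has_hgvs_values(with_comments), has_hgvs_values(without_comments))
```

With `comment="#"`, each row keeps only the part of the accession before the #, so every HGVS field is empty and the check fails. Without comment handling, the same file round-trips with accessions and HGVS values intact, which is the behavior that removing the reserved character would give.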