Closed ypriverol closed 3 years ago
On one hand this sure sounds redundant, having the same string repeated for all rows and basically representing something that doesn't belong in any row, as this is metadata but it is mixed with data here (or should I say, meta-metadata mixed with metadata?).
On the other hand, I do not see any good alternatives for adding this information within the same file. Perhaps a clean solution would be to optionally provide a separate small file with metadata. We could make it any format (a simple `key: value` would probably suffice) and fix its name (something like `meta.txt` or whatever). For now it would only store the SDRF version, although we can discuss whether to use it for other things, too.
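To make the separate-file idea concrete, here is a minimal sketch of reading such a `key: value` file. The key name `sdrf_version` and the value `1.0` are assumptions for illustration only; nothing in this thread fixes them.

```python
def parse_meta(text):
    """Parse 'key: value' lines into a dict, skipping blank lines."""
    meta = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        meta[key.strip()] = value.strip()
    return meta


# Hypothetical content of a meta.txt file.
example = "sdrf_version: 1.0\n"
meta = parse_meta(example)
print(meta["sdrf_version"])
```

A format this simple can be written by hand and parsed without any dependency, which fits the goal of keeping things easy for users with no informatics background.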
My personal opinion is that if we want everybody to generate the SDRF themselves for submission (MS users with no informatics background included), we need to keep it as easy as possible for them: so a new column (even if this is redundant) sounds good to me.
The other format would be the IDF, which is part of the MAGETAB and contains general metadata (key:value) and the corresponding SDRF link and also additional information such as protocols, versions, authors etc. https://github.com/ebi-gene-expression-group/sc-metadata-fields/blob/master/IDF_template.txt
Personal opinion: if we only add version information, a new column is enough, even if it is redundant. But the `characteristic` and `comment` prefixes should be avoided, because this attribute belongs neither to the sample nor to the proteomics data information. Maybe `sdrf[version]`? If we want to add more information, a separate small metadata file is an option.
@anjaf do you know which prefix we can use for the SDRF version?
I like @ypriverol's idea of making it a comment in the IDF, e.g. `Comment[SDRF version]`. This is how we usually include custom study-level annotations.
@levitsky @mlocardpaulet @anjaf :
I was discussing this with @anjaf today, and we found out that SDRF accepts comments anywhere in the file using `#`:
Blank lines containing zero or more spaces or tabs are permitted in any of these files. Lines starting with the “#” symbol are interpreted as comments.
What do you think about adding a general header to each file containing the SDRF version and other future metadata?
@timosachsenberg @mvaudel ?
I think it's a practical solution. We will have to update the parser(s) so that it works on files with comments, but it's not that hard.
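As a rough idea of the parser change, here is a minimal sketch that skips blank lines and `#` lines before reading the table, and collects any `# key: value` header lines it finds. The header syntax and the `sdrf_version` key are assumptions for illustration; the function name `read_sdrf` is hypothetical and not part of any existing parser.

```python
import csv
import io


def read_sdrf(text):
    """Return (header_meta, rows) from SDRF text that may contain '#' comments."""
    header_meta = {}
    data_lines = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines are permitted
        if line.startswith("#"):
            # Assumed header syntax: "# key: value". Other comments are ignored.
            key, sep, value = line[1:].partition(":")
            if sep:
                header_meta[key.strip()] = value.strip()
            continue
        data_lines.append(line)
    reader = csv.DictReader(io.StringIO("\n".join(data_lines)), delimiter="\t")
    return header_meta, list(reader)


text = "# sdrf_version: 1.0\nsource name\tassay name\n\nsample 1\trun 1\n"
meta, rows = read_sdrf(text)
print(meta)
print(rows)
```

Filtering the lines before handing them to the tabular reader keeps the change small: the rest of the parsing code sees the same table it saw before.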
We will move this information to the IDF, please read PR #505
We need to control the version of the file format in each SDRF. If the standard develops over time (as expected), it would be great to know which version of the format a specific file follows.
My proposal is to have a column comment[sdrf-p version] that controls the version of the file format.
Comments @levitsky @mvaudel @mlocardpaulet @all
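For reference, a minimal sketch of how a validator could read the proposed column, assuming the SDRF is tab-delimited and that every row must carry the same version string. The function name `sdrf_version` and the example value `1.0` are hypothetical.

```python
import csv
import io

VERSION_COLUMN = "comment[sdrf-p version]"


def sdrf_version(text):
    """Return the single version declared in the version column, or raise."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    if VERSION_COLUMN not in (reader.fieldnames or []):
        raise ValueError(f"missing column: {VERSION_COLUMN}")
    versions = {row[VERSION_COLUMN] for row in reader}
    if len(versions) != 1:
        # Either no data rows, or rows that disagree with each other.
        raise ValueError(f"expected exactly one version, found: {sorted(versions)}")
    return versions.pop()


text = "source name\tcomment[sdrf-p version]\ns1\t1.0\ns2\t1.0\n"
print(sdrf_version(text))
```

The consistency check is the price of repeating the value on every row: the file is easy to write, but a reader has to confirm that all rows agree.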