Closed levitsky closed 4 years ago
Agreed, I think this is a legacy property from transcriptomics experiments. @anjaf, can you let us know why it is important?
Right now we have "organism part" or "cell type", which kind of alternate, which is indeed not ideal. So I see why "Material Type" would help, e.g. if it is "cell" then you can look for "cell type", but if it is "tissue" then you can look for "organism part", etc.
However, "organism part" and "cell type" are mandatory for all annotations anyway, so either one can be checked without any preconditions.
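To make the two options concrete, here is a minimal sketch in Python. It assumes each annotation row is available as a plain dict keyed by the short column names used in this thread; the function names, the mapping, and the sample row are made up for illustration and are not part of any existing tool.

```python
# Hypothetical mapping from a "material type" value to the column that
# would describe the sample; the pairs follow the examples given above.
LOOKUP_BY_MATERIAL = {
    "cell": "cell type",
    "tissue": "organism part",
}


def describe_with_material_type(row):
    """Pick the descriptive column based on the row's material type."""
    material = row.get("material type", "").strip().lower()
    column = LOOKUP_BY_MATERIAL.get(material)
    return row.get(column) if column else None


def describe_without_material_type(row):
    """Read both mandatory columns directly; no precondition is needed."""
    return {key: row.get(key) for key in ("organism part", "cell type")}


# Made-up sample row, for illustration only.
row = {
    "material type": "tissue",
    "organism part": "liver",
    "cell type": "not available",
}
```

The first function returns nothing useful when "material type" is absent, while the second works on any row that carries the two mandatory columns, which is the point made in the comment above.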
@ypriverol I expanded the issue to include "description" as well, another column used but not described.
I will take a look. @levitsky do you have more annotated projects coming soon? It would be nice to check this list: https://github.com/bigbio/proteomics-metadata-standard/issues/271
We're working through our list of annotated projects. What I can say is that we focused on live human samples, not cell lines, so probably no intersections with that list.
Yes, "Material Type" and "Description" come from the original MAGE-TAB specifications, and relate to the Source Name column. So Material Type basically means "source material type": what material was used as input at the start of the experiment? The controlled vocabulary is "whole organism", "organism part", "cell", "DNA", "RNA". DNA and RNA are obviously not applicable to proteomics experiments, so yes, that would leave you the other three. We also tend to avoid the "Description" field (as it is not very specific), in favour of the specific Characteristics fields. (We have a "description" field in Annotare for inexperienced submitters to use if they can't find any other option; curators then move the information from this field under the suitable Characteristics terms.)
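If "Material Type" were ever checked, the vocabulary listed above is small enough to express directly. The following is only a sketch: the five terms are the ones quoted in this comment, while the function name and the case-insensitive matching are assumptions, not behaviour of any existing validator.

```python
# Terms quoted above as the MAGE-TAB controlled vocabulary.
MATERIAL_TYPES = {"whole organism", "organism part", "cell", "DNA", "RNA"}

# DNA and RNA are described above as not applicable to proteomics.
PROTEOMICS_MATERIAL_TYPES = MATERIAL_TYPES - {"DNA", "RNA"}


def invalid_material_types(values):
    """Return the values that fall outside the proteomics subset.

    Matching ignores case and surrounding whitespace (an assumption).
    """
    allowed = {term.lower() for term in PROTEOMICS_MATERIAL_TYPES}
    return [value for value in values if value.strip().lower() not in allowed]
```

Note that "tissue", used as an example value earlier in the thread, is not one of these terms and would be reported by this check.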
@levitsky I suggest that in order to be compatible with transcriptomics we allow the following column names:
These would be additional properties: we don't validate anything in them, and it is up to the user to provide them.
I don't have the context knowledge to opine on the necessity of keeping compatibility.
Abstractly speaking, allowing "description" will probably tempt annotation authors to fill it with information that belongs in another column, or multiple columns, reducing the utility of the annotation. To avoid this, any use of "description" should be actively discouraged, if allowed.
Let's remove Description.
If it is decided, we can close this and merge https://github.com/bigbio/sdrf-pipelines/pull/36.
I see "Material Type" in many annotations but it is not specified anywhere in the standard description. Is it allowed, and if yes, why is it needed?
Material type sounds like a sample characteristic. If there is not a suitable term in EFO, perhaps one can be added? Alternatively, the use of "Material Type" needs to be described in the specification.
Upd: Another column that I see is "description". I think it is an attempt to describe the data set as a whole but it is repeated for every row. I don't think it belongs in SDRF.
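A column that carries the same value in every row can be spotted mechanically. Below is a rough sketch, assuming the rows have already been read into a list of dicts; the function name and the sample rows are hypothetical.

```python
def constant_columns(rows):
    """Return the column names whose value is identical in every row."""
    if len(rows) < 2:
        # With fewer than two rows, "repeated" has no meaning.
        return []
    return [
        name
        for name in rows[0]
        if len({row.get(name) for row in rows}) == 1
    ]


# Made-up rows, for illustration only.
rows = [
    {"source name": "sample 1", "description": "study of X"},
    {"source name": "sample 2", "description": "study of X"},
]
```

This is a hint for manual review rather than a rule: a legitimately per-sample column can also be constant when all samples share the same value.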