bigbio / proteomics-sample-metadata

The Proteomics sample metadata: Standard for experimental design annotation in proteomics datasets
GNU General Public License v2.0

Group fractions in PXD000561 #77

Closed (timosachsenberg closed 4 years ago)

timosachsenberg commented 4 years ago

https://github.com/bigbio/proteomics-metadata-standard/blob/master/annotated-projects/PXD000561/sdrf.tsv

@ypriverol Should the sample id be the same for all of a sample's fractions?

ypriverol commented 4 years ago

The sample id is an identifier the annotator uses to differentiate samples; it is not the same for all fractions.

timosachsenberg commented 4 years ago

so how do you find out which fractions belong together in this example?

timosachsenberg commented 4 years ago

and how should these be different samples if they are just different fractions? Shouldn't this be the same sample measured in several fractions?

ypriverol commented 4 years ago

> so how do you find out which fractions belong together in this example?

This is something we need to model with another variable. The sample id is a unique identifier for each ROW.

> and how should these be different samples if they are just different fractions? Shouldn't this be the same sample measured in several fractions?

Here, a "sample" is really the combination of fraction, technical replicate, and biological replicate.
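To make the point concrete, here is a minimal sketch of how fractions could be grouped once each row carries explicit replicate and fraction columns. The column names (`row_id`, `biological_replicate`, `technical_replicate`, `fraction`, `data_file`) and the file names are hypothetical, chosen only for illustration; they are not taken from the PXD000561 annotation.

```python
import csv
import io
from collections import defaultdict

# Hypothetical, simplified SDRF-like table. Every row has its own id;
# the grouping information lives in the replicate and fraction columns.
TSV = """row_id\tbiological_replicate\ttechnical_replicate\tfraction\tdata_file
sample 1\t1\t1\t1\trun_a_f1.raw
sample 2\t1\t1\t2\trun_a_f2.raw
sample 3\t1\t1\t3\trun_a_f3.raw
sample 4\t2\t1\t1\trun_b_f1.raw
sample 5\t2\t1\t2\trun_b_f2.raw
"""


def group_fractions(tsv_text):
    """Map (biological replicate, technical replicate) to its data files,
    ordered by fraction number."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        key = (row["biological_replicate"], row["technical_replicate"])
        groups[key].append((int(row["fraction"]), row["data_file"]))
    # Sort each group by fraction number and keep only the file names.
    return {key: [name for _, name in sorted(items)]
            for key, items in groups.items()}


groups = group_fractions(TSV)
for key, files in groups.items():
    print(key, files)
```

With this layout the row id can stay unique per row, and "which fractions belong together" is answered by the grouping key rather than by the id.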

timosachsenberg commented 4 years ago

ok, I see. Thanks for clarifying. Would it make sense to use a different name for the row identifier (instead of "sample X")? For example, one could call it "quant id" as it represents one quantitative value that gets measured (i.e. it works for label-free as well as for a channel in a multiplexed experiment).

ypriverol commented 4 years ago

We can do that, but it will be up to the user. I think we finally need to introduce a concept for how we organize the fractions that belong to the same biological sample.

timosachsenberg commented 4 years ago

As well as technical replicates and "mixtures", the concept in MSstatsTMT that lets you model whether samples are measured in different channels.
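A rough sketch of what such a layout might look like, with one row per quantitative value (the "quant id" idea above). The field names and values are assumptions for illustration, loosely inspired by the mixture, technical replicate, fraction, and channel notions mentioned in this thread; they are not the SDRF specification or the MSstatsTMT annotation format. It assumes a mixture is a set of samples labeled with different reagents and measured together.

```python
from collections import defaultdict

# Hypothetical field names; one row is one measured quantitative value.
FIELDS = ("quant_id", "mixture", "tech_rep", "fraction", "channel", "bio_sample")
RAW = [
    ("q1", "mix1", 1, 1, "126", "patient_A"),
    ("q2", "mix1", 1, 1, "127", "patient_B"),
    ("q3", "mix1", 1, 2, "126", "patient_A"),
    ("q4", "mix1", 1, 2, "127", "patient_B"),
    ("q5", "mix2", 1, 1, "126", "patient_C"),
]
rows = [dict(zip(FIELDS, values)) for values in RAW]


def channels_by_mixture(rows):
    """For each mixture, record which biological sample sits in which channel."""
    out = defaultdict(dict)
    for row in rows:
        out[row["mixture"]][row["channel"]] = row["bio_sample"]
    return {mixture: dict(channels) for mixture, channels in out.items()}


def quant_ids_for(rows, bio_sample):
    """All row identifiers (across fractions) that measure one biological sample."""
    return [row["quant_id"] for row in rows if row["bio_sample"] == bio_sample]


print(channels_by_mixture(rows))
print(quant_ids_for(rows, "patient_A"))
```

A label-free experiment would fit the same shape by using a single placeholder value in the channel column.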

ypriverol commented 4 years ago

This has been included in recent documentation.