output tables from analysis_quickstart()

brookegenovese commented 2 months ago

Hi there, I'm looking for some clarification re: the output tables from running analysis_quickstart() that appear in the designated msdap_results timestamped folder. Apologies if this information is detailed somewhere in the vignettes and I've missed it!

Could you clarify how the six .tsv files differ with respect to 1) the entries (peptides or proteins) included and 2) the calculated values for each protein within each sample? I sometimes use the output from msdap without running any statistical contrasts and want to ensure I'm using the correct data table in my downstream bioinformatic analyses.

I'm assuming the as-is.tsv files contain the data as it is imported from whatever software you're using (in my case, Spectronaut) prior to using filter_dataset(), and as such would be prior to normalization/filtering...

Here are the .tsv's of interest, for reference. Thanks!

peptide_abundance__filter by group independently.tsv
peptide_abundance__global data filter.tsv
peptide_abundance__input data as-is.tsv

protein_abundance__filter by group independently.tsv
protein_abundance__global data filter.tsv
protein_abundance__input data as-is.tsv

ftwkoopmans commented 2 months ago

The protein-level output data are the same as their counterpart peptide-level tables (i.e. same filename), with the only difference that the MaxLFQ algorithm was applied to "rollup" from peptide- to protein-level.

There are 4 possible output table variants, with different filtering/normalization applied to them:

input data as-is = No filtering or normalization is applied to the input peptide-level dataset; this is the data from DIA-NN/Spectronaut/MaxQuant "as-is"
filter by group independently = The user-specified filtering rules (parameters for analysis_quickstart()) are applied to each group independently. So if you set filter_min_detect=3, then within each "sample group" (as defined in your sample metadata table, column "group") MS-DAP will find the set of peptides that were detected in at least 3 samples belonging to that sample group. For peptides that fail this criterion in a given sample group, the values within that sample group are removed. Finally, the normalization algorithm(s) that you selected are applied to the peptide*sample matrix.
global data filter = The user-specified filtering rules (parameters for analysis_quickstart()) are applied to all groups, i.e. only peptides that meet the filtering criteria in every "sample group" (as defined in your sample metadata table, column "group") are retained. Finally, the normalization algorithm(s) that you selected are applied to the peptide*sample matrix.
filter by contrast = These files are only available when filter_by_contrast=TRUE was set in analysis_quickstart() and at least one statistical contrast was defined (typically with setup_contrasts()). The filename will refer to 1 of these user-defined contrasts. Only peptides that meet the filtering criteria in both sets of samples (i.e. whatever samples are defined to be on each side of the contrast) are retained. Finally, the normalization algorithm(s) that you selected are applied to the peptide*sample matrix.
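
To make the difference between "filter by group independently" and "global data filter" concrete, here is a minimal sketch on a toy peptide*sample table. It is not MS-DAP code and does not call the MS-DAP API: the peptide names, sample names, values, and helper functions are all hypothetical. It is written in plain Python only to show the logic, treats "detected" simply as "has a value", uses min_detect in the role of filter_min_detect, and leaves out normalization entirely.

```python
# Toy peptide x sample table; None marks a peptide without a value in that sample.
# All names and numbers are made up for illustration, not MS-DAP output.
intensities = {
    "PEPTIDEA": {"wt1": 10.0, "wt2": 11.0, "wt3": 10.5, "ko1": 9.0, "ko2": 9.5, "ko3": 9.2},
    "PEPTIDEB": {"wt1": 8.0, "wt2": 8.2, "wt3": None, "ko1": 7.0, "ko2": None, "ko3": None},
    "PEPTIDEC": {"wt1": None, "wt2": None, "wt3": 6.0, "ko1": None, "ko2": None, "ko3": 5.0},
}
groups = {"wt": ["wt1", "wt2", "wt3"], "ko": ["ko1", "ko2", "ko3"]}
min_detect = 2  # stands in for filter_min_detect in this sketch


def passes(values, samples, min_detect):
    """True if the peptide has a value in at least min_detect of the given samples."""
    return sum(values[s] is not None for s in samples) >= min_detect


def filter_by_group_independently(intensities, groups, min_detect):
    """Per group, blank out the values of peptides that fail within that group."""
    result = {}
    for peptide, values in intensities.items():
        kept = dict(values)  # copy so the input table is left untouched
        for samples in groups.values():
            if not passes(values, samples, min_detect):
                for s in samples:
                    kept[s] = None
        result[peptide] = kept
    return result


def global_data_filter(intensities, groups, min_detect):
    """Keep only peptides that pass the criterion in every group."""
    return {
        peptide: dict(values)
        for peptide, values in intensities.items()
        if all(passes(values, samples, min_detect) for samples in groups.values())
    }


by_group = filter_by_group_independently(intensities, groups, min_detect)
global_kept = global_data_filter(intensities, groups, min_detect)

# PEPTIDEB passes in "wt" (2 values) but fails in "ko" (1 value):
# its wt values survive the per-group filter, its ko values are removed.
print(by_group["PEPTIDEB"])
# Only PEPTIDEA passes in both groups, so it is the only row left globally.
print(sorted(global_kept))  # ['PEPTIDEA']
```

In this sketch a peptide that fails in every group (PEPTIDEC) stays in the per-group result as an all-empty row; whether such rows appear in the actual output file is not something the sketch tries to reproduce. Under the same assumptions, the "filter by contrast" variant would correspond to calling global_data_filter with only the two sets of samples on either side of the contrast.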

brookegenovese commented 1 month ago

This is what I needed - thanks!

ftwkoopmans / msdap

output tables from analysis_quickstart() #40