Open veitveit opened 7 months ago
To get started with the converters, here is my suggestion for the standardized output format, which is meant as input for the statistical tests.
Experimental design
The experimental design file will already exist and has the format given in the README of https://github.com/wombat-p/WOMBAT-Pipelines
The column "exp_condition" in this file is crucial because it defines the column names in the standardized format.
Sample nomenclature:
For each of the files, fractions will be summarized into "samples". Each sample is then named by the value in "exp_condition" from the experimental design file plus "_" and the number of the biological/technical replicate, giving column names of the form "INFOTYPE_EXPCOND_BIOREP". For example: "number_of_peptides_100.amol_3"
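The naming scheme above can be sketched as a small helper. This is only an illustration: the function name `sample_column_name` and its parameters are hypothetical and not part of the WOMBAT-P pipelines; only the "INFOTYPE_EXPCOND_BIOREP" pattern and the example value come from this issue.

```python
def sample_column_name(info_type, exp_condition, bio_rep):
    """Build a standardized column name of the form INFOTYPE_EXPCOND_BIOREP.

    Hypothetical helper: the name and signature are assumptions made for
    illustration, not an existing function of the pipelines.
    """
    # exp_condition is taken verbatim from the experimental design file,
    # so it may itself contain dots or underscores (e.g. "100.amol").
    return f"{info_type}_{exp_condition}_{bio_rep}"


# Reproduces the example given above.
print(sample_column_name("number_of_peptides", "100.amol", 3))
```

Keeping the name construction in one place would make it easier to change the pattern later without touching each converter.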
Protein level file stand_prot_quant.csv
The file should contain the following columns:
Peptide level file stand_pep_quant.csv
The file should contain the following columns:
Ion level file stand_ion_quant.csv (optional, mainly for being able to send the output to ProteoBench):
Same as peptide level file, but with charge states separated to represent the peaks in the chromatogram
examples_PXD011153.zip
Here are the example files for FlashLFQ and the TPP output, generated with my own scripts. The TPP output seems to be mostly complete, although it still lacks a good way to deal with the modifications.
@wraff Sorry, I think we need a small correction to the column names of e.g. "abundance_", so that they include the technical replicates: "INFOTYPE_EXPCOND_BIOREP_TECHREP".
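For the statistical tests, the column names have to be split back into their parts. Here is a minimal sketch of how that could work for the corrected "INFOTYPE_EXPCOND_BIOREP_TECHREP" pattern. The helper `split_column_name` is hypothetical, and it assumes that the set of info types is known in advance and that both replicate numbers are integers. Because the info type and the condition can contain underscores themselves, the prefix is matched first and the two replicate numbers are split off from the right.

```python
def split_column_name(name, info_types=("abundance", "number_of_peptides")):
    """Split INFOTYPE_EXPCOND_BIOREP_TECHREP into its four parts.

    Hypothetical helper; the default info types are the two mentioned in
    this issue, and the real list may be longer.
    """
    # Try longer info types first so that a shorter one cannot shadow them.
    for info_type in sorted(info_types, key=len, reverse=True):
        prefix = info_type + "_"
        if name.startswith(prefix):
            rest = name[len(prefix):]
            # The last two underscore-separated fields are the replicates;
            # everything before them is the experimental condition.
            exp_condition, bio_rep, tech_rep = rest.rsplit("_", 2)
            return info_type, exp_condition, int(bio_rep), int(tech_rep)
    raise ValueError(f"Unknown info type in column name: {name!r}")


print(split_column_name("abundance_100.amol_3_1"))
```

Columns that share the same condition and biological replicate can then be grouped as technical replicates of one sample.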
Description of feature
Instead of attaching one specific statistical test to a workflow, let the tests run on any of the generalized stand_pep/stand_prot files, then provide an updated version of these files containing the p-values and FDRs, as well as the log-ratios.
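As a rough sketch of the values such a step would add to the files, the snippet below computes a log2-ratio between two conditions and converts a list of p-values into FDRs with the Benjamini-Hochberg procedure. This is not the implementation in the pipelines. The function names are hypothetical, abundances are assumed to be on a linear (not log) scale with missing values given as None, and the p-values are assumed to come from whichever statistical test is plugged in.

```python
import math


def log2_ratio(cond_a, cond_b):
    """log2(mean(cond_b) / mean(cond_a)), ignoring missing (None) values."""
    a = [v for v in cond_a if v is not None]
    b = [v for v in cond_b if v is not None]
    if not a or not b:
        return None  # no ratio without values in both conditions
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    if mean_a <= 0 or mean_b <= 0:
        return None  # log-ratio undefined for non-positive means
    return math.log2(mean_b / mean_a)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(log2_ratio([1.0, 1.0, None], [4.0, 4.0, 4.0]))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Because both functions only need lists of numbers, they would work on the peptide-level and the protein-level file alike, which is the point of decoupling the tests from the individual workflows.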
Questions:
Current state:
https://github.com/wombat-p/WOMBAT-Pipelines/tree/mutli_stat_tests
Rough working plan
@veitveit @wraff