bigbio / sdrf-pipelines

A repository to convert SDRF proteomics files into pipelines config files
Apache License 2.0

[ENH] Need for more comprehensive integration testing #112

Open fabianegli opened 2 years ago

fabianegli commented 2 years ago

@ypriverol Do you have a set of SDRF files for experiments of all different kinds of expected flavors?

If not, we should systematically generate SDRF example files covering the different labelling techniques, fractionations, biological and technical replicates, and file names (spaces and unusual characters, if allowed), so that we have test cases for at least the most common experimental setups. Based on those, we can then generate SDRF files that do not comply with the standard but should be readable nonetheless, and some that are wrong in the ways we expect users to get them wrong when writing SDRF files with common tools or by hand.
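To make the idea concrete, here is a minimal sketch of how such a matrix of example files could be generated. Everything in it is an assumption for illustration: the axis values, the helper names (`build_rows`, `to_tsv`, `generate_cases`) and the reduced column set are hypothetical, and the resulting files are deliberately incomplete, so they would need more columns before they pass real validation.

```python
import csv
import io
import itertools

# Hypothetical test axes; the values are examples, not a complete list.
LABELS = {
    "label_free": ["label free sample"],
    "tmt2": ["TMT126", "TMT127"],
    "silac2": ["SILAC light", "SILAC heavy"],
}
FRACTIONS = [1, 3]          # number of fractions per sample
BIO_REPLICATES = [1, 2]     # number of biological replicates
TECH_REPLICATES = [1, 2]    # number of technical replicates
FILE_NAME_STYLES = {"plain": "run_{i}.raw", "spaces": "run {i} (copy).raw"}

# Reduced column set (assumption): a real SDRF file needs more columns.
COLUMNS = [
    "source name",
    "characteristics[biological replicate]",
    "assay name",
    "comment[label]",
    "comment[fraction identifier]",
    "comment[technical replicate]",
    "comment[data file]",
]


def build_rows(labels, n_fractions, n_bio, n_tech, name_template):
    """Build one row per (run, label channel) for a single experimental setup."""
    rows = []
    run = 0
    for bio in range(1, n_bio + 1):
        for tech in range(1, n_tech + 1):
            for fraction in range(1, n_fractions + 1):
                run += 1
                data_file = name_template.format(i=run)
                # Simplification: every label channel is its own sample.
                for channel, label in enumerate(labels, start=1):
                    rows.append({
                        "source name": f"sample_{bio}_{channel}",
                        "characteristics[biological replicate]": bio,
                        "assay name": f"run {run}",
                        "comment[label]": label,
                        "comment[fraction identifier]": fraction,
                        "comment[technical replicate]": tech,
                        "comment[data file]": data_file,
                    })
    return rows


def to_tsv(rows):
    """Serialise the rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def generate_cases():
    """Yield (case id, TSV text) for every combination of the axes above."""
    axes = itertools.product(
        LABELS.items(), FRACTIONS, BIO_REPLICATES, TECH_REPLICATES,
        FILE_NAME_STYLES.items(),
    )
    for (label_name, labels), n_frac, n_bio, n_tech, (style, template) in axes:
        case_id = f"{label_name}-f{n_frac}-b{n_bio}-t{n_tech}-{style}"
        yield case_id, to_tsv(build_rows(labels, n_frac, n_bio, n_tech, template))
```

The non-compliant and plainly wrong variants could then be derived from these valid cases, for example by dropping a column or renaming a header, so that each broken file differs from a known-good one in exactly one way.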

fabianegli commented 2 years ago

Contents of the deleted `validate_sdrf.sh` file:

    #!/bin/bash

    #pip install sdrf-pipelines

    parse_sdrf convert-maxquant -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD000612/sdrf.tsv" -o1 param.xml -o2 design.txt
    parse_sdrf convert-openms -t2 -l -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD018241/sdrf-phosphoproteomics.tsv"

    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD009602/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD011799/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD000612/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD001819/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD001819/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD002049/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD002088/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD003133/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD004452/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD005940/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD005942/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD005946/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD006401/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD006675/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD006914/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD008840/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD009602/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD010154/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD012203/sdrf.tsv"
    ##parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD017710/sdrf.tsv" SILAC/TMT
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD018117/sdrf.tsv"
    #parse_sdrf convert-openms -s "https://raw.githubusercontent.com/bigbio/proteomics-metadata-standard/master/annotated-projects/PXD011799/sdrf.tsv"

ypriverol commented 2 years ago

@fabianegli thanks for pointing this out. This is the most important thing missing in the repo, and it is a small task that can be tackled. We should have a collection in the tests for CI/CD. As I mentioned to you before, most of the datasets already annotated were post-annotated by the team, so some metadata is missing. However, we already have good examples, @qinchunyuan, that we can start adding here in this issue by topic.
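
A possible shape for such a collection, sketched under assumptions: the examples are grouped by topic in a plain dictionary, and each entry is turned into a `parse_sdrf convert-openms -s <url>` call like the ones in the deleted script. The topic names, the `unsorted` group and the helper names (`sdrf_url`, `conversion_commands`, `run_all`) are hypothetical. The only topic hints taken from this thread are the phosphoproteomics file name for PXD018241 and the SILAC/TMT note next to PXD017710.

```python
import subprocess

BASE_URL = (
    "https://raw.githubusercontent.com/bigbio/"
    "proteomics-metadata-standard/master/annotated-projects"
)

# topic -> list of (accession, SDRF file name). Topic names are illustrative;
# entries without a known topic are parked under "unsorted".
EXAMPLES_BY_TOPIC = {
    "phosphoproteomics": [("PXD018241", "sdrf-phosphoproteomics.tsv")],
    "silac_tmt": [("PXD017710", "sdrf.tsv")],
    "unsorted": [("PXD000612", "sdrf.tsv"), ("PXD009602", "sdrf.tsv")],
}


def sdrf_url(accession, file_name="sdrf.tsv"):
    """Build the raw GitHub URL of an annotated project's SDRF file."""
    return f"{BASE_URL}/{accession}/{file_name}"


def conversion_commands(examples_by_topic):
    """Yield (test id, argv) pairs, one convert-openms call per example."""
    for topic, examples in examples_by_topic.items():
        for accession, file_name in examples:
            argv = ["parse_sdrf", "convert-openms", "-s",
                    sdrf_url(accession, file_name)]
            yield f"{topic}-{accession}", argv


def run_all(examples_by_topic):
    """Run every command and return the ids of those that exit non-zero."""
    failures = []
    for test_id, argv in conversion_commands(examples_by_topic):
        result = subprocess.run(argv, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append(test_id)
    return failures
```

In CI, the pairs from `conversion_commands` could feed `pytest.mark.parametrize`, with the test ids passed as `ids`, so a failing dataset shows up under its topic. Checking in local copies of the files instead of downloading them would also keep the tests independent of the network.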

daichengxin commented 2 years ago

Good idea @fabianegli. There are some examples covering different types.

fabianegli commented 2 years ago

Just linking the tests from jsdrf as a reference.