pepkit / eido

Validator for PEP objects
http://eido.databio.org
BSD 2-Clause "Simplified" License

CSV filter output for multi-value sample attributes #34

Closed: nsheff closed this issue 1 year ago

nsheff commented 2 years ago

If a sample has an attribute with multiple values, the CSV writer writes those values into the CSV as the string form of a Python list:

eido convert https://raw.githubusercontent.com/pepkit/nf-core-pep/master/samplesheet_test.csv --st-index sample -f csv

Result:

Found 2 samples with non-unique names: {'WT_REP1', 'RAP1_UNINDUCED_REP2'}. Attempting to auto-merge.
Running plugin csv
sample,sample,fastq_1,fastq_2,strandedness
WT_REP2,WT_REP2,https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/testdata/GSE110004/SRR6357072_1.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/testdata/GSE110004/SRR6357072_2.fastq.gz,reverse
RAP1_UNINDUCED_REP1,RAP1_UNINDUCED_REP1,https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/testdata/GSE110004/SRR6357073_1.fastq.gz,,reverse
RAP1_IAA_30M_REP1,RAP1_IAA_30M_REP1,https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/testdata/GSE110004/SRR6357076_1.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/testdata/GSE110004/SRR6357076_2.fastq.gz,reverse
WT_REP1,WT_REP1,"['https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/testdata/GSE110004/SRR6357070_1.fastq.gz', 'https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/testdata/GSE110004/SRR6357071_1.fastq.gz']","['https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/testdata/GSE110004/SRR6357070_2.fastq.gz', 'https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/testdata/GSE110004/SRR6357071_2.fastq.gz']",reverse
RAP1_UNINDUCED_REP2,RAP1_UNINDUCED_REP2,"['https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/testdata/GSE110004/SRR6357074_1.fastq.gz', 'https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq/testdata/GSE110004/SRR6357075_1.fastq.gz']",,reverse

For the purposes of the CSV filter, it might make more sense to write multiple rows per sample instead :vomiting_face:
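
A minimal sketch of the multiple-rows-per-sample idea, using only the standard library. This is not eido's code and not necessarily what the eventual fix does; `explode_sample` and `samples_to_csv` are hypothetical names, the samples are plain dicts rather than PEP sample objects, and the file names are shortened stand-ins for the URLs above. It assumes list-valued attributes of one sample line up by position and that single-valued attributes are repeated on every row.

```python
import csv
import io


def explode_sample(sample):
    """Yield one flat row per value position of a sample's list attributes."""
    # Rows needed: length of the longest list-valued attribute, at least 1.
    n_rows = max([1] + [len(v) for v in sample.values() if isinstance(v, list)])
    for i in range(n_rows):
        row = {}
        for key, value in sample.items():
            if isinstance(value, list):
                # Take the i-th value; pad shorter lists with an empty cell.
                row[key] = value[i] if i < len(value) else ""
            else:
                # Single-valued attributes are repeated on every row.
                row[key] = value
        yield row


def samples_to_csv(samples, fieldnames):
    """Write samples as CSV text, one row per value of multi-value attributes."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for sample in samples:
        writer.writerows(explode_sample(sample))
    return buf.getvalue()


# Illustrative data only: shortened file names, not the real URLs.
samples = [
    {"sample": "WT_REP2", "fastq_1": "SRR6357072_1.fastq.gz",
     "fastq_2": "SRR6357072_2.fastq.gz", "strandedness": "reverse"},
    {"sample": "WT_REP1",
     "fastq_1": ["SRR6357070_1.fastq.gz", "SRR6357071_1.fastq.gz"],
     "fastq_2": ["SRR6357070_2.fastq.gz", "SRR6357071_2.fastq.gz"],
     "strandedness": "reverse"},
    {"sample": "RAP1_UNINDUCED_REP2",
     "fastq_1": ["SRR6357074_1.fastq.gz", "SRR6357075_1.fastq.gz"],
     "fastq_2": "", "strandedness": "reverse"},
]

output = samples_to_csv(samples, ["sample", "fastq_1", "fastq_2", "strandedness"])
print(output)
```

With this layout every cell holds a single value, so a downstream reader of the CSV never has to parse a Python list literal; the cost is that sample names are no longer unique per row.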

nsheff commented 1 year ago

I believe this was fixed in https://github.com/pepkit/eido/pull/43