nextflow-io / nf-schema

Functionality for working with pipeline and sample sheet schema files in Nextflow pipelines
https://nextflow-io.github.io/nf-schema/
Apache License 2.0

Strict validation of input samplesheet.csv #7

Open jannikseidelQBiC opened 4 months ago

jannikseidelQBiC commented 4 months ago

The following feature would be beneficial in the future:

The same feature would be nice for samplesheet entries where required fields are missing.

For example, if the following samplesheet is provided:

sample,short_reads_fastq_1,short_reads_fastq_2,long_reads_fastq_2
test,,path/to/first_fastq.gz,

when it should be formatted like this:

sample,short_reads_fastq_1,short_reads_fastq_2,long_reads_fastq_1
test,path/to/first_fastq.gz,,

and either short_reads_fastq_1 or long_reads_fastq_1 is a required field:

The pipeline should then fail with a meaningful error rather than run empty with only warnings.

It would be best if this could be turned on or off in a schema_input.json.
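
To make the requested behaviour concrete, here is a minimal standalone Python sketch of the check described above. It is not nf-schema code: the function name `check_required_any`, the `SamplesheetError` class, and the `REQUIRED_ANY` tuple are all hypothetical, and the rule "at least one of short_reads_fastq_1 or long_reads_fastq_1 must be set" is taken from the example in this comment.

```python
import csv
import io

# Hypothetical rule from the example: at least one of these columns
# must be non-empty in every row.
REQUIRED_ANY = ("short_reads_fastq_1", "long_reads_fastq_1")


class SamplesheetError(ValueError):
    """Raised when a samplesheet row violates the required-field rule."""


def check_required_any(csv_text, required_any=REQUIRED_ANY):
    """Return the parsed rows, or raise SamplesheetError listing every bad row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    problems = []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        # A missing column and an empty cell are both treated as "not set".
        if not any((row.get(col) or "").strip() for col in required_any):
            problems.append(
                f"line {line_no} (sample '{row.get('sample', '')}'): "
                f"one of {', '.join(required_any)} must be set"
            )
    if problems:
        raise SamplesheetError("Invalid samplesheet:\n" + "\n".join(problems))
    return rows


# The two samplesheets from the example above.
bad = (
    "sample,short_reads_fastq_1,short_reads_fastq_2,long_reads_fastq_2\n"
    "test,,path/to/first_fastq.gz,\n"
)
good = (
    "sample,short_reads_fastq_1,short_reads_fastq_2,long_reads_fastq_1\n"
    "test,path/to/first_fastq.gz,,\n"
)
```

Calling `check_required_any(good)` returns the single parsed row, while `check_required_any(bad)` raises `SamplesheetError` naming line 2, so a run would stop before any process starts with empty input.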

nvnieuwk commented 3 months ago

The best solution for this is to create a new configuration option (validation.strictHeaders for example). This can then be turned on to throw errors instead of warnings for unknown headers.
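
As an illustration only, the warn-or-fail switch could behave like the following Python sketch. It does not use nf-schema's API: the `strict_headers` argument merely stands in for the proposed `validation.strictHeaders` option, and `KNOWN_HEADERS` and `check_headers` are hypothetical names.

```python
import csv
import io
import warnings

# Hypothetical set of columns that a schema would declare.
KNOWN_HEADERS = {
    "sample",
    "short_reads_fastq_1",
    "short_reads_fastq_2",
    "long_reads_fastq_1",
}


def check_headers(csv_text, known=KNOWN_HEADERS, strict_headers=False):
    """Warn about unknown headers, or raise ValueError when strict_headers is True."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    unknown = [name for name in header if name not in known]
    if unknown:
        message = "Unknown samplesheet header(s): " + ", ".join(unknown)
        if strict_headers:
            raise ValueError(message)  # strict mode: stop the run
        warnings.warn(message)  # default mode: warn and continue
    return unknown


# Samplesheet from the issue, with the unknown long_reads_fastq_2 column.
sheet = (
    "sample,short_reads_fastq_1,short_reads_fastq_2,long_reads_fastq_2\n"
    "test,,path/to/first_fastq.gz,\n"
)
```

With the default `strict_headers=False`, `check_headers(sheet)` emits a warning and returns the unknown column names; with `strict_headers=True` the same call raises `ValueError`. Keeping the default lenient would preserve the current warning-only behaviour for existing pipelines.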