Open marchoeppner opened 6 years ago
Or a Nextflow params file? https://github.com/nextflow-io/nextflow/issues/208
CSV/TSV is nice and may be necessary here, but I'm also keen for nf-core pipelines to work with minimal input if possible, e.g. still working for someone who turns up with "I have a bunch of FastQ files and know nothing about them." If the pipeline fails because the user doesn't know the platform_model, then that's not ideal.
Of course, that's not to say we can't have both; that would be ideal: work with minimal requirements but also support nice, verbose, well-organised meta files.
For these cases, we actually use this (pardon the crumminess of the code):
It builds a valid input CSV from a folder full of FastQs, with actual values where they can be extracted from the FastQ files and placeholders / best guesses for the other fields. This way you could at least nudge people towards better record keeping ;)
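To make the idea concrete, here is a rough Python sketch of that kind of helper. It is not the script referred to above, just an illustration under some assumptions: the column names (`sample`, `fastq`, `instrument`, `flowcell`, `lane`, `platform_model`) and the `UNKNOWN` placeholder are made up for this example, and only Illumina-style read headers (`@instrument:run:flowcell:lane:tile:x:y ...`) are parsed. Anything else falls back to placeholders.

```python
import csv
import gzip
from pathlib import Path

# Hypothetical placeholder and column layout, chosen for illustration only.
PLACEHOLDER = "UNKNOWN"
COLUMNS = ["sample", "fastq", "instrument", "flowcell", "lane", "platform_model"]
FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def read_first_header(path):
    """Return the first line of a (possibly gzipped) FastQ file."""
    opener = gzip.open if path.name.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return handle.readline().strip()


def parse_header(header):
    """Extract fields from an Illumina-style header; empty dict if it does not match."""
    fields = header.lstrip("@").split(" ")[0].split(":")
    if len(fields) < 7:
        return {}
    return {"instrument": fields[0], "flowcell": fields[2], "lane": fields[3]}


def build_rows(folder):
    """Build one row per FastQ file, with placeholders for anything not extractable."""
    rows = []
    for path in sorted(Path(folder).iterdir()):
        if not path.name.endswith(FASTQ_SUFFIXES):
            continue
        row = dict.fromkeys(COLUMNS, PLACEHOLDER)
        row["sample"] = path.name.split(".")[0]  # best guess from the file name
        row["fastq"] = str(path)
        row.update(parse_header(read_first_header(path)))
        rows.append(row)
    return rows


def write_csv(rows, out_path):
    """Write the rows as a CSV file with a header line."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

Something like `write_csv(build_rows("fastq_dir"), "samples.csv")` would then give the user a file to review, with the `UNKNOWN` cells showing which metadata still needs filling in.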
But two mutually exclusive input channels might also work.
We have a similar idea that we use for germline samples: https://github.com/SciLifeLab/Sarek/blob/master/main.nf#L738-L766
Nice! I guess we could embed such a script into the workflow so that it works with a glob of FastQs or a CSV file..? That would be ideal.
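As a rough illustration of the "glob or CSV" idea (in Python rather than Nextflow, and not taken from the Sarek code linked above), the dispatch could look something like this. The function name `load_samples` and the `fastq` key are hypothetical, and deciding by file extension is just one possible convention.

```python
import csv
import glob


def load_samples(input_spec):
    """Return a list of sample dicts from either a CSV/TSV file or a FastQ glob."""
    lowered = input_spec.lower()
    if lowered.endswith((".csv", ".tsv")):
        # Metadata file: one dict per row, keyed by the header line.
        delimiter = "\t" if lowered.endswith(".tsv") else ","
        with open(input_spec, newline="") as handle:
            return list(csv.DictReader(handle, delimiter=delimiter))
    # Otherwise treat the input as a glob pattern; only the path is known.
    return [{"fastq": path} for path in sorted(glob.glob(input_spec))]
```

Both branches return the same shape (a list of dicts), so downstream steps would not need to care which kind of input the user supplied.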
My vote goes to the "Sarek" approach; it should be fairly straightforward to just steal the code ;)
Same here
For the sake of pulling in relevant metadata, I suggest using CSV/TSV as the default input format rather than a folder with a bunch of FastQ files.
Suggested format would be: