nf-core / bacass

Simple bacterial assembly and annotation pipeline
https://nf-co.re/bacass
MIT License

Feature request: allow directory paths for LongFastQ input #151

Open watsonar opened 4 months ago

watsonar commented 4 months ago

Description of feature

Many labs doing ONT sequencing generate basecalled/demultiplexed FASTQ output split into multiple files in the same directory (this is in fact the default behavior of the MinKNOW software). Would it be possible to add a feature where a path to a directory can be provided for the LongFastQ input, with all FASTQ files within that directory concatenated and used as workflow input? I'd be happy to work on implementing this if it fits with the developers' vision.

Thanks for your consideration! Andrea

d4straub commented 4 months ago

Hi there, I do not oppose the addition, but I do want to share my experience: folder structures change over time and differ between labs, so code that parses your folder structure won't fit that of many others. Some output comes, e.g., with failed and passed FASTQ files in separate folders, so in those cases it does not seem feasible to just concatenate all reads.

Daniel-VM commented 4 months ago

Hi @watsonar! Thanks for your suggestion. I think the idea is great but, as @d4straub mentioned, it comes with some challenges that might place it out of the scope of the current version of nf-core/bacass 🤔.
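
One way to get the requested behaviour without changing the pipeline might be to merge the per-run chunks yourself and then list the merged file under LongFastQ. The following is only a minimal sketch, not part of nf-core/bacass: the function name `concat_fastq_gz` and the default file pattern are hypothetical, and it assumes every chunk is gzip-compressed. It reads only the files directly inside the directory you give it, so pointing it at a folder of passed reads leaves a sibling folder of failed reads untouched.

```python
import shutil
from pathlib import Path


def concat_fastq_gz(run_dir, out_path, pattern="*.fastq.gz"):
    """Concatenate gzipped FASTQ chunks found directly in run_dir into out_path.

    Hypothetical helper, not part of nf-core/bacass. Assumes all matching
    files are gzip-compressed. Subdirectories are not searched.
    Returns the list of chunk paths that were merged, in merge order.
    """
    run_dir = Path(run_dir)
    out_path = Path(out_path)

    # Sort for a reproducible order, and never read the output file itself
    # in case it is written into the same directory.
    chunks = sorted(
        p for p in run_dir.glob(pattern)
        if p.is_file() and p.resolve() != out_path.resolve()
    )
    if not chunks:
        raise FileNotFoundError(f"no files matching {pattern!r} in {run_dir}")

    with open(out_path, "wb") as out:
        for chunk in chunks:
            with open(chunk, "rb") as src:
                # Gzip members can be appended byte for byte; the result is
                # still a valid (multi-member) gzip file.
                shutil.copyfileobj(src, out)
    return chunks
```

For example, `concat_fastq_gz("run01/fastq_pass/barcode01", "barcode01.fastq.gz")` would write one merged file for that folder (the paths here are placeholders; adjust them to your own layout). Because folder layouts differ between labs, the caller decides which directory to merge rather than the code guessing at a structure.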