epi2me-labs / wf-clone-validation

Fix for "Error: None of the directories given contain .fastq(.gz) files." #6

Closed drpatelh closed 1 year ago

drpatelh commented 2 years ago

As reported by one of our customers when testing this pipeline on AWS Batch via Nextflow Tower.

[image: log output]

The input variable is overwritten by a file object, and that file object is then passed to the get_subdirectories function, which expects a string.

The fix was to re-assign the input variable as done in this PR.
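
For readers unfamiliar with this failure mode, here is a minimal Python analogue of the overwrite described above. It is not the workflow's Groovy code: `get_subdirectories` here is a hypothetical stand-in that only mimics "expects a string", and the directory names are made up. Keeping the string and the path object under separate names is one way to avoid the problem; the PR's actual change may differ.

```python
from pathlib import PurePosixPath


def get_subdirectories(root: str) -> list[str]:
    """Hypothetical stand-in: uses string methods, so it needs a str."""
    # Derive two made-up child locations from the string.
    return [root.rstrip("/") + "/" + name for name in ("barcode01", "barcode02")]


def buggy(input_arg: str) -> list[str]:
    # The string parameter is overwritten with a path object...
    input_arg = PurePosixPath(input_arg)
    # ...and that object is handed to a function that wants a string.
    # PurePosixPath has no rstrip(), so this raises AttributeError.
    return get_subdirectories(input_arg)


def fixed(input_arg: str) -> list[str]:
    # Bind the path object to a separate name and leave the string untouched.
    input_path = PurePosixPath(input_arg)
    _ = input_path  # would be used for file-level checks
    return get_subdirectories(input_arg)
```

Calling `fixed("data/run1/")` returns the two child locations, while `buggy("data/run1")` fails inside `get_subdirectories`.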

For testing purposes, I used a random dataset containing FastQ files but the core parameter changes to the pipeline can be seen in the log output above.

cjw85 commented 2 years ago

Isn't the real issue here the string interpolation of the file object? It's not easy to know these things when the Groovy documentation is so poor and approximately no one in the world uses it! 🤣

drpatelh commented 2 years ago

Hey, don't shoot the messenger! This was a beast to find 🕵🏽

cjw85 commented 2 years ago

I'm just after the correct solution. We had issues before that basically ended up with "don't use path objects in string interpolation"; I don't know why those lines still don't use .resolve().
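
To make the "don't use path objects in string interpolation" point concrete, here is a toy Python sketch. `RemotePath` is a hypothetical class written for this illustration, and its lossy `__str__` is an assumption about how a remote path type might render. It is not Nextflow's API; only the method name `resolve` is borrowed from the comment above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemotePath:
    """Toy remote path (hypothetical): str() omits the scheme by assumption."""
    scheme: str
    path: str

    def __str__(self) -> str:
        # Lossy string form: the scheme is not included.
        return self.path

    def resolve(self, child: str) -> "RemotePath":
        # Join through the object so the scheme travels with the result.
        return RemotePath(self.scheme, self.path.rstrip("/") + "/" + child)

    def to_uri(self) -> str:
        # Full location, scheme included.
        return f"{self.scheme}://{self.path.lstrip('/')}"


base = RemotePath("s3", "/bucket/run1")

# Interpolation calls __str__, so the scheme is silently dropped.
interpolated = f"{base}/barcode01"

# Resolving on the object keeps the scheme attached.
resolved = base.resolve("barcode01")
```

Here `interpolated` is a bare string with no scheme, whereas `resolved.to_uri()` still carries it.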

This code path isn't exercised in our internal tests on AWS Batch (we test on all sorts of schedulers and file systems/block stores) for these sorts of reasons.

drpatelh commented 2 years ago

Understandable 👍🏽 Personally, I tend to avoid interacting with file paths as much as possible outside of processes but I realise it's not always possible. Mainly to allow us to delegate these sorts of cross-platform weirdities to core NF functionality.

More tests FTW!!

cjw85 commented 2 years ago

Personally, I tend to avoid interacting with file paths as much as possible outside of processes but I

Us three, but unfortunately Nextflow processes cannot be used in callbacks/closures to channel operators. We couldn't find a way to achieve what we wanted here through a process (hence being able to write in a more common language). Writing Nextflow/Groovy scripting seemed to be the only way.

cjw85 commented 2 years ago

We've added more tests to our internal CI pipelines which test the fastq_ingress.nf code which is shared across all our workflows. We'll have an updated release of wf-clone-validation shortly.