bio-raum / FooDMe2

A nextflow pipeline for the identification of species from mixed samples based on mitochondrial amplicons
https://bio-raum.github.io/FooDMe2/latest
GNU General Public License v3.0

[Bug] CI test #48

Closed gregdenay closed 1 month ago

gregdenay commented 1 month ago

The GitHub action for pipeline testing fails because of a problem with Singularity: https://github.com/bio-raum/FooDMe2/actions/runs/10285596571/job/28464249170?pr=47

Pulling Singularity image https://depot.galaxyproject.org/singularity/ubuntu:20.04 [cache /home/runner/work/FooDMe2/FooDMe2/./depot.galaxyproject.org-singularity-ubuntu-20.04.img]
ERROR ~ Error executing process > 'BUILD_REFERENCES:UNTAR_TAXONOMY (1)'

Caused by:
  Failed to pull singularity image
    command: singularity pull  --name depot.galaxyproject.org-singularity-ubuntu-20.04.img.pulling.1723038540650 https://depot.galaxyproject.org/singularity/ubuntu:20.04 > /dev/null
    status : 127
    hint   : Try and increase singularity.pullTimeout in the config (current is "20m")
    message:
      bash: line 1: singularity: command not found
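
The exit status 127 in the log above is what the shell reports when a command cannot be found, so the image pull never started. A minimal sketch of a pre-flight check that a CI step could run before launching the pipeline; the function name `require_tool` is hypothetical and not part of FooDMe2:

```shell
# Return 0 if the named tool is on PATH, 127 otherwise (the same status
# the shell uses for "command not found").
# `require_tool` is an illustrative helper, not part of the pipeline.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "ERROR: '$1' not found on PATH" >&2
    return 127
}

# Possible use in a CI step, before the pipeline is launched:
# require_tool singularity || exit $?
```

Failing early this way would surface the missing install directly instead of as a pull timeout hint.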
gregdenay commented 1 month ago

Fixed the Singularity problem. However, using the build_reference workflow is too demanding for the allocated space (max 15 GB total), even with --skip_genbank.

Even when keeping it to a minimum, about 1 GB of databases is required, to which we have to add the test samples. That seems too much to download on every CI run.

I only see these solutions:

marchoeppner commented 1 month ago

Good point, I forgot about the databases tbh, because originally eutaxpro didn't need such an extensive reference ecosystem.

I generally like to have a pipeline test, but in this particular instance - because we have an easy way to test the pipeline offline and are only 2 people - I'd be ok with disabling this for the time being.

gregdenay commented 1 month ago

I added a very small test set adapted from FooDMe1. It is about 2 MB in total, with 3 mock samples and stripped-down databases.

It should be enough to quickly spot big problems in the PR but doesn't replace thorough testing.

Commit b6e4050

gregdenay commented 1 month ago

I'll add a conda test case too, just to be sure

marchoeppner commented 1 month ago

In case you haven't hunted this down yet, the conda failure is here: https://github.com/bio-raum/FooDMe2/blob/568332e44e5578fd975dc776b26b86ba15a10dba/modules/dada2/quality/main.nf#L5

This should read:

conda "${moduleDir}/environment.yml"
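
To check whether other modules have the same problem, something like the following could list every `conda` directive that does not use the form shown above. This is only a sketch: it assumes modules live in `main.nf` files under a `modules/` directory (as in the linked path) and that every module is meant to point at `${moduleDir}/environment.yml`. The function name is hypothetical.

```shell
# List conda directives in module files that do NOT reference
# "${moduleDir}/environment.yml". Output is grep's usual
# path:line:content format, intended for manual review.
# Assumption: modules are main.nf files below the given directory.
find_odd_conda_directives() {
    modules_dir="${1:-modules}"
    grep -rnE --include='main.nf' '^[[:space:]]*conda[[:space:]]' "$modules_dir" \
        | grep -vF '"${moduleDir}/environment.yml"' || true
}

# Possible use from the repository root:
# find_odd_conda_directives modules
```

An empty result would mean that every directive found already uses the expected path.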

gregdenay commented 1 month ago

Tests are now successful. https://github.com/bio-raum/FooDMe2/actions/runs/10301426080

We could add some more for docker, vsearch, etc. We also have to start thinking about what we want to do about Nanopore when we get there.

marchoeppner commented 1 month ago

Good job!

Happy to add my review to the PR; is it ready now? It seems to be still stuck in failure, but I guess the tests need to be re-triggered?

As for more tests, I suppose GitHub will start coming for our throats if we burn more compute time ;) In principle you are right though. But let's cross that bridge at a later point.

Nanopore, well - once we have good data, I hope we can make some progress on that front as well. Happy to get the 1.0 release out first though (really close now!)

gregdenay commented 1 month ago

Need to manually trigger the run from the dev branch to see it. The PR always uses the workflows from the main branch, which are broken...

Marking this as resolved for now
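
For reference, the manual trigger mentioned above can also be done from the command line with the GitHub CLI, provided the workflow declares a `workflow_dispatch` trigger. A sketch only: the workflow file name `ci.yml` is a placeholder and not taken from the repository, and the helper `trigger_ci` is hypothetical. By default it just prints the command.

```shell
# Build (and optionally run) the GitHub CLI call that dispatches a
# workflow on a given branch. With DRY_RUN=1 (the default) the command
# is only printed, so nothing is sent to GitHub.
trigger_ci() {
    workflow="$1"   # workflow file or name (placeholder, see lead-in)
    branch="$2"     # branch whose copy of the workflow should run
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "gh workflow run $workflow --ref $branch"
    else
        gh workflow run "$workflow" --ref "$branch"
    fi
}

# Print the command for the dev branch without contacting GitHub:
trigger_ci ci.yml dev
```

Setting `DRY_RUN=0` would actually dispatch the run, which requires an authenticated `gh` session.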