chrisjackson-pellicle / hybpiper-nf

Nextflow and Singularity/Conda pipeline for running HybPiper (https://github.com/mossmatters/HybPiper)
GNU General Public License v3.0

Main pipeline hung up on SUMMARY_STATS #14

Open laramiemckenna opened 11 months ago

laramiemckenna commented 11 months ago

I'm running the main hybpiper pipeline through Singularity and Nextflow on a cluster that uses the SLURM workload manager. The first time I ran it, it timed out during the SUMMARY_STATS step, so I gave that step more time and memory in the config file. However, I'm starting to doubt this is a memory or time issue, since it has now been running for over 54 hours and I only have 13 relatively small samples. Any idea what could be causing this? What kinds of files would be most helpful to provide for debugging something like this? Below is the last thing in my current .out file.

executor >  slurm (16)
[-        ] process > assemble:assemble_main:COMB... -
[-        ] process > assemble:assemble_main:COMB... -
[-        ] process > assemble:assemble_main:TRIM... -
[-        ] process > assemble:assemble_main:TRIM... -
[-        ] process > assemble:assemble_main:ASSE... -
[5b/78bb5b] process > assemble:assemble_main:ASSE... [100%] 13 of 13 ✔
[-        ] process > assemble:assemble_main:ASSE... -
[0d/435eb3] process > assemble:assemble_main:SUMM... [  0%] 0 of 1
[-        ] process > assemble:assemble_main:VISU... -
[dc/921e0f] process > assemble:assemble_main:RETR... [100%] 1 of 1 ✔
[3b/f8dfc6] process > assemble:assemble_main:PARA... [100%] 1 of 1 ✔

chrisjackson-pellicle commented 11 months ago

Hi @laramiemckenna,

That's odd. Can you tell me:

1) Are you running this on a multi-node HPC with a shared file system?

2) Have you tried running the pipeline again using -resume?

3) How much memory did you give the SUMMARY_STATS step? (I assume you're modifying the profile slurm_singularity in the hybpiper.config file?)

4) Did you run the pipeline using BLASTX (default) or with --bwa?

The SUMMARY_STATS step is simply running the hybpiper stats command, as you can see here. It's possible that the stats module is performing some Input/Output work that doesn't play nicely with your file system.

Can you please try running the hybpiper-nf pipeline using the read data and target file from the HybPiper test dataset, found here?

Also, can you check in the temporary work directory that Nextflow has created for the SUMMARY_STATS step (work -> 0d -> 435eb3...etc, for the run above), and tell me which files are present?
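If it helps, here is a minimal Python sketch for collecting that listing. It assumes the Nextflow work directory is `./work` relative to where the pipeline was launched and uses the task hash prefix `0d/435eb3` from the log above; the function names are hypothetical and not part of hybpiper-nf. Nextflow normally writes hidden files such as `.command.sh`, `.command.log`, `.command.err` and `.exitcode` into each task directory, so the listing includes dotfiles.

```python
from pathlib import Path


def find_task_dirs(work_root, hash_prefix):
    """Return task directories under work_root that match a Nextflow
    hash prefix such as '0d/435eb3' (as shown in the console output)."""
    first, _, rest = hash_prefix.partition("/")
    parent = Path(work_root) / first
    if not parent.is_dir():
        return []
    return sorted(
        p for p in parent.iterdir() if p.is_dir() and p.name.startswith(rest)
    )


def list_task_files(task_dir):
    """Return (name, size_in_bytes) for every entry in task_dir,
    including hidden files. Non-regular entries are reported as size 0."""
    entries = []
    for p in sorted(Path(task_dir).iterdir()):
        size = p.stat().st_size if p.is_file() else 0
        entries.append((p.name, size))
    return entries


if __name__ == "__main__":
    # Assumption: 'work' is the default Nextflow work directory for this run.
    for task_dir in find_task_dirs("work", "0d/435eb3"):
        print(task_dir)
        for name, size in list_task_files(task_dir):
            print(f"  {size:>12}  {name}")
```

An absent `.exitcode` file, or an empty `.command.log`, would suggest the task is still running or has stalled rather than failed, so those are worth mentioning along with the file list.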

Cheers,

Chris