tecangenomics / nudup

NuDup -- Marks/removes duplicate molecules based on the molecular tagging technology used in Tecan products.
http://www.tecangenomics.com
GNU Lesser General Public License v3.0

nudup instances loop endlessly when run in parallel inside a container #18

Open AdrianS85 opened 5 years ago

AdrianS85 commented 5 years ago

Dear all,

I was trying to run 20 BAM files through nudup.py in parallel inside a Singularity container. I used parallel processing as implemented in Snakemake, Nextflow and GNU parallel. The result was always the same: some of the files were processed, while others ran (or just hung?) forever. I have logged GNU parallel's stdout/stderr during execution of the nudup script - https://github.com/AdrianS85/varia/blob/master/nudup_raport.txt. Please notice that the same jobs are issued over and over again. For comparison, the strip script, which is run before nudup.py, runs in parallel properly - https://github.com/AdrianS85/varia/blob/master/strip_raport.txt

I will be thankful for any help, Adrian

EDIT: The program also hangs when run in parallel outside the container. I've also produced another report for the failed jobs: https://github.com/AdrianS85/varia/blob/master/nudup_errors_more

https://github.com/nugentechnologies/nudup/blob/468c62e948b87fa8a3231355d26cc37e1d0717d5/nudup.py#L1

mlovci commented 5 years ago

Hi @AdrianS85 - I am not familiar with Singularity or snakemake, but we can try to help. A few things to check:

AdrianS85 commented 5 years ago

Dear mlovci,

I also tried running only 5 files in parallel, in case this was a RAM issue, but it still seems to work or fail randomly, sometimes after producing 16 files, sometimes after a single one.
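
For anyone trying to reproduce this, a capped run can be scripted so that a hung job is told apart from a slow or failing one. The following is only a minimal sketch and not part of nudup: `run_with_timeout` and `run_batch` are hypothetical helper names, the limit of 5 workers simply mirrors the test above, the one-hour timeout is an arbitrary assumption, and each entry in `cmds` stands for whatever full command line (as a list of arguments) you would normally launch per BAM file.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_with_timeout(cmd, timeout_s):
    """Run one command and classify it as 'ok', 'failed', or 'timeout'.

    `cmd` is a list of arguments (placeholder for the real per-file command).
    subprocess.run kills the child process if the timeout expires.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return cmd, "timeout"
    return cmd, "ok" if result.returncode == 0 else "failed"


def run_batch(cmds, max_workers=5, timeout_s=3600):
    """Run commands with at most `max_workers` at once.

    Returns a list of (cmd, status) tuples in the same order as `cmds`.
    Threads are sufficient here because the work happens in child processes.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda c: run_with_timeout(c, timeout_s), cmds))
```

Jobs that come back as `timeout` are the ones that appear to hang, while `failed` jobs exited with a non-zero status, so the two cases can be reported separately.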

Best, Adrian