aidenlab / juicer

A One-Click System for Analyzing Loop-Resolution Hi-C Experiments
http://aidenlab.org
MIT License

Juicer align does not wait for splitting to finish #265

Open dmacguigan opened 2 years ago

dmacguigan commented 2 years ago

My labmate and I have encountered an issue with the SLURM version of Juicer v1.6. We are using a Linux computing cluster:

LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.9.2009 (Core)
Release: 7.9.2009

It appears that the alignment step and all subsequent steps do not wait for the FASTQ splitting step to finish. To investigate, I made some modifications at line 700 in the juicer.sh script:

echo "dependsplit = ${dependsplit}"
date
echo "starting wait srun"
echo "srun -c 1 -p $queue -t 1 -o $debugdir/wait-%j.out -e $debugdir/wait-%j.err -d $dependsplit -J ${groupname}_wait sleep 1"
srun -c 1 -p "$queue" -t 1 -o $debugdir/wait-%j.out -e $debugdir/wait-%j.err -d $dependsplit -J "${groupname}_wait" sleep 1
date
echo "finished wait srun"

Here is the relevant output from my modified copy of juicer.sh:

dependsplit = afterok:7647558:7647559 
Tue Feb  8 16:12:00 EST 2022
starting wait srun
srun -c 1 -p general-compute -t 1 -o /projects/academic/tkrabben/NJCB/ca_gen_1_juicer_purged_new_pipeline_02-08-2022/debug/wait-%j.out -e /projects/academic/tkrabben/NJCB/ca_gen_1_juicer_purged_new_pipeline_02-08-2022/debug/wait-%j.err -d afterok:7647558:7647559 -J a1644354720_wait sleep 1
Tue Feb  8 16:12:02 EST 2022
finished wait srun

The "srun" command should block until the splitting batch jobs are complete. Instead, according to the timestamps above, it returned after about 2 seconds. In the job queue, we can see 2 splitting jobs running at the same time as an align job. From my understanding, this should not happen.

[dmacguig@vortex2:~/project/NJCB/ca_gen_1_juicer_purged_new_pipeline_02-08-2022]$ squeue -u njbacken
             JOBID PARTITION                           NAME     USER ST       TIME  NODES     NODELIST(REASON)
           7647560 general-c a1644354720_**.fastq_Count_Lig njbacken CG       0:01      1        cpn-d09-27-01
           7647564 general-c          a1644354720_fragmerge njbacken PD       0:00      1         (Dependency)
           7647574 general-c              a1644354720_hic30 njbacken PD       0:00      1         (Dependency)
           7647573 general-c                a1644354720_hic njbacken PD       0:00      1         (Dependency)
           7647562 general-c     a1644354720_merge_**.fastq njbacken PD       0:00      1         (Dependency)
           7647572 general-c              a1644354720_stats njbacken PD       0:00      1         (Dependency)
           7647571 general-c              a1644354720_stats njbacken PD       0:00      1         (Dependency)
           7647570 general-c              a1644354720_stats njbacken PD       0:00      1         (Dependency)
           7647576 general-c     a1644354720_arrowhead_wrap njbacken PD       0:00      1         (Dependency)
           7647575 general-c       a1644354720_hiccups_wrap njbacken PD       0:00      1         (Dependency)
           7647567 general-c         a1644354720_post_dedup njbacken PD       0:00      1         (Dependency)
           7647563 general-c              a1644354720_check njbacken PD       0:00      1         (Dependency)
           7647577 general-c          a1644354720_prep_done njbacken PD       0:00      1         (Dependency)
           7647566 general-c              a1644354720_dedup njbacken PD       0:00      1         (Dependency)
           7647569 general-c           a1644354720_prestats njbacken PD       0:00      1         (Dependency)
           7647568 general-c           a1644354720_dupcheck njbacken PD       0:00      1         (Dependency)
           7647565 general-c        a1644354720_dedup_guard njbacken PD       0:00      1        (JobHeldUser)
           7647561 general-c    a1644354720_align1_**.fastq njbacken  R       0:01      1        cpn-d07-29-02
           7647558 general-c a1644354720_split_/projects/ac njbacken  R       0:01      1        cpn-d09-04-01
           7647559 general-c a1644354720_split_/projects/ac njbacken  R       0:01      1        cpn-d09-04-01
           7647557 general-c                a1644354720_cmd njbacken  R       0:01      1        cpn-d09-04-01

This is a problem because the remaining Juicer steps then use only a subset of the total reads. In ./aligned/inter.txt we see that Juicer used only the first chunk of 22,500,000 read pairs.

If we run Juicer again using previously generated split FASTQ files, Juicer functions properly and ./aligned/inter.txt reports 205,807,309 read pairs.
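To make the comparison between the two runs easier to reproduce, here is a minimal sketch for counting the reads present in the split FASTQ files. It assumes uncompressed FASTQ with exactly 4 lines per record, so a chunk of 22,500,000 reads corresponds to 22,500,000 × 4 = 90,000,000 lines. The function names are hypothetical helpers and are not part of Juicer.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Juicer): count reads in split FASTQ files.
# Assumption: uncompressed FASTQ with exactly 4 lines per record.

# Print the number of reads in a single FASTQ file.
count_fastq_reads() {
    local lines
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# Print the total number of reads across all FASTQ files given as arguments.
total_fastq_reads() {
    local total=0 f
    for f in "$@"; do
        total=$(( total + $(count_fastq_reads "$f") ))
    done
    echo "$total"
}
```

For example, running `total_fastq_reads` on all read-1 chunks in the splits directory (the exact file names depend on the run) gives a total that can be compared with the read pair count reported in ./aligned/inter.txt.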

I noticed at least one other user might have a similar issue, considering their inter.txt file also reports exactly 22,500,000 read pairs (https://groups.google.com/u/1/g/3d-genomics/c/3_Ok2-CydmU/m/almfv6tjAAAJ).

Do you have any advice on how to proceed? Is it possible this is a bug in the SLURM version of juicer.sh? Happy to share our files and scripts if needed.
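As one possible stopgap, the wait could be done by polling the queue instead of relying on the srun dependency. The following is only a sketch under assumptions, untested against Juicer itself: `dep_to_joblist` and `wait_for_jobs` are hypothetical helper names, and the loop only detects that the split jobs have left the queue, not that they exited successfully (which is weaker than `afterok`).

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Juicer): wait for SLURM jobs by polling squeue.

# Convert a dependency string such as "afterok:111:222" into "111,222".
dep_to_joblist() {
    local dep="${1#*:}"            # drop the leading dependency type, e.g. "afterok"
    dep="${dep//[[:space:]]/}"     # remove stray whitespace
    echo "${dep//:/,}"             # squeue -j expects a comma-separated list
}

# Block until squeue no longer lists any of the given jobs.
# Arguments: comma-separated job IDs, optional polling interval in seconds.
# Caveat: this does not check whether the jobs succeeded.
wait_for_jobs() {
    local joblist="$1" interval="${2:-30}"
    while [ -n "$(squeue -h -j "$joblist" -o %i 2>/dev/null)" ]; do
        sleep "$interval"
    done
}
```

With the values from the output above, this would be called as `wait_for_jobs "$(dep_to_joblist "$dependsplit")"` in place of the wait srun.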