PacificBiosciences / FALCON_unzip

Diploid assembly is becoming common practice in genomic studies
BSD 3-Clause Clear License

Quiver step of toy sample install_unzip_180725.sh #131

Closed BenjaminSchwessinger closed 6 years ago

BenjaminSchwessinger commented 6 years ago

Hi all, Thank you very much for the latest release install_unzip_180725.sh.

I have now installed it but got stuck at the quiver step of the toy sample; everything else looks fine. My issue is the following.

The following command crashes.

pbalign --tmpDir=/scratch/ --nproc=24 --minAccuracy=0.75 --minLength=50 --minAnchorSize=12 --maxDivergence=30 --concordant --algorithm=blasr --algorithmOptions=--useQuality --maxHits=1 --hitPolicy=random --seed=1 /home/smrtanalysis/falcon_180725/src/FALCON-integrate/FALCON-examples/run/greg200k-sv2/4-quiver/segregate-run/segr001/segregated/000000Fp01_00001/000000Fp01_00001.bam /home/smrtanalysis/falcon_180725/src/FALCON-integrate/FALCON-examples/run/greg200k-sv2/4-quiver/quiver-split/refs/000000Fp01_00001/ref.fa aln-000000Fp01_00001.bam

The issue is that the samtools sort step uses the -m flag (-m 4G), which allocates memory per thread rather than overall. This crashes my setup right now. Even if I change the fc_unzip.cfg file as follows, pbalign still uses 24 processes.

[job.step.unzip.track_reads]
njobs=1
NPROC=4
[job.step.unzip.blasr_aln]
njobs=2
NPROC=4
[job.step.unzip.phasing]
njobs=16
NPROC=2
[job.step.unzip.hasm]
njobs=1
NPROC=12

Is there a way to change the settings so that samtools sort or pbalign uses fewer threads/less memory? samtools works just fine with fewer threads or without the -m 4G flag.
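As a back-of-the-envelope sketch of why the per-thread -m flag is a problem (a hypothetical illustration; the 24-thread and 4G figures are taken from the command above):

```python
def total_sort_memory_gb(threads: int, mem_per_thread_gb: int) -> int:
    """Worst-case memory for `samtools sort -@ <threads> -m <mem>G`:
    the -m value is allocated per sorting thread, not shared across them."""
    return threads * mem_per_thread_gb

# With the hard-coded -m 4G and the 24 threads implied by --nproc=24,
# the sort step alone can request 24 * 4 = 96 GiB on a single node.
print(total_sort_memory_gb(24, 4))  # prints 96
```

This is why the same command succeeds on a large-memory machine but crashes a shared head node.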

Otherwise all looks good, and I will go ahead with the FALCON part of my assembly.

pb-cdunn commented 6 years ago

First, please look at the Binaries. The docs you're looking at are probably out-of-date.

Second, we will release a new binary today, to drop the --tmpDir=/scratch for pbalign.

Third, we'll soon look into ways to set the number of procs/threads for pbalign, and the memory for samtools. Not sure when that change will be ready.

The issue is that the samtools sort step uses the -m flag (-m 4G), which allocates memory per thread rather than overall.

Fourth, thank you so much for reporting this! It hasn't harmed us because we have so much memory on our machines, but we'll look for a bug-fix soon.

If we come up with a short-term fix, I'll let you know.

BenjaminSchwessinger commented 6 years ago

Thanks Chris.

I got the latest falcon from here https://pb-falcon.readthedocs.io/en/latest/quick_start.html and wasn't sure where to raise the issue. Thanks for pointing me in the right direction.

I have now installed everything successfully. I would just recommend advising folks to run the install script from within a qlogin session. I installed it directly on the head node and ran into memory and core usage issues; my admins weren't too happy. I have since fixed everything, thanks.

pb-cdunn commented 6 years ago

I would just recommend advising folks to run the install script from within a qlogin session. I installed it directly on the head node and ran into memory and core usage issues; my admins weren't too happy.

Hmmm. Could you give us more info? The system is designed to be pretty efficient on the driver node. We fork Python for each qsub call (depending on which pwatcher you use), but that's only njobs*PythonSize. Runtime should be quite low because we have a geometric back-off for sleep-times between checks. Even with fs_based, we check only 1 file (run.sh.done) per running qsub job per sleep-cycle.

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 9539 cdunn     20   0 2055100  36796   8424 S   0.3  0.0   0:05.32 python

That's typical. Or

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
10768 cdunn     20   0  433660  28080   5852 S   0.3  0.0   0:00.66 python
10699 cdunn     20   0  160356   2900   1588 R   0.0  0.0   0:00.42 top
10808 cdunn     20   0   62712   7300   1940 S   0.0  0.0   0:00.01 qsub
10810 cdunn     20   0   62712   7296   1940 S   0.0  0.0   0:00.01 qsub

It spikes briefly when submitting a job -- around 10% CPU per job, for less than a second.

Were you using job_type=local?
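The geometric back-off between checks described above could be sketched roughly like this (a minimal illustration, not FALCON's actual pwatcher code; `is_done` stands in for the check of a sentinel file such as run.sh.done):

```python
import time

def wait_for_done(is_done, initial=0.1, factor=2.0, cap=60.0, sleep=time.sleep):
    """Poll `is_done()` until it returns True, sleeping between checks with a
    capped geometric back-off so an idle driver process stays near 0% CPU.
    Returns the number of checks performed."""
    delay = initial
    checks = 1
    while not is_done():
        sleep(delay)
        delay = min(delay * factor, cap)  # geometric back-off, capped
        checks += 1
    return checks
```

With one such check per running qsub job per sleep cycle, the driver's load is dominated by brief submission spikes rather than by polling.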

BenjaminSchwessinger commented 6 years ago

The FALCON dummy example works fine. The issue is with the unzip part. Right now the quiver step has hard-coded memory and thread requirements. This caused issues on my head node (24 cores and quite a bit of memory). Once this is flexible and the corresponding cfg file is fixed, I don't see an issue.

pb-cdunn commented 6 years ago

Yes, quiver is currently hard-coded for threads/memory. However, it's supposed to run on a distributed node. You should not run the tasks locally if you are sharing a node with others.

Maybe we mean something different by "head node"? Yes, we need to respect the limits of the remote node that the job is distributed to.

BenjaminSchwessinger commented 6 years ago

I think the confusion comes from me not being specific: I installed FALCON with the script provided here https://pb-falcon.readthedocs.io/en/latest/quick_start.html. The 'install_unzip.sh' script runs a toy sample at the end without distribution. This works fine for the FALCON part, but the unzip part really asks for too much memory without distribution. Hope this makes more sense now.

pb-cdunn commented 6 years ago

Yes. We need to update those docs with memory requirements.