PacificBiosciences / FALCON_unzip

Making diploid assembly becomes common practice for genomic study
BSD 3-Clause Clear License

Choose unzip_concurrent_jobs and quiver_concurrent_jobs parameters. #54

Closed. a-velt closed this issue 7 years ago

a-velt commented 7 years ago

Dear developers,

I have a question regarding the two options unzip_concurrent_jobs and quiver_concurrent_jobs in the fc_unzip.cfg file. I don't have SGE, so I launch falcon_unzip in local mode.

I want to launch falcon-unzip on a node with 24 CPUs, but I don't know how many CPUs each step uses by default, so it is difficult to choose how many concurrent jobs I can run.

This differs from falcon-integrate, which lets you choose the number of CPUs for each step, so choosing the number of concurrent jobs is simple there.

I have chosen 5 concurrent jobs, but I may be underestimating this parameter, which would slow the analysis.

Do you have any tips on this?

Best, Amandine

a-velt commented 7 years ago

Dear developers,

Sorry to ask an entirely different question.

Falcon-unzip failed at the 1-hasm step because it looked for the "../../2-asm-falcon/las.fofn" file and did not find it.

My Falcon-integrate run completed without error beforehand and generated the p_ctg.fa and a_ctg.fa files, but there is no las.fofn in the 2-asm-falcon folder.

I do have two other files with that name: 0-rawreads/merge-gather/las.fofn and 1-preads_ovl/merge-gather/las.fofn.

So I could create a symbolic link in the 2-asm-falcon folder, but I don't know which las.fofn I should use.

Could someone help me?

Thank you in advance and have a nice day.

Best, Amandine

EDIT: Correct me if I am wrong, but I think I have to use 1-preads_ovl/merge-gather/las.fofn. I created the symbolic link in the 2-asm-falcon folder, removed the 1-hasm folder, and re-ran Falcon-unzip; it is running well for the moment.
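
For anyone trying the same workaround, here is a minimal sketch of the symlink step in Python. The function name `link_las_fofn` and the `run_dir` argument are hypothetical; the only things taken from this thread are the two paths. Whether 1-preads_ovl/merge-gather/las.fofn is really the right file is my guess, and the replies further down point to a version mismatch as the underlying cause, so treat this as a stopgap rather than a fix.

```python
import os


def link_las_fofn(run_dir):
    """Link 2-asm-falcon/las.fofn to the preads las.fofn (workaround sketch).

    run_dir is assumed to be the top-level FALCON run directory, i.e. the
    one containing 1-preads_ovl/ and 2-asm-falcon/.
    """
    target = os.path.join(run_dir, "1-preads_ovl", "merge-gather", "las.fofn")
    link = os.path.join(run_dir, "2-asm-falcon", "las.fofn")

    if not os.path.isfile(target):
        raise FileNotFoundError(target)

    # Leave an existing file or link untouched rather than overwrite it.
    if os.path.lexists(link):
        return link

    # Use a relative link so the run directory can still be moved.
    relative_target = os.path.relpath(target, os.path.dirname(link))
    os.symlink(relative_target, link)
    return link
```

After creating the link I also removed the 1-hasm folder before re-running, as described above.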

pb-cdunn commented 7 years ago

At the moment, these settings are the number of "jobs", not the number of "cpus". Yes, that's a problem for running locally.

Jason, how many cpus are needed for each job? I have these settings from your old run:

sge_phasing= -pe smp 12 -q bigmem
sge_track_reads= -pe smp 12 -q default
sge_blasr_aln=  -pe smp 24 -q bigmem
sge_hasm=  -pe smp 48 -q bigmem

sge_quiver= -pe smp 12 -q sequel-farm

By the above, concurrency should be ncpus/12 for the Quiver stage, but it needs different values for different unzip steps.
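
To make that arithmetic concrete, here is a rough sketch in Python. It assumes each job really uses as many CPUs as the `-pe smp` slot count in the old SGE settings above, which has not been confirmed in this thread. The function names are made up for illustration, and sizing the single unzip setting for the largest step is a conservative choice, not a documented rule.

```python
# Slots per job, copied from the "-pe smp N" values quoted above.
SLOTS_PER_JOB = {
    "phasing": 12,
    "track_reads": 12,
    "blasr_aln": 24,
    "hasm": 48,
    "quiver": 12,
}

UNZIP_STEPS = ("phasing", "track_reads", "blasr_aln", "hasm")


def max_concurrent_jobs(ncpus, slots_per_job):
    """Number of jobs of the given size that fit on ncpus (never below 1)."""
    return max(1, ncpus // slots_per_job)


def unzip_concurrency(ncpus, steps=UNZIP_STEPS):
    """One setting covers every unzip step, so size it for the largest one."""
    return min(max_concurrent_jobs(ncpus, SLOTS_PER_JOB[s]) for s in steps)


if __name__ == "__main__":
    ncpus = 24  # the node size mentioned in the question
    print("quiver_concurrent_jobs =",
          max_concurrent_jobs(ncpus, SLOTS_PER_JOB["quiver"]))
    print("unzip_concurrent_jobs =", unzip_concurrency(ncpus))
```

Under these assumptions a 24-CPU node gets 2 concurrent Quiver jobs and 1 concurrent unzip job, because a 48-slot hasm job would already oversubscribe the node on its own.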

If we specify the number of cpus when we generate a PypeTask (which I've planned to do anyway), then we can switch to new max_concurrency_<step> settings which refer to the actual number of CPUs, rather than to the number of jobs.

pb-cdunn commented 7 years ago

Amandine, your second comment is a separate problem. You need the latest FALCON and pypeFLOW to use the latest FALCON_unzip.

We include FALCON_unzip in the latest FALCON-integrate, so you would have consistent versions of everything if you base your installation on FALCON-integrate. The wiki shows how to install Falcon, but for now you still need to install FALCON_unzip yourself, using the git-submodule. (I hope that's clear.)

a-velt commented 7 years ago

pb-cdunn, I installed Falcon-integrate on 15 November, as follows:

git clone git://github.com/PacificBiosciences/FALCON-integrate.git
cd FALCON-integrate
git checkout master  # or whatever version you want
make init
source env.sh
make config-edit-user
make -j all
make test 

Then I installed FALCON_unzip the same day, as follows:

git clone https://github.com/PacificBiosciences/FALCON_unzip.git
export PATH=MY_PYTHONPATH:$PATH
python setup.py install

So I think I have the latest versions, don't I? Before launching falcon-unzip, I sourced the falcon-integrate environment and my Python environment so that all the required tools are visible. I don't know whether that is the right way to do it.

pb-cdunn commented 7 years ago

"Falcon-unzip failed at the 1-hasm step because it looked for the '../../2-asm-falcon/las.fofn' file and did not find it."

That moved. Your versions are not consistent.

Try FALCON-integrate/1.8.4, which includes FALCON_unzip as a submodule. You will have to install FALCON_unzip separately (for now), but at least the code is consistent.

Or simply update FALCON_unzip.