PacificBiosciences / FALCON_unzip

Making diploid assembly becomes common practice for genomic study
BSD 3-Clause Clear License

getting errors in run_quiver.py step in falcon_unzip #61

Closed bostanict closed 7 years ago

bostanict commented 7 years ago

I ran falcon_unzip:

#!/bin/sh

fc_unzip.py fc_unzip.cfg; fc_quiver.py fc_unzip.cfg;

fc_unzip.py fc_unzip.cfg went OK and I got all the results in the 3-unzip folder. When running the second part (fc_quiver.py fc_unzip.cfg), it creates the 4-quiver folder and the folders inside it, and starts running the .sh files. After some of them are done, it runs into errors. Our scheduler is SLURM, and I specified it in the config file with the same parameters I used for FALCON, which worked there.
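
For reference, the scheduler-related part of my fc_unzip.cfg looks roughly like the sketch below (the sge_*-style option names follow the public FALCON_unzip wiki example and may differ between versions; the partition name, resource strings, and paths here are placeholders rather than my exact values):

    [General]
    job_type = SLURM

    [Unzip]
    input_fofn = input.fofn
    input_bam_fofn = input_bam.fofn
    smrt_bin = /path/to/smrtcmds/bin/

    jobqueue = your_partition
    sge_phasing = --ntasks 1 --cpus-per-task 12 -p %(jobqueue)s
    sge_quiver = --ntasks 1 --cpus-per-task 12 -p %(jobqueue)s
    sge_track_reads = --ntasks 1 --cpus-per-task 12 -p %(jobqueue)s
    sge_blasr_aln = --ntasks 1 --cpus-per-task 16 -p %(jobqueue)s
    sge_hasm = --ntasks 1 --cpus-per-task 24 -p %(jobqueue)s
    unzip_concurrent_jobs = 20
    quiver_concurrent_jobs = 20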

here is the error:

Submitted batch job 4189714
Submitted batch job 4189715
Submitted batch job 4189716
[INFO]Queued 'task://localhost/q_000051F' ...
[INFO]Queued 'task://localhost/q_000166F' ...
[INFO]Queued 'task://localhost/q_000215F_001' ...
[INFO]Queued 'task://localhost/q_000005F_001' ...
[INFO]Queued 'task://localhost/q_000157F' ...
[INFO]Queued 'task://localhost/q_000049F_002' ...
[INFO]Queued 'task://localhost/q_000015F' ...
[INFO]Queued 'task://localhost/q_000044F' ...
[INFO]Queued 'task://localhost/q_000022F' ...
[INFO]Queued 'task://localhost/q_000001F_002' ...
[INFO]Queued 'task://localhost/q_000237F_001' ...
[INFO]Queued 'task://localhost/q_000053F_001' ...
[INFO]Queued 'task://localhost/q_000098F_001' ...
Submitted batch job 4189718
[INFO](SLURM) '/home3/hbostan/falcon-test/blueberry-auto/./4-quiver/000053F_001/cns_000053F_001.sh'
[INFO]tick: 2, #updatedTasks: 64, sleep_time=0.000000
Submitted batch job 4189719
[INFO]tick: 4, #updatedTasks: 64, sleep_time=0.200000
[INFO]tick: 8, #updatedTasks: 64, sleep_time=0.600000
[INFO]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000189F_001/000189F_001_quiver_done.exit' found.
[WARNING]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000189F_001/000189F_001_quiver_done' is missing. job: 'cns_000189F_001.sh-q_000189F_001-q_000189F_001' failed!
[INFO]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000024F/000024F_quiver_done.exit' found.
[WARNING]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000024F/000024F_quiver_done' is missing. job: 'cns_000024F.sh-q_000024F-q_000024F' failed!
[INFO]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000065F_001/000065F_001_quiver_done.exit' found.
[WARNING]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000065F_001/000065F_001_quiver_done' is missing. job: 'cns_000065F_001.sh-q_000065F_001-q_000065F_001' failed!
[INFO]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000162F_001/000162F_001_quiver_done.exit' found.
[WARNING]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000162F_001/000162F_001_quiver_done' is missing. job: 'cns_000162F_001.sh-q_000162F_001-q_000162F_001' failed!
[INFO]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000073F_002/000073F_002_quiver_done.exit' found.
[WARNING]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000073F_002/000073F_002_quiver_done' is missing. job: 'cns_000073F_002.sh-q_000073F_002-q_000073F_002' failed!
[INFO]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000187F/000187F_quiver_done.exit' found.
[WARNING]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000187F/000187F_quiver_done' is missing. job: 'cns_000187F.sh-q_000187F-q_000187F' failed!
[INFO]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000000F_003/000000F_003_quiver_done.exit' found.
[WARNING]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000000F_003/000000F_003_quiver_done' is missing. job: 'cns_000000F_003.sh-q_000000F_003-q_000000F_003' failed!
[INFO]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000083F/000083F_quiver_done.exit' found.
[WARNING]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000083F/000083F_quiver_done' is missing. job: 'cns_000083F.sh-q_000083F-q_000083F' failed!
[INFO]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000161F/000161F_quiver_done.exit' found.
[WARNING]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000161F/000161F_quiver_done' is missing. job: 'cns_000161F.sh-q_000161F-q_000161F' failed!
[INFO]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000210F_001/000210F_001_quiver_done.exit' found.
[WARNING]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000210F_001/000210F_001_quiver_done' is missing. job: 'cns_000210F_001.sh-q_000210F_001-q_000210F_001' failed!
[INFO]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000151F_001/000151F_001_quiver_done.exit' found.
[WARNING]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000151F_001/000151F_001_quiver_done' is missing. job: 'cns_000151F_001.sh-q_000151F_001-q_000151F_001' failed!
[INFO]Failure ('fail'). Joining 'task://localhost/q_000189F_001'...
[INFO]Failure ('fail'). Joining 'task://localhost/q_000024F'...
[INFO]Failure ('fail'). Joining 'task://localhost/q_000065F_001'...
[INFO]Failure ('fail'). Joining 'task://localhost/q_000162F_001'...
[INFO]Failure ('fail'). Joining 'task://localhost/q_000073F_002'...
[INFO]Failure ('fail'). Joining 'task://localhost/q_000187F'...
[INFO]Failure ('fail'). Joining 'task://localhost/q_000000F_003'...
[INFO]Failure ('fail'). Joining 'task://localhost/q_000083F'...
[INFO]Failure ('fail'). Joining 'task://localhost/q_000161F'...
[INFO]Failure ('fail'). Joining 'task://localhost/q_000210F_001'...
[INFO]Failure ('fail'). Joining 'task://localhost/q_000151F_001'...
[INFO]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000077F_001/000077F_001_quiver_done.exit' found.
[WARNING]'/home3/hbostan/falcon-test/blueberry-auto/4-quiver/000077F_001/000077F_001_quiver_done' is missing. job: 'cns_000077F_001.sh-q_000077F_001-q_000077F_001' failed!
[CRITICAL]Any exception caught in RefreshTargets() indicates an unrecoverable error. Shutting down...
/usr/local/bin/fc_env/lib/python2.7/site-packages/pypeflow-0.1.1-py2.7.egg/pypeflow/controller.py:537: UserWarning:
            "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
            "! Please wait for all threads / processes to terminate !"
            "! Also, maybe use 'ps' or 'qstat' to check all threads,!"
            "! processes and/or jobs are terminated cleanly.        !"
            "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"

I also re-ran the fc_unzip.sh script, but the same errors occurred.

pb-jchin commented 7 years ago

Please submit a ticket to the PacBio tech support team. See also https://github.com/PacificBiosciences/FALCON_unzip/wiki

bostanict commented 7 years ago

Hi Jason,

So does this mean that we will no longer be able to ask questions via GitHub and must submit a ticket from now on, or is this a specific case?

Thanks in advance for any hints!

pb-jchin commented 7 years ago

This question needs more detailed investigation, and it is not a development question. Also, it will be useful for our product management/support team to know that you are solving an important problem that PacBio, as a company, can help with. (If you know my email, send me one and I will forward it to some people.)

bostanict commented 7 years ago

Thanks Jason,

Your help, and that of other people at PacBio, has always been prompt and effective. We have come this far because of your technical and scientific support.

I did not know this was not a development issue, since I was following the instructions and saw that the pipeline did not proceed to the end. I thought there might be something that needed to be tweaked in the code.

I will open a ticket as you advised and see how it goes. Best

pb-jchin commented 7 years ago

Well, I would like to thank you for your support. GitHub issues are meant for discussion about the code and possible design choices from development contributors. When a run fails, there are many possible reasons outside the scope of the code, e.g., the environment and the data itself. Unless those are documented as well, it is hard to know what happened. I will advocate for a better way to help besides GitHub.

pb-cdunn commented 7 years ago

You'd have to look into the directories of those failed tasks to find stderr. (Maybe under pwatcher.dir/stderr.) Could be something simple. I've pushed some recent fixes too, in FALCON_unzip and in pypeFLOW.
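
For example (just a sketch; the watcher directory name depends on your pypeFLOW version, and the run directory is taken from your log above), something like this might surface the per-task exit sentinels and any captured stderr:

    cd /home3/hbostan/falcon-test/blueberry-auto
    # print the contents of every quiver exit sentinel that was written
    for f in 4-quiver/*/*_quiver_done.exit; do
        echo "== $f =="
        cat "$f"
    done
    # look for a process-watcher directory holding per-job stderr, if one exists
    find . -maxdepth 2 -type d \( -name 'mypwatcher' -o -name 'pwatcher.dir' \)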

bostanict commented 7 years ago

Hi @pb-cdunn ,

Thanks a lot for the reply. I have also already submitted a ticket to technical support.

(Maybe under pwatcher.dir/stderr.)

Before I ran the unzip, I removed the pwatcher directory as advised in another post. After re-running the unzip, this directory is not created anymore...

find stderr

In the folders in 4-quiver, either the job is done and there is a result, or there is the sh_done_exit file, or there is only the fasta file. I do not see any stderr or stdout files...

Thanks a lot.

pb-cdunn commented 7 years ago

How up-to-date is your FALCON/FALCON_unzip/pypeFLOW?

What are the contents of the "exit" file for that job in mypwatcher/? It should tell you the actual exit-code. Python might multiply by 256, but otherwise they should agree. If they don't, then the problem might be filesystem latency for the exit file, which we would have to address somehow.
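
As a rough illustration of that factor of 256 (a generic POSIX detail, not specific to FALCON): a raw wait() status stores the exit code in the high byte, so a recorded status of 256 corresponds to an exit code of 1. For example:

    sh -c 'exit 1'
    echo "exit code seen by the shell: $?"   # prints 1; recorded as a raw wait() status this would be 256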