PacificBiosciences / FALCON_unzip

Diploid assembly is becoming common practice in genomic studies
BSD 3-Clause Clear License

falcon_unzip error using ./unzip.sh #62

Closed shilpagarg closed 7 years ago

shilpagarg commented 7 years ago

Hi, I successfully ran Falcon and am now interested in running the Falcon_unzip step. After running ./unzip.sh, I get the following error:

2017-01-06 11:20:07,285 - root - DEBUG - Running "/local/data/shilpa/FALCON-integrate/pypeFLOW/pypeflow/do_task.py /local/data/shilpa/FALCON-integrate/2-asm-falcon/read_maps/dump_pread_ids/task.json"
2017-01-06 11:20:07,287 - root - DEBUG - Checking existence of '/local/data/shilpa/FALCON-integrate/2-asm-falcon/read_maps/dump_pread_ids/task.json' with timeout=60
2017-01-06 11:20:07,287 - root - DEBUG - Loading JSON from '/local/data/shilpa/FALCON-integrate/2-asm-falcon/read_maps/dump_pread_ids/task.json'
2017-01-06 11:20:07,288 - root - DEBUG - {u'inputs': {u'pread_db': u'/local/data/shilpa/FALCON-integrate/1-preads_ovl/preads.db'},
 u'outputs': {u'pread_id_file': u'pread_ids'},
 u'parameters': {},
 u'python_function': u'falcon_kit.pype_tasks.task_dump_pread_ids'}
2017-01-06 11:20:07,288 - root - DEBUG - Checking existence of u'/local/data/shilpa/FALCON-integrate/1-preads_ovl/preads.db' with timeout=60
2017-01-06 11:21:07,355 - root - CRITICAL - Error in /local/data/shilpa/FALCON-integrate/pypeFLOW/pypeflow/do_task.py with args="{'json_fn': '/local/data/shilpa/FALCON-integrate/2-asm-falcon/read_maps/dump_pread_ids/task.json',\n 'timeout': 60,\n 'tmpdir': None}"
Traceback (most recent call last):
  File "/home/sgarg/new_anaconda/anaconda3/envs/py27/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/sgarg/new_anaconda/anaconda3/envs/py27/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/local/data/shilpa/FALCON-integrate/pypeFLOW/pypeflow/do_task.py", line 186, in <module>
    main()
  File "/local/data/shilpa/FALCON-integrate/pypeFLOW/pypeflow/do_task.py", line 178, in main
    run(**vars(parsed_args))
  File "/local/data/shilpa/FALCON-integrate/pypeFLOW/pypeflow/do_task.py", line 135, in run
    wait_for(fn)
  File "/local/data/shilpa/FALCON-integrate/pypeFLOW/pypeflow/do_task.py", line 92, in wait_for
    raise Exception('Timed out waiting for {!r}'.format(fn))
Exception: Timed out waiting for u'/local/data/shilpa/FALCON-integrate/1-preads_ovl/preads.db'

real    1m0.352s
user    0m0.087s
sys     0m0.050s
 returned: 256
pb-cdunn commented 7 years ago

Does that preads.db file exist? Try re-running the bash script in /local/data/shilpa/FALCON-integrate/2-asm-falcon/read_maps/dump_pread_ids/ manually.

If yes, and if the manual run works, then you are having file-system latency problems. System issues are difficult to resolve remotely, but try setting use_tmpdir=/tmp in your .cfg file.
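One way to apply that setting (a minimal sketch; the filename fc_unzip.cfg is an assumption, use whichever .cfg drives your run):

```shell
# Append the use_tmpdir option from the comment above to the config.
# fc_unzip.cfg is a placeholder name; the file is created if it does not exist.
printf 'use_tmpdir = /tmp\n' >> fc_unzip.cfg
grep 'use_tmpdir' fc_unzip.cfg
```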

shilpagarg commented 7 years ago

'preads.db' does not exist. Moreover, I don't have read_maps in 2-asm-falcon. I performed the following steps:

cd FALCON-integrate
git checkout 1.8.5
git submodule update --init
make config-edit-user

and then:

cd FALCON-integrate/FALCON_unzip/
pip uninstall -v .
pip install --edit .

There is no 2-asm-falcon in FALCON_unzip/examples, so I copied it from FALCON-examples. How can I fix this issue?

pb-cdunn commented 7 years ago

I think that was solved already. Maybe you have an old version of FALCON or FALCON_unzip.

shilpagarg commented 7 years ago

Now I have the latest versions of Falcon and Falcon_unzip, and I ran the following commands:

export GIT_SYM_CACHE_DIR=~/.git-sym-cache  # to speed things up
git clone git://github.com/PacificBiosciences/FALCON-integrate.git
cd FALCON-integrate
git checkout master  # or whatever version you want
make init
source env.sh
make config-edit-user
make -j all
make test  # to run a simple one
git clone https://github.com/PacificBiosciences/FALCON_unzip --recursive
cd Falcon_unzip
python3 setup.py install

I copied 2-asm-falcon from FALCON-examples to FALCON_unzip/examples/, and then ran ./unzip.py from FALCON_unzip/examples/.

and I get the following errors:

from: can't read /var/mail/falcon_unzip.unzip
../src/py_scripts/fc_unzip.py: line 5: syntax error near unexpected token `sys.argv'
../src/py_scripts/fc_unzip.py: line 5: `main(sys.argv)'
from: can't read /var/mail/falcon_unzip.run_quiver
import: unable to grab mouse `': Resource temporarily unavailable @ error/xwindow.c/XSelectWindow/9199.
../src/py_scripts/fc_quiver.py: line 4: syntax error near unexpected token `sys.argv'
../src/py_scripts/fc_quiver.py: line 4: `main(sys.argv)'

How do I fix this weird error? Thanks.

input.fofn contains FALCON-integrate/FALCON-examples/run/synth0/data/synth5k/subreads.dexta

and unzip.sh contains:

../src/py_scripts/fc_unzip.py fc_unzip.cfg
../src/py_scripts/fc_quiver.py fc_unzip.cfg

pb-cdunn commented 7 years ago

But the real problem is that we don't support Python 3, which I can see in what you posted:

git clone https://github.com/PacificBiosciences/FALCON_unzip --recursive; cd Falcon_unzip; python3 setup.py install;

shilpagarg commented 7 years ago

Yes, I added #!/usr/bin/python to fc_quiver.py and fc_unzip.py, and input.fofn contains /local/data/shilpa/FALCON-integrate/FALCON-examples/run/synth0/data/synth5k/subreads.dexta

Then I reinstalled using python setup.py install and ran ./examples/unzip.sh from FALCON_unzip/:

[INFO]Setup logging from file "None".
[WARNING]In simple_pwatcher_bridge, pwatcher_impl=<module 'pwatcher.fs_based' from '/local/data/shilpa/FALCON-integrate/pypeFLOW/pwatcher/fs_based.pyc'>
[INFO]In simple_pwatcher_bridge, pwatcher_impl=<module 'pwatcher.fs_based' from '/local/data/shilpa/FALCON-integrate/pypeFLOW/pwatcher/fs_based.pyc'>
[INFO]job_type='SGE', job_queue='default', sge_option=None, use_tmpdir=None, squash=False
[INFO]Num unsatisfied: 1, graph: 1
[INFO]About to submit: Node(3-unzip/reads)
[INFO]starting job Job(jobid='Pac179e0cb9fc33', cmd='/bin/bash run.sh', rundir='/local/data/shilpa/FALCON-integrate/FALCON_unzip/3-unzip/reads', options={'job_queue': 'default', 'sge_option': ' -pe smp 12 -q bigmem', 'job_type': 'SGE'})
[INFO]!qsub -N Pac179e0cb9fc33 -q default -pe smp 12 -V -cwd -o stdout -e stderr -S /bin/bash /local/data/shilpa/FALCON-integrate/FALCON_unzip/mypwatcher/wrappers/run-Pac179e0cb9fc33.bash
sh: 1: qsub: not found
[ERROR]In pwatcher.fs_based.cmd_run(), failed to submit background-job: MetaJobSge(MetaJob(job=Job(jobid='Pac179e0cb9fc33', cmd='/bin/bash run.sh', rundir='/local/data/shilpa/FALCON-integrate/FALCON_unzip/3-unzip/reads', options={'job_queue': 'default', 'sge_option': ' -pe smp 12 -q bigmem', 'job_type': 'SGE'}), lang_exe='/bin/bash'))
Traceback (most recent call last):
  File "/local/data/shilpa/FALCON-integrate/pypeFLOW/pwatcher/fs_based.py", line 530, in cmd_run
    state.submit_background(bjob)
  File "/local/data/shilpa/FALCON-integrate/pypeFLOW/pwatcher/fs_based.py", line 117, in submit_background
    bjob.submit(self, exe, script_fn) # Can raise
  File "/local/data/shilpa/FALCON-integrate/pypeFLOW/pwatcher/fs_based.py", line 298, in submit
    system(sge_cmd, checked=True) # TODO: Capture q-jobid
  File "/local/data/shilpa/FALCON-integrate/pypeFLOW/pwatcher/fs_based.py", line 549, in system
    raise Exception('{} <- {!r}'.format(rc, call))
Exception: 32512 <- 'qsub -N Pac179e0cb9fc33 -q default -pe smp 12 -V -cwd -o stdout -e stderr -S /bin/bash /local/data/shilpa/FALCON-integrate/FALCON_unzip/mypwatcher/wrappers/run-Pac179e0cb9fc33.bash'
[ERROR]Failed to enqueue 1 of 1 jobs: set([Node(3-unzip/reads)])
[WARNING]Nothing is happening, and we had 0 failures. Should we quit? Instead, we will just sleep.
[INFO]sleep 0.1s
Traceback (most recent call last):
  File "/home/sgarg/new_anaconda/anaconda3/envs/py27/bin/fc_unzip.py", line 4, in <module>
    __import__('pkg_resources').run_script('falcon-unzip==0.4.0', 'fc_unzip.py')
  File "/local/data/shilpa/FALCON-integrate/fc_env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 739, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/local/data/shilpa/FALCON-integrate/fc_env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1494, in run_script
    exec(code, namespace, namespace)
  File "/home/sgarg/new_anaconda/anaconda3/envs/py27/lib/python2.7/site-packages/falcon_unzip-0.4.0-py2.7.egg/EGG-INFO/scripts/fc_unzip.py", line 4, in <module>
    main(sys.argv)
  File "/home/sgarg/new_anaconda/anaconda3/envs/py27/lib/python2.7/site-packages/falcon_unzip-0.4.0-py2.7.egg/falcon_unzip/unzip.py", line 384, in main
    unzip_all(config)
  File "/home/sgarg/new_anaconda/anaconda3/envs/py27/lib/python2.7/site-packages/falcon_unzip-0.4.0-py2.7.egg/falcon_unzip/unzip.py", line 222, in unzip_all
    with open('./3-unzip/reads/ctg_list') as f:
IOError: [Errno 2] No such file or directory: './3-unzip/reads/ctg_list'

[INFO]Setup logging from file "None".
[INFO]config={'input_bam_fofn': '/local/data/shilpa/FALCON-integrate/FALCON_unzip/input_bam.fofn', 'job_queue': 'default', 'job_type': 'SGE', 'pwatcher_type': 'fs_based', 'sge_quiver': ' -pe smp 24 -q bigmem ', 'sge_track_reads': ' -pe smp 12 -q bigmem', 'smrt_bin': '/mnt/secondary/builds/full/3.0.0/prod/smrtanalysis_3.0.0.153854/smrtcmds/bin/'}
[WARNING]In simple_pwatcher_bridge, pwatcher_impl=<module 'pwatcher.fs_based' from '/local/data/shilpa/FALCON-integrate/pypeFLOW/pwatcher/fs_based.pyc'>
[INFO]In simple_pwatcher_bridge, pwatcher_impl=<module 'pwatcher.fs_based' from '/local/data/shilpa/FALCON-integrate/pypeFLOW/pwatcher/fs_based.pyc'>
[INFO]job_type='SGE', job_queue='default', sge_option=None, use_tmpdir=None, squash=False
[INFO]Num unsatisfied: 2, graph: 2
[INFO]About to submit: Node(4-quiver/reads)
[INFO]starting job Job(jobid='P1f596022dfbb93', cmd='/bin/bash run.sh', rundir='/local/data/shilpa/FALCON-integrate/FALCON_unzip/4-quiver/reads', options={'job_queue': 'default', 'sge_option': ' -pe smp 12 -q bigmem', 'job_type': 'SGE'})
[INFO]!qsub -N P1f596022dfbb93 -q default -pe smp 12 -V -cwd -o stdout -e stderr -S /bin/bash /local/data/shilpa/FALCON-integrate/FALCON_unzip/mypwatcher/wrappers/run-P1f596022dfbb93.bash
sh: 1: qsub: not found
[ERROR]In pwatcher.fs_based.cmd_run(), failed to submit background-job: MetaJobSge(MetaJob(job=Job(jobid='P1f596022dfbb93', cmd='/bin/bash run.sh', rundir='/local/data/shilpa/FALCON-integrate/FALCON_unzip/4-quiver/reads', options={'job_queue': 'default', 'sge_option': ' -pe smp 12 -q bigmem', 'job_type': 'SGE'}), lang_exe='/bin/bash'))
Traceback (most recent call last):
  File "/local/data/shilpa/FALCON-integrate/pypeFLOW/pwatcher/fs_based.py", line 530, in cmd_run
    state.submit_background(bjob)
  File "/local/data/shilpa/FALCON-integrate/pypeFLOW/pwatcher/fs_based.py", line 117, in submit_background
    bjob.submit(self, exe, script_fn) # Can raise
  File "/local/data/shilpa/FALCON-integrate/pypeFLOW/pwatcher/fs_based.py", line 298, in submit
    system(sge_cmd, checked=True) # TODO: Capture q-jobid
  File "/local/data/shilpa/FALCON-integrate/pypeFLOW/pwatcher/fs_based.py", line 549, in system
    raise Exception('{} <- {!r}'.format(rc, call))
Exception: 32512 <- 'qsub -N P1f596022dfbb93 -q default -pe smp 12 -V -cwd -o stdout -e stderr -S /bin/bash /local/data/shilpa/FALCON-integrate/FALCON_unzip/mypwatcher/wrappers/run-P1f596022dfbb93.bash'
[ERROR]Failed to enqueue 1 of 1 jobs: set([Node(4-quiver/reads)])
[WARNING]Nothing is happening, and we had 0 failures. Should we quit? Instead, we will just sleep.
[INFO]sleep 0.1s
Traceback (most recent call last):
  File "/home/sgarg/new_anaconda/anaconda3/envs/py27/bin/fc_quiver.py", line 4, in <module>
    __import__('pkg_resources').run_script('falcon-unzip==0.4.0', 'fc_quiver.py')
  File "/local/data/shilpa/FALCON-integrate/fc_env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 739, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/local/data/shilpa/FALCON-integrate/fc_env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1494, in run_script
    exec(code, namespace, namespace)
  File "/home/sgarg/new_anaconda/anaconda3/envs/py27/lib/python2.7/site-packages/falcon_unzip-0.4.0-py2.7.egg/EGG-INFO/scripts/fc_quiver.py", line 4, in <module>
    main(sys.argv)
  File "/home/sgarg/new_anaconda/anaconda3/envs/py27/lib/python2.7/site-packages/falcon_unzip-0.4.0-py2.7.egg/falcon_unzip/run_quiver.py", line 376, in main
    p_ctg_out, h_ctg_out, job_done_plfs = create_quiver_jobs(wf, scattered_quiver_plf)
  File "/home/sgarg/new_anaconda/anaconda3/envs/py27/lib/python2.7/site-packages/falcon_unzip-0.4.0-py2.7.egg/falcon_unzip/run_quiver.py", line 216, in create_quiver_jobs
    jobs = json.loads(open(scattered_quiver_fn).read())
IOError: [Errno 2] No such file or directory: '/local/data/shilpa/FALCON-integrate/FALCON_unzip/4-quiver/quiver_scatter/scattered.json'

Am I running from the wrong folder? Is there anything else I need to update?

pb-cdunn commented 7 years ago
[INFO]!qsub -N Pac179e0cb9fc33 -q default -pe smp 12 -V -cwd -o stdout -e stderr -S /bin/bash /local/data/shilpa/FALCON-integrate/FALCON_unzip/mypwatcher/wrappers/run-Pac179e0cb9fc33.bash
sh: 1: qsub: not found

You need to solve that yourself. Maybe you need to set job_type=lsf or whatever you are using, since you lack qsub.

To run locally, on a single machine, try:

job_type = string
job_queue = /bin/bash -c "${CMD}" > "${STDOUT_FILE}" 2> "${STDERR_FILE}"
pwatcher_type = blocking
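A sketch of applying those three settings (the filename fc_unzip.cfg is an assumption; use whichever .cfg drives your run):

```shell
# Append the local-run settings quoted above to the config file.
# fc_unzip.cfg is a placeholder name; the file is created if missing.
cat >> fc_unzip.cfg <<'EOF'
job_type = string
job_queue = /bin/bash -c "${CMD}" > "${STDOUT_FILE}" 2> "${STDERR_FILE}"
pwatcher_type = blocking
EOF
grep 'pwatcher_type' fc_unzip.cfg
```

With pwatcher_type = blocking, each task's command is run through the job_queue shell template on the local machine instead of being submitted to a scheduler.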
shilpagarg commented 7 years ago

Thanks. I am working with the synth0 example because it is publicly available. I am interested in running FALCON_unzip on it, but it looks like the reads are in .dexta format. Therefore, I get this in stderr:

[26701]finished run_tr_stage1('/local/data/shilpa/FALCON-integrate/FALCON_unzip/examples/1-preads_ovl/preads.db', '/local/data/shilpa/FALCON-integrate/FALCON_unzip/examples/1-preads_ovl/m_00001/preads.1.las', 2500, 40, dict(50 elem))
[26700]finished track_reads

mkdir -p 3-unzip/reads/

python -m falcon_kit.mains.fetch_reads

What do I need to update in Falcon_unzip so that it can handle .dexta files?

pb-cdunn commented 7 years ago

Right. I haven't yet updated unzip, as I did for falcon, to allow it to operate directly on .dexta files. But if you install FALCON-integrate, you will already have the program undexta, from the DEXTRACT repo (originally from thegenemyers, cloned into PacificBiosciences). For each .dexta file, do this:

undexta -kv foo.dexta

That will keep the .dexta file and produce the original .fasta. Then, alter the .fofn file to refer to foo.fasta instead of foo.dexta.
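A sketch of that conversion applied across a whole fofn: the fofn entry below is a placeholder so the example is self-contained, and the undexta call is skipped if the tool is not on your PATH.

```shell
# Placeholder fofn entry; in a real run, use your existing input.fofn.
printf '%s\n' 'data/synth5k/subreads.dexta' > input.fofn

# Expand each .dexta to .fasta, keeping the original (-k), if undexta exists.
while read -r dexta; do
    if command -v undexta >/dev/null 2>&1; then
        undexta -kv "$dexta"
    fi
done < input.fofn

# Point the fofn at the .fasta files instead of the .dexta files.
sed 's/\.dexta$/.fasta/' input.fofn > input.fofn.tmp && mv input.fofn.tmp input.fofn
cat input.fofn   # now lists data/synth5k/subreads.fasta
```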

shilpagarg commented 7 years ago

The falcon_unzip pipeline worked without any error using ./unzip.sh.

Input: 2-asm-falcon/sg_edges_list is non-empty.
input.fofn: FALCON-examples/run/synth0/data/synth5k/subreads.fasta
input_bam.fofn: FALCON-examples/run/synth0/data/synth5k/synth5k.bam

How can I get the phased reads and the two haplotypes? When I look into the 3-unzip folder, most of the files are empty, for example all_phased_reads.

Am I looking in the wrong folder? Why are the files empty?

shilpagarg commented 7 years ago

I am also interested in running Falcon_unzip on my own input dataset. Do I just need to change input.fofn and input_bam.fofn? Is there anything else I should be doing?
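For reference, a minimal sketch of building the two FOFN files for a new dataset (all paths below are placeholders, not from the thread):

```shell
# Placeholder data files so the sketch is self-contained; use your real reads.
mkdir -p mydata
touch mydata/movie1.subreads.fasta mydata/movie1.subreads.bam

# A fofn is just a list of absolute paths, one per line.
find "$PWD/mydata" -name '*.subreads.fasta' > input.fofn
find "$PWD/mydata" -name '*.subreads.bam' > input_bam.fofn
wc -l < input.fofn
```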

yilunhuangyue commented 7 years ago

Hi, have you solved the problem? I want to install falcon-unzip to assemble a genome, but I don't know where I can download smrt-analysis 3.0 or a newer version. The links in issue #11 seem useless. Could you tell me where I can download it? Thanks so much!

pb-jchin commented 7 years ago

@yilunhuangyue If there is an email address, PacBio's product management team could send you a link.

shilpagarg commented 7 years ago

No, I was not able to. Jason, could you please send me the link as well, at sgarg@mpi-inf.mpg.de? Thanks.


yilunhuangyue commented 7 years ago

@pb-jchin, yilunhuangyue@webmail.hzau.edu.cn. it is my email address, thanks a lot

sebastiangornik commented 7 years ago

@pb-jchin, I would appreciate a link to download smrt-analysis 3.x as well, since I want to run FALCON_unzip to optimize a diploid genome assembly from FALCON. I need to make those bam files from my raw reads!

e-mail is: sebastian.gornik@gmail.com

Thanks!

pb-cdunn commented 7 years ago

Same question as #68. Please post if you find a better answer! We do not control the business or marketing teams. Worst-case, you can build the tools yourself using https://github.com/PacificBiosciences/pitchfork/wiki

mhsieh commented 7 years ago

You can normally get a copy of our released software from our support / marketing team.