PacificBiosciences / FALCON_unzip

Making diploid assembly becomes common practice for genomic study
BSD 3-Clause Clear License

Unable to run test data #151

Closed Tcvalenzuela closed 4 years ago

Tcvalenzuela commented 5 years ago

Hey everyone, and thanks for helping me. I'm trying to test my Falcon installation so that I can assemble my genomes locally. I installed Falcon through "conda install pb-assembly" in a new environment. However, I'm having several problems running the test data "F1_bull_test_data". My config file looks like this:

[job.defaults]
job_type = local
njobs = 20
[General]
input_type = raw
input_fofn = input.fofn
genome_size = 20000000
seed_coverage = 40
length_cutoff = -1
length_cutoff_pr = 12000
pa_HPCTANmask_option = -k18 -h480 -w8 -e.8 -s100
pa_HPCREPmask_option = -k18 -h480 -w8 -e.8 -s100
pa_DBsplit_option = -x500 -s200
ovlp_DBsplit_option = -s400
pa_daligner_option = -k18 -h480 -w8 -e.8 -l40 -s100
ovlp_daligner_option = -k18 -h480 -w8 -e.8 -l40 -s100

And this is what I'm getting at the end of the stderr file:

HPC.daligner -P. -k18 -h480 -w8 -e.8 -l40 -s100 -v -D24 -mtan -mrep0 -mdust -H$CUTOFF -fdaligner-jobs raw_reads
+ HPC.daligner -P. -k18 -h480 -w8 -e.8 -l40 -s100 -v -D24 -mtan -mrep0 -mdust -H17886 -fdaligner-jobs raw_reads
HPC.daligner: -D is an illegal option
[WARNING]Call 'bash -vex split_db.sh' returned 256.
Traceback (most recent call last):
  File "/srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/site-packages/falcon_kit/mains/dazzler.py", line 1532, in <module>
    main()
  File "/srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/site-packages/falcon_kit/mains/dazzler.py", line 1528, in main
    args.func(args)
  File "/srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/site-packages/falcon_kit/mains/dazzler.py", line 1084, in cmd_daligner_split
    args.split_fn, args.bash_template_fn,
  File "/srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/site-packages/falcon_kit/mains/dazzler.py", line 708, in daligner_split
    io.syscall('bash -vex {}'.format(script_fn))
  File "/srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/site-packages/pypeflow/io.py", line 27, in syscall
    raise Exception(msg)
Exception: Call 'bash -vex split_db.sh' returned 256.
2019-09-23 16:57:19,473 - root - WARNING - Call '/bin/bash user_script.sh' returned 256.
2019-09-23 16:57:19,473 - root - INFO - CD: '/srv/public/users/tomas/F1_bull_test_data_5/0-rawreads/daligner-split' -> '/srv/public/users/tomas/F1_bull_test_data_5/0-rawreads/daligner-split'
2019-09-23 16:57:19,473 - root - INFO - CD: '/srv/public/users/tomas/F1_bull_test_data_5/0-rawreads/daligner-split' -> '/srv/public/users/tomas/F1_bull_test_data_5/0-rawreads/daligner-split'
2019-09-23 16:57:19,474 - root - CRITICAL - Error in /srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/site-packages/pypeflow/do_task.py with args="{'json_fn': '/srv/public/users/tomas/F1_bull_test_data_5/0-rawreads/daligner-split/task.json',\n 'timeout': 30,\n 'tmpdir': None}"
Traceback (most recent call last):
  File "/srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/site-packages/pypeflow/do_task.py", line 280, in <module>
    main()
  File "/srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/site-packages/pypeflow/do_task.py", line 272, in main
    run(**vars(parsed_args))
  File "/srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/site-packages/pypeflow/do_task.py", line 266, in run
    run_cfg_in_tmpdir(cfg, tmpdir, '.')
  File "/srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/site-packages/pypeflow/do_task.py", line 241, in run_cfg_in_tmpdir
    run_bash(bash_template, myinputs, myoutputs, parameters)
  File "/srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/site-packages/pypeflow/do_task.py", line 200, in run_bash
    util.system(cmd)
  File "/srv/public/users/tomas/program/Conda/envs/Falcon_19_please/lib/python3.7/site-packages/pypeflow/io.py", line 27, in syscall
    raise Exception(msg)
Exception: Call '/bin/bash user_script.sh' returned 256.
+++ pwd
++ echo 'FAILURE. Running top in /srv/public/users/tomas/F1_bull_test_data_5/0-rawreads/daligner-split (If you see -terminal database is inaccessible- you are using the python bin-wrapper, so you will not get diagnostic info. No big deal. This process is crashing anyway.)'
++ rm -f top.txt
++ which python
++ which top
++ env -u LD_LIBRARY_PATH top -b -n 1
++ env -u LD_LIBRARY_PATH top -b -n 1
++ pstree -apl
task.sh: line 10: pstree: command not found

real    0m0.598s
user    0m0.376s
sys 0m0.141s
Namespace(command=['/bin/bash', 'run.sh'], directory='/srv/public/users/tomas/F1_bull_test_data_5/0-rawreads/daligner-split', exit_file='/srv/public/users/tomas/F1_bull_test_data_5/mypwatcher/exits/exit-P03123502665eef', heartbeat_file='/srv/public/users/tomas/F1_bull_test_data_5/mypwatcher/heartbeats/heartbeat-P03123502665eef', rate=10.0)

cwd:'/srv/public/users/tomas/F1_bull_test_data_5/0-rawreads/daligner-split'
hostname=yangtze.imp.fu-berlin.de
heartbeat_fn='/srv/public/users/tomas/F1_bull_test_data_5/mypwatcher/heartbeats/heartbeat-P03123502665eef'
exit_fn='/srv/public/users/tomas/F1_bull_test_data_5/mypwatcher/exits/exit-P03123502665eef'
sleep_s=10.0
before setpgid: pid=17666 pgid=17530
 after setpgid: pid=17666 pgid=17666
In cwd: /srv/public/users/tomas/F1_bull_test_data_5/0-rawreads/daligner-split, Blocking call: '/bin/bash run.sh'
 returned: 256

I think the problem is that one of the Falcon scripts is calling "pstree -apl", but there is no -a option.

Is there any way to avoid this error? Is there something I'm not seeing? Please help.

Thanks

pb-cdunn commented 5 years ago

Oops. You are getting the default values of ovlp/pa_HPCdaligner_option. Gene removed -D from HPC.daligner, and we forgot to remove that from the defaults.

You can fix this by setting:

[General]
pa_HPCdaligner_option = -v
ovlp_HPCdaligner_option = -v -l500

You might also want to add something like -B128 to both of those. We always set -B there, so we have never run into this. But we will remove -D from the defaults in the next release. Thanks for the report.
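
For anyone who wants to check a config before launching a run, here is a minimal sketch of a pre-flight check. It only looks for the two options named above in the [General] section; the function name `missing_hpcdaligner_options` is hypothetical and not part of FALCON, and the check matches option names case-insensitively (Python's configparser default), which is an assumption about how FALCON reads them.

```python
import configparser

# The two options this thread says fall back to stale defaults (with -D) when unset.
REQUIRED = ("pa_HPCdaligner_option", "ovlp_HPCdaligner_option")


def missing_hpcdaligner_options(cfg_text):
    """Return the REQUIRED option names that [General] does not set.

    Hypothetical helper for illustration; it is not part of FALCON.
    """
    # interpolation=None so a literal '%' in an option value is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section("General"):
        return list(REQUIRED)
    # has_option() normalizes the name the same way the parser did (lowercase by default).
    return [name for name in REQUIRED if not parser.has_option("General", name)]


# A config like the one in the report: no HPCdaligner options set.
broken = """
[General]
input_type = raw
pa_daligner_option = -k18 -h480 -w8 -e.8 -l40 -s100
"""

# The same config with the two suggested lines appended to [General].
fixed = broken + "pa_HPCdaligner_option = -v\novlp_HPCdaligner_option = -v -l500\n"

print(missing_hpcdaligner_options(broken))
print(missing_hpcdaligner_options(fixed))
```

The first call should list both option names and the second should print an empty list. The check only reports whether the options are present; it does not validate the flags inside them.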