YourePrettyGood closed this issue 7 years ago
No, PypeTasks aren't really needed for that script. But it should still run. You have some kind of version mismatch. I see this:
python -m falcon_kit.mains.get_read_ctg_map
You report this:
FALCON_unzip: c936cb9
I do not believe that you are actually using that version.
26034b9f falcon_unzip/unzip.py (Christopher Dunn 2016-11-21 48) python -m falcon_kit.mains.get_read_ctg_map
That commit is included under yours:
```
* 1820d82 (HEAD, origin/master, origin/HEAD, master) Rm copyrighted script
* c936cb9 Drop FALCON submodule
* b04f104 Drop TaskBase, URL from PypeTask
* 8e4f06e Moved README into wiki, so we do not need to edit the code-tree
* 22c492e Merge branch 'simple' into master
|\
| * d2dd81a Update README.md
| * 1bb0010 Fix some task dependencies
| * a532bb4 unzip works now
| * aa0700b fix read_map dirs
| * 7a10666 Drop setNumThreadAllowed, old pypeflow use
| * e1d73e6 PEP-8 spacing
| * 60d88aa single quotes
| * 26034b9 simpler script def
...
```
So I think you have an integration problem, which is by far the most difficult thing for us to address remotely.
Note that FALCON-integrate/FALCON-make does not yet install FALCON_unzip. You still have to do that yourself. (I will address that this weekend.)
Sorry, that was a typo on my part. I did run `python -m falcon_kit.mains.get_read_ctg_map`, and it produced the AttributeError as stated above.
Here's the first commit from the output of `git log` in my FALCON_unzip directory (in which I had to run `python setup.py install --prefix=[path to fc_env subdirectory of FALCON-integrate folder]` after installing FALCON-integrate; I don't see any errors from `python setup.py install`):
```
$ git log
commit c936cb94a63cc763f9733d2f497ccc00a68ba02c
Author: Christopher Dunn <cdunn2001@gmail.com>
Date:   Sun Nov 27 12:19:30 2016 -0600

    Drop FALCON submodule
```
Here's the command and output generated by running that manually:
```
$ python -m falcon_kit.mains.get_read_ctg_map
WARNING:pypeflow.simple_pwatcher_bridge:In simple_pwatcher_bridge, pwatcher_impl=<module 'pwatcher.fs_based' from '/Genomics/grid3/users/preilly/bin/FALCON_0.7.0/FALCON-integrate/pypeFLOW/pwatcher/fs_based.pyc'>
ERROR:pypeflow.simple_pwatcher_bridge:Task Node(2-asm-falcon/read_maps/dump_rawread_ids) failed with exit-code=256
ERROR:pypeflow.simple_pwatcher_bridge:Some tasks are recently_done but not satisfied: set([Node(2-asm-falcon/read_maps/dump_rawread_ids)])
ERROR:pypeflow.simple_pwatcher_bridge:ready: set([]) submitted: set([Node(2-asm-falcon/read_maps/dump_pread_ids)])
Traceback (most recent call last):
  File "/usr/local/python/2.7.12/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/local/python/2.7.12/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Genomics/grid3/users/preilly/bin/FALCON_0.7.0/FALCON-integrate/FALCON/falcon_kit/mains/get_read_ctg_map.py", line 137, in <module>
    main()
  File "/Genomics/grid3/users/preilly/bin/FALCON_0.7.0/FALCON-integrate/FALCON/falcon_kit/mains/get_read_ctg_map.py", line 134, in main
    get_read_ctg_map(rawread_dir=rawread_dir, pread_dir=pread_dir, asm_dir=asm_dir)
  File "/Genomics/grid3/users/preilly/bin/FALCON_0.7.0/FALCON-integrate/FALCON/falcon_kit/mains/get_read_ctg_map.py", line 96, in get_read_ctg_map
    wf.refreshTargets()  # block
  File "/Genomics/grid3/users/preilly/bin/FALCON_0.7.0/FALCON-integrate/pypeFLOW/pypeflow/simple_pwatcher_bridge.py", line 210, in refreshTargets
    self._refreshTargets(updateFreq, exitOnFailure)
  File "/Genomics/grid3/users/preilly/bin/FALCON_0.7.0/FALCON-integrate/pypeFLOW/pypeflow/simple_pwatcher_bridge.py", line 277, in _refreshTargets
    raise Exception(msg)
Exception: Some tasks are recently_done but not satisfied: set([Node(2-asm-falcon/read_maps/dump_rawread_ids)])
```
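One detail worth decoding: the `exit-code=256` in that log is most likely a raw wait(2)-style status rather than the child's actual exit code (an assumption about how the fs_based pwatcher collects it). If so, the real exit code sits in the high byte:

```python
import os

# If 256 is a raw wait(2) status, the high byte holds the
# child's actual exit code; WEXITSTATUS extracts it.
status = 256
print(os.WEXITSTATUS(status))  # prints 1: the task script exited with 1
```

So the failing task script simply exited with status 1, consistent with the AttributeError killing it.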
It is possible that your system install is not being updated when you run `python setup.py install`. You might need `--force`. Or, if you are using a virtualenv, just delete it and re-install everything.
I recommend using `pip install -e .` instead. That uses "editable" mode, which means that only a kind of symbolic link (egg-info) is installed. Then, whenever you update pure-Python code, you do not necessarily need to re-install.
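Either way, a quick check for this kind of stale-install problem (a generic diagnostic, not FALCON-specific) is to print the `__file__` of the package Python actually imports, and confirm it points at your checkout rather than an old system copy:

```python
import importlib

# Print where Python really loads a package from, to see whether
# a stale system install is shadowing your working tree.
# 'json' is a stand-in here; substitute 'falcon_kit' in practice.
mod = importlib.import_module('json')
print(mod.__file__)
```

If the printed path is not the tree you just edited, the interpreter is picking up an older install on `sys.path`.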
I have reproduced this locally. I see the problem:
```
$ cat 2-asm-falcon/read_maps/dump_rawread_ids/task.json
{
    "inputs": {
        "rawread_db": "/home/UNIXHOME/cdunn/repo/localhost/unzip/iter/0-rawreads/raw_reads.db"
    },
    "outputs": {
        "rawread_id_file": "rawread_ids"
    },
    "parameters": {},
    "python_function": "__main__.dump_rawread_ids"
}
```
`__main__` needs to be an actual module. Working on it...
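The `python_function` field above is the crux: a worker that resolves that dotted string in a fresh process can never find anything under `__main__`. A minimal sketch of such a resolver (hypothetical, not pypeFLOW's actual code) shows why:

```python
import importlib

def resolve(dotted_name):
    """Resolve a 'pkg.mod.func' string to a callable, the way a
    task runner must treat the 'python_function' field in
    task.json. (The resolver itself is a sketch, not pypeFLOW's
    real implementation.)"""
    mod_name, func_name = dotted_name.rsplit('.', 1)
    return getattr(importlib.import_module(mod_name), func_name)

# A real dotted module path resolves in any process:
print(resolve('os.path.join')('a', 'b'))

# But '__main__.dump_rawread_ids' only resolves inside the one
# process whose entry script defined dump_rawread_ids. A worker
# spawned elsewhere imports *its own* __main__, where that name
# does not exist -- hence the AttributeError. That is why the
# function must live in a real, importable module.
```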
Awesome, thanks!
But that's not the only problem. I will push an update to FALCON-integrate in a few minutes...
1.8.5 fixes a couple of other things too. It passes Quiver for me now.
Hi! First off, thanks a ton for all your work on FALCON and FALCON_unzip. I've been using FALCON for about a year now, and the results are great!
**Note:** This may be related to FALCON issue #499: https://github.com/PacificBiosciences/FALCON/issues/499
I've run into a series of issues trying to run FALCON_unzip on one of my assemblies (genome size estimate of between 380 and 430 Mb, but the p_ctg.fa comes out near 580-590 Mb, so I suspect haplotigs are being kept together in p_ctg.fa).
The first error is a Python AttributeError: 'module' object has no attribute 'get_read_ctg_map'. This seems to trace back to the `python -m get_read_ctg_map` call in track_reads.sh, generated by `task_track_reads()` in unzip.py. I've also tried manually calling `python -m get_read_ctg_map`, with and without deleting `mypwatcher` and other relevant directories and files, basically resetting the FALCON_unzip run to zero. What did succeed from there was running the DBshow commands manually, then running the (slightly modified) contents of the `generate_read_to_ctg_map()` function.

After that, I'm able to run `python -m rr_ctg_track` and `python -m pr_ctg_track` successfully, but then hit another error running `python -m fetch_reads`, which complains that input.fofn does not exist as a file. I used a non-canonical input FOFN name and included it in my fc_unzip.cfg, but it doesn't seem to be used in `task_track_reads()` in unzip.py. (It should be easy to add it to the config[] dictionary in `main()` in unzip.py, assign it to a variable in `task_track_reads()`, and then pass it after a `--fofn` flag in the script definition for track_reads.sh.)

A third note (not an error, but a suggestion): maybe pypeFLOW already does this, but it would be good to avoid repeating get_read_ctg_map, rr_ctg_track, pr_ctg_track, and fetch_reads if their output files already exist. Since I had run those steps manually, as soon as I tried to re-run fc_unzip.py to get to alignment and phasing, FALCON_unzip redid all the steps I had run by hand, and failed again on get_read_ctg_map.
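That skip-if-done suggestion is essentially a make-style up-to-date check. A minimal sketch (generic, assuming file mtimes are trustworthy; this is not pypeFLOW's actual "satisfied" logic):

```python
import os

def needs_run(inputs, outputs):
    """Make-style up-to-date check: re-run a step only if some
    output is missing, or the newest input is newer than the
    oldest output. (A sketch of the suggested behavior, not
    FALCON_unzip's real scheduling code.)"""
    if not all(os.path.exists(o) for o in outputs):
        return True
    oldest_out = min(os.path.getmtime(o) for o in outputs)
    newest_in = max((os.path.getmtime(i) for i in inputs), default=0.0)
    return newest_in > oldest_out
```

Guarding each of the four steps with a check like this would let a re-run of fc_unzip.py pick up after manually completed stages instead of redoing them.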
Version info: I tried first with FALCON_unzip c936cb9 on the p_ctg.fa generated by FALCON 0.4.0 (don't remember which commit), and then installed the newest via FALCON-integrate a few days ago and re-ran to see what might be happening. Newest versions used:
FALCON-integrate: 3b36fd91d36149ec000ab35fbd1e8b7646f8e95c
FALCON: 87b6262607a885979727592c5aa0dad459085f33
FALCON-make: 02ed9dedb0d5bc7bec75bfad8aa0a92629ad0099
FALCON_unzip: c936cb94a63cc763f9733d2f497ccc00a68ba02c
pypeFLOW: bdd41dbc5ac4e4cf1dec1fa57e52850b399d85aa
Install was via PYTHONUSERBASE, I'm using SLURM, and I unset PYTHONPATH and source env.sh prior to running fc_unzip.py each time. I've tried with Python 2.7.3 and 2.7.12 to see if that makes a difference (maybe they fiddled with the handling of the -m flag?).
Potential source for the first error: Comparing the scripts called with `python -m`, it looks like get_read_ctg_map.py is the only one that uses any makePypeLocalFile() calls and runs functions via a PypeTask. Is this even necessary? get_read_ctg_map.py doesn't seem terribly computationally intensive compared to the others, so the extra Pype tasks look like unnecessary additional complexity. Attached is an example (apologies, but I forgot to remove the PypeProcWatcherWorkflow part): get_read_ctg_map.py.txt

Again, thanks for your time and all your great work!
Best regards, Patrick Reilly