kevinkle closed this issue 5 years ago
We had a problem with this because PyPy's lookahead implementation for file parsing seems slow (I'm not sure they implement it at all), and we rely on it for parsing kmers from a file.
Since we parse and pickle into Kmer objects, we could look at having Snakemake do the parsing in CPython and the graphing in PyPy (if we can find a common conversion format), since graphing takes the vast majority of the time.
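As a rough illustration of the "common conversion format" idea, here is a minimal sketch that writes parsed kmers as JSON lines on the parsing side and reads them back on the graphing side. The Kmer class, its fields, and the simplified FASTA parser are hypothetical stand-ins, not prairiedog's actual code. Pickle may well work across both interpreters too; JSON lines is just one interpreter-neutral option.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path
from typing import Iterable, Iterator, List


@dataclass
class Kmer:
    # Hypothetical stand-in for prairiedog's Kmer objects; the real fields may differ.
    header: str
    sequence: str
    position: int


def parse_kmers(fasta_text: str, k: int) -> Iterator[Kmer]:
    """Yield every k-length window of each FASTA record (simplified parser)."""
    header, chunks = None, []
    # The trailing ">" is a sentinel that flushes the last record.
    for line in fasta_text.splitlines() + [">"]:
        if line.startswith(">"):
            if header is not None:
                seq = "".join(chunks)
                for i in range(len(seq) - k + 1):
                    yield Kmer(header, seq[i:i + k], i)
            header, chunks = line[1:].strip(), []
        elif line.strip():
            chunks.append(line.strip())


def dump_kmers(kmers: Iterable[Kmer], path: Path) -> None:
    """Parsing side (e.g. CPython): write one JSON object per line."""
    with path.open("w") as fh:
        for kmer in kmers:
            fh.write(json.dumps(asdict(kmer)) + "\n")


def load_kmers(path: Path) -> List[Kmer]:
    """Graphing side (e.g. PyPy): rebuild Kmer objects from the same file."""
    with path.open() as fh:
        return [Kmer(**json.loads(line)) for line in fh if line.strip()]
```

With something like this, a Snakemake rule running under CPython could call dump_kmers, and a later rule running under PyPy could call load_kmers on the same file.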
The SubgraphRef test has been failing on CircleCI when running via PyPy. When testing locally, we get some weird behaviour when trying to import SubgraphRef that only happens in PyPy.
kevin@panther ~/prairiedog> python
Python 3.6.1 (784b254d669919c872a505b807db8462b6140973, Apr 16 2019, 18:18:28)
[PyPy 7.1.1-beta0 with GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``as usual in pypy, the solution
appears completely disproportionate to the problem and instead we'll go for a
completely different simpler approach to the original problem''
>>>> from prairiedog.subgraph_ref import SubgraphRef
usage: [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: --help [cmd1 cmd2 ...]
or: --help-commands
or: cmd --help
error: no commands supplied
kevin@panther ~/prairiedog>
This problem is propagated into Snakemake as well:
kevin@panther ~/prairiedog> snakemake
SystemExit in line 16 of /home/kevin/prairiedog/Snakefile:
usage: snakemake [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: snakemake --help [cmd1 cmd2 ...]
or: snakemake --help-commands
or: snakemake cmd --help
error: no commands supplied
File "/home/kevin/prairiedog/Snakefile", line 16, in <module>
File "/home/kevin/prairiedog/prairiedog/subgraph_ref.py", line 11, in <module>
File "/home/kevin/prairiedog/prairiedog/lemon_graph.py", line 6, in <module>
File "/home/kevin/.pyenv/versions/pypy3.6-7.1.1/site-packages/LemonGraph-0.10.0-py3.6.egg/LemonGraph/__init__.py", line 5, in <module>
File "/home/kevin/prairiedog/setup.py", line 51, in <module>
File "/home/kevin/.pyenv/versions/pypy3.6-7.1.1/site-packages/setuptools/__init__.py", line 145, in setup
File "/home/kevin/.pyenv/versions/pypy3.6-7.1.1/lib-python/3/distutils/core.py", line 136, in setup
2019-07-03 10:14:33 panther snakemake.logging[12093] ERROR SystemExit in line 16 of /home/kevin/prairiedog/Snakefile:
usage: snakemake [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: snakemake --help [cmd1 cmd2 ...]
or: snakemake --help-commands
or: snakemake cmd --help
error: no commands supplied
File "/home/kevin/prairiedog/Snakefile", line 16, in <module>
File "/home/kevin/prairiedog/prairiedog/subgraph_ref.py", line 11, in <module>
File "/home/kevin/prairiedog/prairiedog/lemon_graph.py", line 6, in <module>
File "/home/kevin/.pyenv/versions/pypy3.6-7.1.1/site-packages/LemonGraph-0.10.0-py3.6.egg/LemonGraph/__init__.py", line 5, in <module>
File "/home/kevin/prairiedog/setup.py", line 51, in <module>
File "/home/kevin/.pyenv/versions/pypy3.6-7.1.1/site-packages/setuptools/__init__.py", line 145, in setup
File "/home/kevin/.pyenv/versions/pypy3.6-7.1.1/lib-python/3/distutils/core.py", line 136, in setup
I think it's missing the ffi lib:
try:
    from ._lemongraph_cffi import ffi, lib
except ImportError:
    from setup import fetch_external
    fetch_external()
This is from LemonGraph/__init__.py, where line 5 is: from setup import fetch_external
The import of LemonGraph in PyPy works when we're in the lemongraph/ submodule, but not in any other folder.
"LemonGraph/__init__.py" 1095L, 37088C written
kevin@panther ~/p/lemongraph> python
Python 3.6.1 (784b254d669919c872a505b807db8462b6140973, Apr 16 2019, 18:18:28)
[PyPy 7.1.1-beta0 with GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``"there should be one and only one
obvious way to do it". PyPy variant: "there can be N half-buggy ways to do
it"''
>>>> import LemonGraph
In file included from /usr/include/assert.h:35,
from lib/lemongraph.c:10:
/usr/include/features.h:184:3: warning: #warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" [-Wcpp]
# warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
^~~~~~~
In file included from /usr/include/errno.h:25,
from lib/db.c:5:
/usr/include/features.h:184:3: warning: #warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" [-Wcpp]
# warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
^~~~~~~
>>>> exit()
kevin@panther ~/p/lemongraph> python
Python 3.6.1 (784b254d669919c872a505b807db8462b6140973, Apr 16 2019, 18:18:28)
[PyPy 7.1.1-beta0 with GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``pypy HIT generator''
>>>> import LemonGraph
>>>>
kevin@panther ~/p/lemongraph> python
Python 3.6.1 (784b254d669919c872a505b807db8462b6140973, Apr 16 2019, 18:18:28)
[PyPy 7.1.1-beta0 with GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``"it's likely temporary until
forever" arigo''
>>>> import LemonGraph
>>>> exit()
kevin@panther ~/prairiedog> python
Python 3.6.1 (784b254d669919c872a505b807db8462b6140973, Apr 16 2019, 18:18:28)
[PyPy 7.1.1-beta0 with GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``The problem is that for almost
any non-trivial program, it's not clear what 'correct' means.''
>>>> import LemonGraph
usage: [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: --help [cmd1 cmd2 ...]
or: --help-commands
or: cmd --help
error: no commands supplied
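One possible explanation, consistent with the earlier traceback showing /home/kevin/prairiedog/setup.py being executed: when the compiled cffi module is missing, the fallback "from setup import fetch_external" resolves to whichever setup.py comes first on sys.path, and the current directory is searched first. The sketch below is an illustration of that shadowing behaviour, not LemonGraph code; the helper names and the "project"/"library" directories are made up for the demo. It uses importlib.util.find_spec so that no setup.py is actually executed.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def resolve_module_origin(module_name, search_dirs):
    """Return the file a bare import of module_name would load
    if search_dirs were placed first on sys.path (nothing is imported)."""
    saved_path = list(sys.path)
    saved_module = sys.modules.pop(module_name, None)
    try:
        sys.path[:0] = [str(d) for d in search_dirs]
        importlib.invalidate_caches()
        spec = importlib.util.find_spec(module_name)
        return None if spec is None else spec.origin
    finally:
        # Restore interpreter state so the probe has no side effects.
        sys.path[:] = saved_path
        if saved_module is not None:
            sys.modules[module_name] = saved_module


def demo_shadowing():
    """Show that the first directory on the search path wins for 'setup'."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"  # stands in for ~/prairiedog
        library = Path(tmp) / "library"  # stands in for the lemongraph checkout
        for directory in (project, library):
            directory.mkdir()
            (directory / "setup.py").write_text("# placeholder\n")
        from_project = resolve_module_origin("setup", [project, library])
        from_library = resolve_module_origin("setup", [library, project])
        return Path(from_project).parent.name, Path(from_library).parent.name
```

Running resolve_module_origin("setup", []) from inside ~/prairiedog would be a quick way to check which setup.py the fallback import actually picks up there.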
A fresh clone and install of lemongraph lets us import it from other folders, but not from prairiedog:
kevin@panther ~/lemongraph> python
Python 3.5.3 (928a4f70d3de7d17449456946154c5da6e600162, Feb 09 2019, 11:50:43)
[PyPy 7.0.0 with GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>> import LemonGraph
In file included from /usr/include/assert.h:35,
from lib/lemongraph.c:10:
/usr/include/features.h:184:3: warning: #warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" [-Wcpp]
# warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
^~~~~~~
In file included from /usr/include/errno.h:25,
from lib/db.c:5:
/usr/include/features.h:184:3: warning: #warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" [-Wcpp]
# warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
^~~~~~~
>>>> exit()
kevin@panther ~/lemongraph> cd ..
kevin@panther ~> python
Python 3.5.3 (928a4f70d3de7d17449456946154c5da6e600162, Feb 09 2019, 11:50:43)
[PyPy 7.0.0 with GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>> import LemonGraph
>>>> exit()
kevin@panther ~/prairiedog> python -m pip show LemonGraph
Name: LemonGraph
Version: 0.10.0
Summary: LemonGraph Database
Home-page: https://github.com/NationalSecurityAgency/lemongraph
Author: None
Author-email: None
License: UNKNOWN
Location: /home/kevin/.pyenv/versions/pypy3.6-7.1.1/site-packages/LemonGraph-0.10.0-py3.6.egg
Requires: cffi, lazy, msgpack, pysigset, python-dateutil, six
Required-by:
kevin@panther ~/prairiedog> cd ..
kevin@panther ~> python -m pip show LemonGraph
cpyext: missing slot wrapper tp_as_buffer.c_bf_getreadbuffer
RPython traceback:
File "pypy_interpreter.c", line 23470, in BuiltinCode_funcrun_obj
File "pypy_module_cpyext_6.c", line 17261, in wrap_del_call
Fatal RPython error: NotImplementedError
fish: “python -m pip show LemonGraph” terminated by signal SIGABRT (Abort)
kevin@panther ~> pyenv versions
system
* pypy3 (set by /home/kevin/.python-version)
pypy3.5-7.0.0
pypy3.5-7.0.0/envs/pypy3
pypy3.6-7.1.1
kevin@panther ~> cd lemongraph/
kevin@panther ~/lemongraph> pyenv versions
system
* pypy3 (set by /home/kevin/.python-version)
pypy3.5-7.0.0
pypy3.5-7.0.0/envs/pypy3
pypy3.6-7.1.1
kevin@panther ~/lemongraph>
Ah....
kevin@panther ~/prairiedog> pyenv versions
system
pypy3
pypy3.5-7.0.0
pypy3.5-7.0.0/envs/pypy3
* pypy3.6-7.1.1 (set by /home/kevin/prairiedog/.python-version)
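Since the pyenv output shows different interpreters being selected per directory, a small check like the following could be run from each folder to confirm which interpreter "python" really is before debugging imports. This is only a sketch using standard-library calls; the function name is arbitrary.

```python
import platform
import sys


def interpreter_report():
    """Collect the facts that distinguish one pyenv-selected interpreter from another."""
    return {
        "implementation": platform.python_implementation(),  # e.g. "CPython" or "PyPy"
        "language_version": platform.python_version(),
        "executable": sys.executable,
        "prefix": sys.prefix,
        # PyPy exposes its own release number here; CPython has no such attribute.
        "pypy_version": getattr(sys, "pypy_version_info", None),
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print("{}: {}".format(key, value))
```

Comparing the "executable" and "prefix" values between ~/prairiedog and ~/lemongraph would show directly whether a package was installed into a different interpreter than the one running the tests.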
Well, a fresh install of lemongraph fixed the import, but now we're getting another error. We'll also have to remember to clean up lemongraph's folders before each set of Python tests.
kevin@panther ~/prairiedog> snakemake
Building DAG of jobs...
2019-07-03 11:02:15 panther snakemake.logging[20426] WARNING Building DAG of jobs...
Using shell: /bin/bash
2019-07-03 11:02:15 panther snakemake.logging[20426] WARNING Using shell: /bin/bash
Provided cores: 1
2019-07-03 11:02:15 panther snakemake.logging[20426] WARNING Provided cores: 1
Rules claiming more threads will be scaled down.
2019-07-03 11:02:15 panther snakemake.logging[20426] WARNING Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 all
2 kmers
1 pangenome
4
2019-07-03 11:02:15 panther snakemake.logging[20426] WARNING Job counts:
count jobs
1 all
2 kmers
1 pangenome
4
2019-07-03 11:02:15 panther snakemake.logging[20426] INFO
[Wed Jul 3 11:02:15 2019]
2019-07-03 11:02:15 panther snakemake.logging[20426] INFO [Wed Jul 3 11:02:15 2019]
rule kmers:
input: samples/SRR3295722.fasta
output: outputs/kmers/SRR3295722.pkl
jobid: 3
wildcards: sample=SRR3295722
2019-07-03 11:02:15 panther snakemake.logging[20426] INFO rule kmers:
input: samples/SRR3295722.fasta
output: outputs/kmers/SRR3295722.pkl
jobid: 3
wildcards: sample=SRR3295722
2019-07-03 11:02:15 panther snakemake.logging[20426] INFO
Job counts:
count jobs
1 kmers
1
2019-07-03 11:02:17 panther snakemake.logging[20473] WARNING Job counts:
count jobs
1 kmers
1
2019-07-03 11:02:17 panther prairiedog[20473] DEBUG Parsing Kmers for file samples/SRR3295722.fasta with K size 11 in pid 20473
2019-07-03 11:02:17 panther prairiedog[20473] DEBUG Seeing current working directory as: /home/kevin/prairiedog
cpyext: missing slot wrapper tp_as_buffer.c_bf_getreadbuffer
RPython traceback:
File "pypy_interpreter.c", line 23470, in BuiltinCode_funcrun_obj
File "pypy_module_cpyext_6.c", line 17261, in wrap_del_call
Fatal RPython error: NotImplementedError
Aborted
Shutting down, this might take some time.
2019-07-03 11:02:18 panther snakemake.logging[20426] WARNING Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
2019-07-03 11:02:18 panther snakemake.logging[20426] ERROR Exiting because a job execution failed. Look above for error message
Complete log: /home/kevin/prairiedog/.snakemake/log/2019-07-03T110213.663901.snakemake.log
2019-07-03 11:02:18 panther snakemake.logging[20426] WARNING Complete log: /home/kevin/prairiedog/.snakemake/log/2019-07-03T110213.663901.snakemake.log
Did a fresh install of lemongraph, as described, into pypy3.6-7.1.1, and this fixed the Fatal RPython error: NotImplementedError. Looks like there's some problem with pypy3.5-7.0.0 at the moment.
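The cleanup step noted earlier (clearing lemongraph's folders before each set of tests) could be scripted roughly as below. The artifact patterns are assumptions about what a setuptools/cffi build typically leaves behind, not a verified list for LemonGraph, so the dry_run flag is there to inspect what would be deleted first.

```python
import shutil
from pathlib import Path

# Assumed artifact names; adjust to whatever the build actually leaves behind.
STALE_DIR_PATTERNS = ("build", "dist", "*.egg-info", "__pycache__")
STALE_FILE_PATTERNS = ("*.so", "*.o", "*.pyc")


def clean_build_artifacts(root, dry_run=False):
    """Remove stale build output under root and return the paths that matched."""
    root = Path(root)
    removed = []
    for pattern in STALE_DIR_PATTERNS:
        # sorted() materialises the matches before anything is deleted.
        for path in sorted(root.rglob(pattern)):
            if path.is_dir():
                removed.append(path)
                if not dry_run:
                    shutil.rmtree(path, ignore_errors=True)
    for pattern in STALE_FILE_PATTERNS:
        for path in sorted(root.rglob(pattern)):
            if path.is_file():
                removed.append(path)
                if not dry_run:
                    path.unlink()
    return removed
```

Calling clean_build_artifacts("lemongraph", dry_run=True) first lists the candidates without touching them; a reinstall into the target interpreter would then follow the real cleanup.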
Everything passes as of https://github.com/superphy/prairiedog/pull/117. Merged into master too, since even if we go with Dgraph, it might be faster to do some of the tasks in PyPy, per LemonGraph's reported performance improvements.