Closed durubing-jn closed 2 years ago
@JNdurubing Try installing the development branch with
pip install /dev_directory
(see https://stackoverflow.com/questions/41535915/python-pip-install-from-local-dir). We ran into some parsing issues with the latest version of the InterProScan TSV output that have been addressed in the dev branch.
Could you also share your 9.genome_properties.tsv file? Either post it to GitHub Gists or share a Google Drive/Dropbox link.
Thanks for your reply!
I have installed the Pygenprop on the development branch.
If I understand correctly, this package needs some changes or updates, because some commands have been deprecated under Python (v3.6):
/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/skbio/util/_testing.py:15: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
  import pandas.util.testing as pdt
/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/pygenprop/results.py:26: FutureWarning: 'pyarrow.default_serialization_context' is deprecated as of 2.0.0 and will be removed in a future version. Use pickle or the pyarrow IPC functionality instead.
  serialization_context = pa.default_serialization_context()
INFO:__main__:Opening 9.genome_properties.tsv
INFO:__main__:Only adding pathway annotations
Traceback (most recent call last):
  File "/home/rstudio/miniconda2/envs/genomeproperties/bin/pygenprop", line 244, in <module>
    main(cli_args)
  File "/home/rstudio/miniconda2/envs/genomeproperties/bin/pygenprop", line 43, in main
    build_micromeda_file(genome_properties_tree, sanitized_input_file_paths, output_file_path, add_proteins)
  File "/home/rstudio/miniconda2/envs/genomeproperties/bin/pygenprop", line 107, in build_micromeda_file
    results = GenomePropertiesResults(*assignments_caches, properties_tree=genome_properties_tree)
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/pygenprop/results.py", line 47, in __init__
    property_table, step_table = assignment.create_results_tables(properties_tree)
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/pygenprop/assign.py", line 342, in create_results_tables
    self.bootstrap_assignments(properties_tree)
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/pygenprop/assign.py", line 142, in bootstrap_assignments
    self.bootstrap_assignments_from_genome_property(properties_tree.root)
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/pygenprop/tree.py", line 43, in root
    genome_property = next(iter(self.genome_properties_dictionary.values()))
StopIteration
Here is part of 9.genome_properties.tsv (the full file is too large to post, ~200 MB) and genomeProperties.txt: https://gist.github.com/c7064dcdfc24158f830fd1d22d467535.git
@JNdurubing What version of the genome properties database file are you importing and how are you importing it?
https://github.com/Micromeda/pygenprop#acquiring-genome-properties-data
FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead. import pandas.util.testing as pdt /home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/pygenprop/results.py:26:
FutureWarning: 'pyarrow.default_serialization_context' is deprecated as of 2.0.0 and will be removed in a future version. Use pickle or the pyarrow IPC functionality instead.
If I understand correctly, this package needs some changes or updates, because some commands have been deprecated under Python (v3.6):
These are warnings about future functionality changes in packages that pygenprop relies upon. They are nothing to worry about as long as the dependency versions are locked in.
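If the FutureWarning noise makes the real error hard to spot, it can be filtered out. The following is a minimal sketch using only Python's standard `warnings` module; `noisy_dependency` and `call_quietly` are hypothetical names made up for illustration and are not part of pygenprop, pandas, or pyarrow.

```python
import warnings


def noisy_dependency():
    """Hypothetical stand-in for a dependency that emits a FutureWarning."""
    warnings.warn("pandas.util.testing is deprecated.", FutureWarning)
    return "ok"


def call_quietly(func):
    """Run func with FutureWarning suppressed; other warning categories still show."""
    with warnings.catch_warnings():
        # The filter only applies inside this block and is undone on exit.
        warnings.simplefilter("ignore", category=FutureWarning)
        return func()


result = call_quietly(noisy_dependency)
print(result)
```

The warnings in this thread are emitted while the dependencies are being imported, so for a command-line run the filter has to be in place before Python starts importing. Setting the standard `PYTHONWARNINGS=ignore::FutureWarning` environment variable before calling the command is one way to do that. This only hides the messages; it does not change how pygenprop behaves.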
"/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/pygenprop/tree.py", line 43, in root genome_property = next(iter(self.genome_properties_dictionary.values())) StopIteration
@JNdurubing This looks like there's something wrong with your genome properties database file. For example, the file is empty or broken. See: https://github.com/Micromeda/pygenprop#acquiring-genome-properties-data
What file did you use? How did you import it?
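For anyone hitting the same traceback: the failing line takes the first value of a dictionary, and that pattern raises StopIteration when the dictionary is empty, which fits an empty or unparsable database file. Below is a minimal sketch of that behaviour plus a quick sanity check on the downloaded file. `first_value` and `database_file_problem` are hypothetical helpers written for this illustration, not pygenprop functions.

```python
import os


def first_value(mapping):
    """Mimic the pattern from the traceback: take the first value of a dict."""
    return next(iter(mapping.values()))


def database_file_problem(path):
    """Return a short description of an obvious problem with the file, or None."""
    if not os.path.isfile(path):
        return "file does not exist"
    if os.path.getsize(path) == 0:
        return "file is empty"
    return None


try:
    first_value({})  # an empty dictionary, as if no properties were parsed
except StopIteration:
    print("StopIteration: the dictionary had no entries")

# Assumed file name, taken from the command used in this thread.
print(database_file_problem("genomeProperties.txt"))
```

A non-empty file can still be the wrong file (for example an HTML error page saved by a failed download), so a `None` result here only rules out the two simplest causes.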
I downloaded the properties database file using wget. I redownloaded the file (wget https://raw.githubusercontent.com/ebi-pf-team/genome-properties/master/flatfiles/genomeProperties.txt, ~1.7 MB) and then ran: pygenprop build -d genomeProperties.txt -i 9.genome_properties.tsv -o temp. The current error is as follows:
/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/skbio/util/_testing.py:15: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
  import pandas.util.testing as pdt
/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/pygenprop/results.py:26: FutureWarning: 'pyarrow.default_serialization_context' is deprecated as of 2.0.0 and will be removed in a future version. Use pickle or the pyarrow IPC functionality instead.
  serialization_context = pa.default_serialization_context()
INFO:__main__:Opening 9.genome_properties.tsv
INFO:__main__:Only adding pathway annotations
INFO:__main__:Writing output Micromeda file to /home/rstudio/1_fermentated_food/7.genome_analysis/temp/temp
Traceback (most recent call last):
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 3212, in _wrap_pool_connect
    return fn()
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 307, in connect
    return _ConnectionFairy._checkout(self)
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 767, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 425, in checkout
    rec = pool._do_get()
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/sqlalchemy/pool/impl.py", line 256, in _do_get
    return self._create_connection()
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 253, in _create_connection
    return _ConnectionRecord(self)
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 368, in __init__
    self.__connect()
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 611, in __connect
    pool.logger.debug("Error on connect(): %s", e)
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 72, in __exit__
    with_traceback=exc_tb,
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/sqlalchemy/pool/base.py", line 605, in __connect
    connection = pool._invoke_creator(self)
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/sqlalchemy/engine/create.py", line 578, in connect
    return dialect.connect(*cargs, **cparams)
  File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 584, in connect
    return self.dbapi.connect(*cargs, **cparams)
sqlite3.OperationalError: unable to open database file

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/rstudio/miniconda2/envs/genomeproperties/bin/pygenprop", line 244, in <module>
Maybe I need to put this file in a particular path?
@JNdurubing, I took some time running your test files and the version of the genome properties database you specified. It worked fine for me. So the problems were either caused by the installation or implementation on your end or a problem with the full-size IPR5 file. Based on the output above, the current problem is likely caused by either a permission issue or a locked SQLite file. It looks like swapping to the development branch fixed the first issue, but you may have encountered another error that is unrelated to the first.
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unable to open database file (Background on this error at: https://sqlalche.me/e/14/e3q8)
So Micromeda files are actually SQLite3 database files, and pygenprop uses the SQLAlchemy library to write to these files. The above error indicates that pygenprop was having trouble writing to the Micromeda file. Two reasons come to mind: a permission issue, or a locked SQLite file.
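One rough way to check the output location before rerunning is sketched below with the standard `os` and `sqlite3` modules. It assumes the output path is whatever was passed to `-o`; `output_path_problems`, `try_open`, and the `out.micro` file name are hypothetical and only serve the illustration. It covers missing or unwritable directories, not locks held by another process.

```python
import os
import sqlite3
import tempfile


def output_path_problems(path):
    """List obvious reasons why a SQLite file could not be created at path."""
    problems = []
    parent = os.path.dirname(os.path.abspath(path))
    if os.path.isdir(path):
        problems.append("output path is an existing directory")
    if not os.path.isdir(parent):
        problems.append("parent directory does not exist")
    elif not os.access(parent, os.W_OK):
        problems.append("parent directory is not writable")
    return problems


def try_open(path):
    """Try to open (or create) a SQLite file; return the error text, or None on success."""
    try:
        connection = sqlite3.connect(path)
    except sqlite3.OperationalError as error:
        return str(error)
    connection.close()
    return None


with tempfile.TemporaryDirectory() as scratch:
    # A path whose parent directory is missing cannot be opened by SQLite.
    missing_parent = os.path.join(scratch, "no_such_dir", "out.micro")
    print(output_path_problems(missing_parent), try_open(missing_parent))

    # A path inside an existing, writable directory opens without error.
    good = os.path.join(scratch, "out.micro")
    print(output_path_problems(good), try_open(good))
```

If `output_path_problems` returns an empty list for the real output path and the error persists, the cause is more likely on the locking side or elsewhere in the environment.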
Closing because no response in months.
Hi, LeeBergstrand:
There was an error when using pygenprop.
Here is the command: pygenprop build -d genomeProperties.txt -i 9.genome_properties.tsv -o temp
Here is the log:
INFO:__main__:Opening 9.genome_properties.tsv
INFO:__main__:Only adding pathway annotations
Traceback (most recent call last):
File "/home/rstudio/miniconda2/envs/genomeproperties/bin/pygenprop", line 244, in <module>
main(cli_args)
File "/home/rstudio/miniconda2/envs/genomeproperties/bin/pygenprop", line 43, in main
build_micromeda_file(genome_properties_tree, sanitized_input_file_paths, output_file_path, add_proteins)
File "/home/rstudio/miniconda2/envs/genomeproperties/bin/pygenprop", line 107, in build_micromeda_file
results = GenomePropertiesResults(*assignments_caches, properties_tree=genome_properties_tree)
File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/pygenprop/results.py", line 42, in __init__
property_table, step_table = assignment.create_results_tables(properties_tree)
File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/pygenprop/assign.py", line 342, in create_results_tables
self.bootstrap_assignments(properties_tree)
File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/pygenprop/assign.py", line 142, in bootstrap_assignments
self.bootstrap_assignments_from_genome_property(properties_tree.root)
File "/home/rstudio/miniconda2/envs/genomeproperties/lib/python3.6/site-packages/pygenprop/tree.py", line 43, in root
genome_property = next(iter(self.genome_properties_dictionary.values()))
StopIteration
Can you help me solve it?
Thanks