openvax / neoantigen-vaccine-pipeline

Bioinformatics pipeline for selecting patient-specific cancer neoantigen vaccines
Apache License 2.0

Ensembl 75 (human) and Ensembl 93 (mouse) always get installed with pyensembl #119

Closed iskandr closed 5 years ago

iskandr commented 5 years ago

Regardless of the species or reference actually in use, these two lines appear to cause both pyensembl install --species human --release 75 and pyensembl install --species mouse --release 93 to run:

https://github.com/openvax/neoantigen-vaccine-pipeline/blob/60a6ffeea8aca600b822a0119db556cf3c39ac2b/pipeline/special_sauce.rules#L26
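One way to avoid the unconditional installs would be to derive a single install command from the configured reference. The following is only a sketch under stated assumptions: the reference names (b37, grcm38), the REFERENCE_TO_ENSEMBL table, and the pyensembl_install_command helper are hypothetical and are not taken from the pipeline's rules file. Only the two species/release pairs come from the commands quoted above.

```python
# Hypothetical mapping from a pipeline reference name to the Ensembl
# species and release it needs. The keys are illustrative assumptions,
# not names read from the pipeline's configuration.
REFERENCE_TO_ENSEMBL = {
    "b37": ("human", 75),
    "grcm38": ("mouse", 93),
}


def pyensembl_install_command(reference):
    """Return the argv list for the single install matching `reference`."""
    try:
        species, release = REFERENCE_TO_ENSEMBL[reference.lower()]
    except KeyError:
        raise ValueError("Unknown reference: %r" % (reference,))
    return [
        "pyensembl", "install",
        "--species", species,
        "--release", str(release),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(pyensembl_install_command("grcm38")))
```

A rule could then run just this one command for the reference in use, rather than both installs.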

AhmedArslan commented 5 years ago

I am trying to install the pyensembl mouse database on a server I access over SSH (Python 3) and need help with the following error:

2019-02-04 11:50:44,821 - datacache.database_helpers - WARNING - Failed to create tables ['start_codon', 'exon', 'gene', 'stop_codon', 'CDS', 'transcript'] in database /home/aarslan/.cache/pyensembl/GRCm38/ensembl93/Mus_musculus.GRCm38.93.gtf.db
Traceback (most recent call last):
  File "/usr/bin/pyensembl", line 11, in <module>
    load_entry_point('pyensembl==1.7.3', 'console_scripts', 'pyensembl')()
  File "/usr/lib/python3.4/site-packages/pyensembl/shell.py", line 261, in run
    genome.index(overwrite=args.overwrite)
  File "/usr/lib/python3.4/site-packages/pyensembl/genome.py", line 280, in index
    self.db.connect_or_create(overwrite=overwrite)
  File "/usr/lib/python3.4/site-packages/pyensembl/database.py", line 293, in connect_or_create
    return self.create(overwrite=overwrite)
  File "/usr/lib/python3.4/site-packages/pyensembl/database.py", line 253, in create
    version=DATABASE_SCHEMA_VERSION)
  File "/usr/lib/python3.4/site-packages/datacache/database_helpers.py", line 183, in db_from_dataframes_with_absolute_path
    version=version)
  File "/usr/lib/python3.4/site-packages/datacache/database_helpers.py", line 102, in _create_cached_db
    db.create(tables, version)
  File "/usr/lib/python3.4/site-packages/datacache/database.py", line 183, in create
    self._fill_table(table.name, table.rows)
  File "/usr/lib/python3.4/site-packages/datacache/database.py", line 164, in _fill_table
    self.connection.executemany(sql, rows)
sqlite3.OperationalError: disk I/O error

iskandr commented 5 years ago

Hey @AhmedArslan, the last line (sqlite3.OperationalError: disk I/O error) makes me suspect that something lower level is wrong. Do you have write permissions to the PyEnsembl cache directory and does the HD have space?
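Both questions can be checked from Python with the standard library. This is a minimal sketch, not part of pyensembl: the check_cache_dir helper and the 1 GiB threshold are assumptions made for illustration, and the cache path is the one that appears in the log above.

```python
import os
import shutil


def check_cache_dir(path, min_free_bytes=1024 ** 3):
    """Report whether `path` exists, is writable, and has free disk space.

    `min_free_bytes` is an arbitrary illustrative threshold (1 GiB).
    """
    path = os.path.abspath(os.path.expanduser(path))
    exists = os.path.isdir(path)

    # disk_usage needs an existing path, so walk up to the nearest
    # existing ancestor when the cache directory is missing.
    probe = path
    while not os.path.exists(probe):
        parent = os.path.dirname(probe)
        if parent == probe:
            break
        probe = parent

    free = shutil.disk_usage(probe).free
    return {
        "exists": exists,
        "writable": exists and os.access(path, os.W_OK),
        "free_bytes": free,
        "enough_space": free >= min_free_bytes,
    }


if __name__ == "__main__":
    print(check_cache_dir("~/.cache/pyensembl"))
```

If "writable" or "enough_space" comes back False, that would be consistent with the low-level failure suspected here.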

AhmedArslan commented 5 years ago

Got it, thanks. I am working with the mouse genome and installed the release (pyensembl install --release 93 --species mouse), but now I repeatedly get the following error:

import pyensembl
ensembl = pyensembl.EnsemblRelease(release=93)
genes = ensembl.genes_at_locus(contig="1", position=1000000)

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pyensembl/download_cache.py", line 299, in local_path_or_install_error
    overwrite=overwrite)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pyensembl/download_cache.py", line 274, in download_or_copy_if_necessary
    overwrite)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pyensembl/download_cache.py", line 224, in _download_if_necessary
    raise MissingRemoteFile(url)
pyensembl.download_cache.MissingRemoteFile: ftp://ftp.ensembl.org/pub/release-93/gtf/homo_sapiens/Homo_sapiens.GRCh38.93.gtf.gz

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pyensembl/genome.py", line 513, in genes_at_locus
    contig, position, end=end, strand=strand)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pyensembl/genome.py", line 533, in gene_ids_at_locus
    return self.db.distinct_column_values_at_locus(
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pyensembl/genome.py", line 293, in db
    self._set_local_paths(download_if_missing=False, overwrite=False)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pyensembl/genome.py", line 229, in _set_local_paths
    overwrite=overwrite)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pyensembl/genome.py", line 193, in _get_gtf_path
    overwrite=overwrite)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pyensembl/genome.py", line 186, in _get_cached_path
    overwrite=overwrite)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pyensembl/download_cache.py", line 301, in local_path_or_install_error
    self._raise_missing_file_error({field_name: path_or_url})
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pyensembl/download_cache.py", line 287, in _raise_missing_file_error
    raise ValueError(error_message)
ValueError: Missing genome data file from ftp://ftp.ensembl.org/pub/release-93/gtf/homo_sapiens/Homo_sapiens.GRCh38.93.gtf.gz. Run pyensembl install --release 93 --species homo_sapiens

iskandr commented 5 years ago

pyensembl.EnsemblRelease(release=93) defaults to human, which is why the error asks for the homo_sapiens files. You have to specify species='mus_musculus'.
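Applied to the snippet from the earlier comment, the fix might look like the sketch below. The load_mouse_genome and genes_at helpers are hypothetical wrappers added for illustration. The underlying calls are EnsemblRelease with the species argument and genes_at_locus, both of which appear in this thread. It assumes the mouse release 93 data has already been installed.

```python
def load_mouse_genome(release=93):
    """Return an EnsemblRelease bound to mouse instead of the human default."""
    # Imported lazily so this module can be loaded where pyensembl is absent.
    from pyensembl import EnsemblRelease

    return EnsemblRelease(release=release, species="mus_musculus")


def genes_at(genome, contig, position):
    """Look up the genes overlapping one position on the given contig."""
    return genome.genes_at_locus(contig=contig, position=position)


# Usage (requires `pyensembl install --release 93 --species mouse` first):
#   genome = load_mouse_genome()
#   genes = genes_at(genome, "1", 1000000)
```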