openvax / pyensembl

Python interface to access reference genome features (such as genes, transcripts, and exons) from Ensembl
Apache License 2.0

db attribute error from 'if datacache.db.db_table_exists(db, 'ensembl'):' in database.py #133

Closed Xerez13 closed 8 years ago

Xerez13 commented 8 years ago

I recently installed and ran pyensembl on my machine. The first time I called the module in an interactive session everything worked great.

from pyensembl import EnsemblRelease
data = EnsemblRelease(77)
gene_name = data.gene_names_at_locus(contig=11,position=101567059)
print gene_name

However, in any subsequent calls, typing in the same code as above, I received the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build/bdist.macosx-10.5-x86_64/egg/pyensembl/ensembl_release.py", line 165, in gene_names_at_locus
  File "build/bdist.macosx-10.5-x86_64/egg/pyensembl/database.py", line 226, in distinct_column_values_at_locus
  File "build/bdist.macosx-10.5-x86_64/egg/pyensembl/database.py", line 150, in column_values_at_locus
  File "build/bdist.macosx-10.5-x86_64/egg/pyensembl/database.py", line 120, in column_exists
  File "build/bdist.macosx-10.5-x86_64/egg/pyensembl/database.py", line 116, in columns
  File "build/bdist.macosx-10.5-x86_64/egg/pyensembl/database.py", line 111, in connection
  File "build/bdist.macosx-10.5-x86_64/egg/pyensembl/database.py", line 99, in _connect_or_create_database
AttributeError: 'module' object has no attribute 'db'

When I went to the _connect_or_create_database function within database.py I found the following line was throwing the error:

if datacache.db.db_table_exists(db, 'ensembl'):

When I inspected the datacache module, I found neither a db attribute nor a db_table_exists function:

dir(datacache)
['Cache', '__builtins__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', 'build_local_filename', 'build_path', 'cache', 'clear_cache', 'common', 'connect_if_correct_version', 'database', 'database_helpers', 'database_table', 'database_types', 'db_from_dataframe', 'db_from_dataframes', 'db_from_dataframes_with_absolute_path', 'download', 'ensure_dir', 'fasta', 'fetch_and_transform', 'fetch_csv_dataframe', 'fetch_csv_db', 'fetch_fasta_db', 'fetch_fasta_dict', 'fetch_file', 'get_data_dir']

My version of datacache was 0.4.16
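
The same check can be done without reading through the dir() output by hand. This is a minimal sketch, not part of pyensembl or datacache: the helper name has_attr_path is hypothetical, and it is demonstrated on the standard library's os module so that it runs without datacache installed.

```python
import importlib


def has_attr_path(module_name, dotted_path):
    """Return True if the module exposes every attribute along dotted_path."""
    obj = importlib.import_module(module_name)
    for part in dotted_path.split("."):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True


# os.path.join exists, so every step of the path resolves.
print(has_attr_path("os", "path.join"))
# os has no attribute named db, so the walk stops at the first step.
print(has_attr_path("os", "db.db_table_exists"))
```

For this issue the corresponding call would be has_attr_path("datacache", "db.db_table_exists"), which should return False on an install where the AttributeError above occurs.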

The line that throws the error appears to call functionality that is not present in datacache 0.4.16. The code works the first time because, when db_path does not exist, the function goes straight to _create_database(), which only uses functions that datacache does provide. On subsequent calls the database file already exists, so the function reaches the problematic line and raises the error. I was able to get around this by commenting out that line along with the return statement below it, and inserting a return statement directly below the db assignment. A copy of my adjustment is shown below.

    def _connect_or_create_database(self):
        """
        If database already exists, open a connection.
        Otherwise, create it.
        """
        db_path = self.local_db_path()
        if exists(db_path):
            db = sqlite3.connect(db_path)
            return db #edit
            # maybe file got created but not filled
            #if datacache.db.db_table_exists(db, 'ensembl'): #edit
            #    return db #edit
        return self._create_database()
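
Note that this workaround also drops the guard against a database file that was created but never filled. A minimal sketch of an equivalent guard using only the standard library's sqlite3 module is given here; the helper name table_exists is hypothetical (it is not a pyensembl or datacache function), and the table name 'ensembl' is taken from the quoted line.

```python
import sqlite3


def table_exists(db, table_name):
    """Return True if a table with this name exists in the SQLite database."""
    # sqlite_master lists every table, index, and view stored in the database.
    query = "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?"
    return db.execute(query, (table_name,)).fetchone() is not None


# In-memory database used purely for demonstration.
demo_db = sqlite3.connect(":memory:")
print(table_exists(demo_db, "ensembl"))  # empty database, no such table yet
demo_db.execute("CREATE TABLE ensembl (gene_name TEXT)")
print(table_exists(demo_db, "ensembl"))  # table now present
```

With a helper like this, the existing-file branch could return db only when table_exists(db, 'ensembl') is true and otherwise fall through to self._create_database(), which is what the commented-out lines appear to have intended.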

I just thought I would raise this issue since this tool is quite cool and has great functionality, but this error might frustrate people if they aren't willing to investigate it.

iskandr commented 8 years ago

Hey @Xerez13,

Thanks for this detailed report. I'm not, however, able to reproduce the problem! In the most recent master branch of pyensembl, line 99 of database.py is:

            ['transcript_name'],

and the expression datacache.db.db_table_exists doesn't occur anywhere in the file.

Any chance you're using an older version of pyensembl with a more recent datacache?

Xerez13 commented 8 years ago

Hey @iskandr,

I just checked and I am using pyensembl-0.5.4 which appears to be an older version of the package. My datacache version is 0.4.16. It is likely this error comes about exactly for the reason you suggest.
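
For anyone checking for the same mismatch, the installed versions of both packages can be printed side by side. This is a sketch that assumes Python 3.8 or later (for importlib.metadata), which is newer than the Python 2.7 installs reported in this thread; the helper name installed_version is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Print both versions so an old pyensembl paired with a newer datacache
# is easy to spot.
for name in ("pyensembl", "datacache"):
    print(name, installed_version(name))
```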

rohandavidg commented 8 years ago

I had the same issue with datacache-0.4.16-py2.7.egg and pyensembl-0.5.4-py2.7.egg.

iskandr commented 8 years ago

Hey Rohan,

Can you try upgrading your version of PyEnsembl to the more recent 0.8.4?

iskandr commented 8 years ago

I'm going to close this issue, since I suspect the error only occurs when an older PyEnsembl is installed alongside a newer datacache.