jorainer / ensembldb

This is the ensembldb development repository.
https://jorainer.github.io/ensembldb

Retrieving older annotations while matching API version #139

Open rcastelo opened 2 years ago

rcastelo commented 2 years ago

Hi, I'm trying to build an EnsDb annotation package for an old version of the human annotations (74) in order to fully reproduce some results, but I'm encountering the following error:

library(ensembldb)
src <- system.file("scripts/generate-EnsDBs.R", package="ensembldb")
source(src)
options(timeout=600)
createEnsDbForSpecies(ens_version=74, species="homo_sapiens", user="readonly", host="localhost", pass="readonly")
Going to process 1 species.
Processing species: homo_sapiens (1 of 1)
Downloading 76 files ... OK
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1067 (42000) at line 21: Invalid default value for 'created'
mysqlimport: [Warning] Using a password on the command line interface can be insecure.
mysqlimport: Error: 3948, Loading local data is disabled; this must be enabled on both the client and server sides, when using table: alt_allele
The submitted Ensembl version (74) does not match the version of the Ensembl API (107). Please configure the environment variable ENS to point to the correct API. at /Library/Frameworks/R.framework/Versions/4.2/Resources/library/ensembldb/perl/get_gene_transcript_exon_tables.pl line 109.
Error in fetchTablesFromEnsembl(version = ens_version, species = species,  : 
  Something went wrong! I'm missing some of the txt files the perl script should have generated.
In addition: Warning message:
In connection_release(conn@ptr) : There is a result object still in use.
The connection will be automatically released when it is closed

I've installed the Ensembl Perl API and mysql, with a 'readonly' user with password 'readonly', and in principle everything is in place, but I don't know how to get around the mismatch between the version of the Ensembl human annotations that I want to fetch (74) and the current Ensembl API version (107). I already tried setting an environment variable ENS=74, but it didn't make any difference. Any hint will be appreciated.

By the way, in the vignette where I read about the function createEnsDbForSpecies(), the command given to load the code for that function is:

scr <- system.file("scripts/generate-EnsDbs.R", package = "ensembldb")
source(scr)

while it should be:

scr <- system.file("scripts/generate-EnsDBs.R", package = "ensembldb")
source(scr)

jorainer commented 2 years ago

Hi Roberto!

I assume you installed the Ensembl perl API using github (i.e. from https://github.com/Ensembl/ensembl)? If so, you need to check out the release 74 of the source code:

## within the Ensembl API folder:
git checkout release/74

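Then it should work. The perl script first checks that the version of the installed Ensembl API matches the requested Ensembl version, which is the check that failed in your case.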
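A rough sketch of that step, for anyone following along. It assumes the API was cloned with git into a directory of your choice (the default path below is only an example) and that the checkout's `modules` folder provides `Bio::EnsEMBL::ApiVersion`. The helper names `versions_match` and `checkout_release` are made up for illustration.

```shell
# Example location of the cloned Ensembl core API; adjust to your setup.
ENSEMBL_API_DIR="${ENSEMBL_API_DIR:-$HOME/Soft/Ensembl/ensembl}"

# Illustrative helper: succeed only if the requested release equals the API release.
versions_match() {
    [ "$1" -eq "$2" ]
}

# Switch the clone to a release branch and print the version the API reports.
# Defined but not called here, since it changes the state of the checkout.
checkout_release() {
    git -C "$ENSEMBL_API_DIR" checkout "release/$1" &&
        PERL5LIB="$ENSEMBL_API_DIR/modules:$PERL5LIB" \
            perl -MBio::EnsEMBL::ApiVersion -e 'print software_version(), "\n"'
}

# The situation from the error message: release 74 requested, API at 107.
versions_match 74 107 || echo "API release differs from requested release"
```

Calling `checkout_release 74` afterwards should print the release the API now reports, which is the number the perl script compares against.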

Since installation of the Ensembl Perl API and all other required perl libraries is quite tricky (at least in my opinion) it could be that you run into other problems later. Just let me know for which species you need annotations and I'll create them for you.

And thanks for reporting the bug in the vignette!

jorainer commented 2 years ago

Actually @rcastelo , in which vignette did you find the scr <- system.file("scripts/generate-EnsDbs.R", package = "ensembldb") line? I could not find it.

rcastelo commented 2 years ago

Hi Johannes,

Thanks for the hint, it looks like I have been able to correct that problem, but now I've stumbled into this other error:

createEnsDbForSpecies(ens_version=74, species="homo_sapiens", user="readonly", host="localhost", pass="readonly")
Going to process 1 species.
Processing species: homo_sapiens (1 of 1)
Downloading 76 files ... OK
Error: Can't create database 'homo_sapiens_core_74_37'; database exists [1007]

which looks like there is some stale data in the mysql database that prevents the pipeline from going forward. Do you know how I can clean up and get a fresh start?

The vignette where I found the typo was the one from the AHEnsDbs annotation package, concretely this one.
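On the "database exists" error above, one way to get a fresh start is to drop the leftover database before re-running the pipeline. This is a minimal sketch: the helper `drop_statement` is hypothetical and only builds the SQL, and the account used to run it (shown in the comment) is an assumption, since it needs the privilege to drop databases.

```shell
# Build the SQL that removes a leftover core database
# (the database name is taken from the error message above).
drop_statement() {
    printf 'DROP DATABASE IF EXISTS `%s`;\n' "$1"
}

# Print the statement. To execute it, pipe it to the client using an account
# that is allowed to drop databases (account name is an assumption), e.g.:
#   drop_statement homo_sapiens_core_74_37 | mysql -h localhost -u root -p
drop_statement homo_sapiens_core_74_37
```

Printing the statement first makes it easy to check the database name before anything is deleted.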

rcastelo commented 2 years ago

Hi again,

OK, so I figured out that I can simply log into the MySQL server and remove that stale database homo_sapiens_core_74_37. This seems to get me to the next base, which is the following error:

createEnsDbForSpecies(ens_version=74, species="homo_sapiens", user="readonly", host="localhost", pass="readonly")
Going to process 1 species.
Processing species: homo_sapiens (1 of 1)
Downloading 76 files ... OK
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1067 (42000) at line 21: Invalid default value for 'created'
mysqlimport: [Warning] Using a password on the command line interface can be insecure.
mysqlimport: Error: 3948, Loading local data is disabled; this must be enabled on both the client and server sides, when using table: alt_allele
Connecting to localhost at port -P
DBI connect('host=localhost;port=-P','readonly',...) failed: Access denied for user 'readonly'@'localhost' (using password: NO) at /Users/robert/Soft/Ensembl/ensembl/modules/Bio/EnsEMBL/Registry.pm line 1622.
Argument "-P" isn't numeric in sprintf at /Users/robert/Soft/Ensembl/ensembl/modules/Bio/EnsEMBL/Registry.pm line 1622.

-------------------- EXCEPTION --------------------
MSG: Cannot connect to the Ensembl MySQL server at localhost:0; check your settings & DBI error message: Access denied for user 'readonly'@'localhost' (using password: NO)
STACK Bio::EnsEMBL::Registry::load_registry_from_db /Users/robert/Soft/Ensembl/ensembl/modules/Bio/EnsEMBL/Registry.pm:1622
STACK toplevel /Library/Frameworks/R.framework/Versions/4.2/Resources/library/ensembldb/perl/get_gene_transcript_exon_tables.pl:118
Date (localtime)    = Tue Jul 26 18:03:01 2022
Ensembl API version = 74
---------------------------------------------------
Error in fetchTablesFromEnsembl(version = ens_version, species = species,  : 
  Something went wrong! I'm missing some of the txt files the perl script should have generated.
In addition: Warning message:
In connection_release(conn@ptr) : There is a result object still in use.
The connection will be automatically released when it is closed

In principle the user 'readonly' has the password 'readonly', and I can log into the MySQL server with mysql -u readonly -p, giving readonly as the password. So I'm a bit at a loss here. Any hint?

jorainer commented 2 years ago

Can you please check if your database contains any data? To me it seems that the data import failed.

So, maybe connect to the database and do a quick

show tables;
select * from alt_allele limit 3;

but I guess that the table will be empty. According to this error message:

mysqlimport: Error: 3948, Loading local data is disabled; this must be enabled on both the client and server sides, when using table: alt_allele

restoring a mysql database from txt files is disabled for mysqlimport (which is actually the default for security reasons). You would need to add the following line to your mysql config file:

local-infile=1

It depends a bit on your system, but for me on linux I had to add this to /etc/my.cnf.d/server.cnf (below the [mysqld] line) and to /etc/my.cnf.d/client.cnf (below the [client] line).

The other thing I don't quite understand is this line:

Connecting to localhost at port -P
DBI connect('host=localhost;port=-P','readonly',...) failed:

so the connection fails because no port value was passed: the next flag, -P (the password parameter), is then taken as the value of the port parameter (-p), and no password reaches the server. I'll have a look into the perl script I'm using to see whether the problem is there...
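The effect can be reproduced with a toy option parser. This is only an illustration in shell, not the actual perl script: it assumes the script is called with `-p <port> -P <password>` and that the port value was empty. The function `parse_args` is made up for the example.

```shell
# Toy parser mimicking a script that expects "-p <port> -P <password>".
parse_args() {
    OPTIND=1
    port=""
    pass=""
    while getopts "p:P:" opt "$@"; do
        case "$opt" in
            p) port=$OPTARG ;;
            P) pass=$OPTARG ;;
        esac
    done
    echo "port=$port pass=$pass"
}

# Empty port: the words after -p are "-P readonly", so -P becomes the port value
# and the password is never parsed.
parse_args -p -P readonly

# Port given explicitly: both values end up where they belong.
parse_args -p 3306 -P readonly
```

The first call leaves the password empty, which would match both `port=-P` and `(using password: NO)` in the log above.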

jorainer commented 2 years ago

Maybe specifying the port on which mysql is running could help: try adding port = 5306 (assuming mysql listens on that port) to the createEnsDbForSpecies call.

And again, if you don't want to spend that much time fiddling around with the rather tricky mysql/perl/Ensembl API setup you can also just tell me for what species and releases you want to have EnsDb databases and I'll create them quickly.

rcastelo commented 2 years ago

Thanks, the suggestion about the port worked (the port in my case was 3306), but I still get the error about "loading local data is disabled":

> createEnsDbForSpecies(ens_version=74, species="homo_sapiens", user="readonly", host="localhost", pass="readonly", port=3306)
Going to process 1 species.
Processing species: homo_sapiens (1 of 1)
Downloading 76 files ... OK
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1067 (42000) at line 21: Invalid default value for 'created'
mysqlimport: [Warning] Using a password on the command line interface can be insecure.
mysqlimport: Error: 3948, Loading local data is disabled; this must be enabled on both the client and server sides, when using table: alt_allele
Connecting to localhost at port 3306
DBD::mysql::st execute failed: Table 'homo_sapiens_core_74_37.meta' doesn't exist at /Users/robert/Soft/Ensembl/ensembl/modules/Bio/EnsEMBL/DBSQL/BaseMetaContainer.pm line 139.
DBD::mysql::st execute failed: Table 'homo_sapiens_core_74_37.meta' doesn't exist at /Users/robert/Soft/Ensembl/ensembl/modules/Bio/EnsEMBL/DBSQL/BaseMetaContainer.pm line 139.
Error in fetchTablesFromEnsembl(version = ens_version, species = species,  : 
  Something went wrong! I'm missing some of the txt files the perl script should have generated.
In addition: Warning message:
In connection_release(conn@ptr) : There is a result object still in use.
The connection will be automatically released when it is closed

This is the configuration file with the added line local-infile=1:

ws136639:~ robert$ cat /usr/local/etc/my.cnf
# Default Homebrew MySQL server config
[mysqld]
# Only allow connections from localhost
bind-address = 127.0.0.1
mysqlx-bind-address = 127.0.0.1
## from https://github.com/jorainer/ensembldb/issues/139
local-infile=1

and now it complains about another table, homo_sapiens_core_74_37.meta. I know that you're willing to build this package, but I often encounter the situation in which I want to reproduce results for which people used a specific Ensembl version of the annotations, and I would like to be able to generate those packages myself. Of course, if we keep hitting errors, at some point I'll give up X-P

jorainer commented 2 years ago

Maybe try to restart the mysql server. That might be needed to ensure that the local-infile setting is recognized. You could then check whether it's correctly set by connecting to your database and calling:

show global variables like 'local_infile';

Note that also the client needs to have local-infile=1, so you will need to also set that for the mysql client. I don't know where in your system the mysql client conf is stored, but maybe you can simply add:

[client]
local-infile=1

Maybe also have a look at the mysql documentation if the error persists. I hope it does not have to do something with incompatible mysql server versions - note that the Ensembl Perl API requires a quite old Perl and MySQL. On my system (macOS) I'm using mariadb 10.8.3
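A sketch of how the two settings might be combined in one option file. The `loose-` prefix in the `[client]` group is an addition not taken from this thread: MySQL and MariaDB document it as a way to make programs that do not know an option warn instead of abort, which matters because every client program reads the `[client]` group. The function name and the scratch file path are hypothetical.

```shell
# Write a config fragment enabling LOCAL INFILE for the server and for clients.
# The target path is a parameter; nothing is written unless the function is called.
write_local_infile_conf() {
    cat > "$1" <<'EOF'
[mysqld]
local-infile=1

[client]
loose-local-infile=1
EOF
}

# Example: write to a scratch file and review it before merging it into the
# real option file (the location of that file depends on the installation).
conf_file="${TMPDIR:-/tmp}/local-infile.cnf"
write_local_infile_conf "$conf_file"
cat "$conf_file"
```

After merging the fragment and restarting the server, `show global variables like 'local_infile';` should report the server-side setting as described above.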

rcastelo commented 2 years ago

Hi, restarting the mysql server did the job, however, adding

[client]
local-infile=1

to the mysql configuration file at /usr/local/etc/my.cnf triggered the following error:

ERROR 1067 (42000) at line 21: Invalid default value for 'created'
mysqlimport: [ERROR] unknown variable 'local-infile=1'.

so I simply removed this bit and the pipeline went forward till this next error:

createEnsDbForSpecies(ens_version=74, species="homo_sapiens", user="readonly", host="localhost", pass="readonly", port=3306)

Going to process 1 species.
Processing species: homo_sapiens (1 of 1)
Downloading 76 files ... OK
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1067 (42000) at line 21: Invalid default value for 'created'
mysqlimport: [Warning] Using a password on the command line interface can be insecure.
homo_sapiens_core_74_37.alt_allele: Records: 7553  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.alt_allele_attrib: Records: 6564  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.alt_allele_group: Records: 2853  Deleted: 0  Skipped: 0  Warnings: 0
mysqlimport: Error: 1146, Table 'homo_sapiens_core_74_37.analysis' doesn't exist, when using table: analysis
Connecting to localhost at port 3306
DBD::mysql::st execute failed: Table 'homo_sapiens_core_74_37.meta' doesn't exist at /Users/robert/Soft/Ensembl/ensembl/modules/Bio/EnsEMBL/DBSQL/BaseMetaContainer.pm line 139.
DBD::mysql::st execute failed: Table 'homo_sapiens_core_74_37.meta' doesn't exist at /Users/robert/Soft/Ensembl/ensembl/modules/Bio/EnsEMBL/DBSQL/BaseMetaContainer.pm line 139.
Error in fetchTablesFromEnsembl(version = ens_version, species = species,  : 
  Something went wrong! I'm missing some of the txt files the perl script should have generated.
In addition: Warning message:
In connection_release(conn@ptr) : There is a result object still in use.
The connection will be automatically released when it is closed

It seems that it has started processing some tables (homo_sapiens_core_74_37.alt_allele, etc.), but it complains that it does not find the table homo_sapiens_core_74_37.analysis. Any hint?

jorainer commented 2 years ago

Hm, this is strange. Did you remove the old homo_sapiens_... database before starting again? Alternatively, there might be some issue while creating the tables. What puzzles me is the ERROR 1067 (...: I've never seen that error before, but googling around it seems that there are some solutions for it. What my code actually does is:

To me it seems that something happens at the second step and that not all tables are created (something incompatible with your local mysql server in line 21 of the sql file?).

You could check which tables were created by connecting to your database

mysql -h localhost --user readonly -p homo_sapiens_core_74_37

and then list all tables

show tables;

If there are only a few tables (e.g. just the alt_allele* tables), then there might be an error creating all required tables...
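One possible explanation for ERROR 1067, offered as an untested assumption rather than something confirmed in this thread: MySQL 5.7 and later enable `NO_ZERO_DATE` and `NO_ZERO_IN_DATE` in the default `sql_mode`, and under strict mode these reject column defaults such as '0000-00-00 00:00:00', which older schema files may use for a column like 'created'. If that is the cause here, removing those two modes before the import might help. The helper `relax_sql_mode` is hypothetical and only rewrites the mode string.

```shell
# Remove the two zero-date checks from a comma-separated sql_mode string.
relax_sql_mode() {
    printf '%s\n' "$1" | tr ',' '\n' |
        grep -v -x -e 'NO_ZERO_DATE' -e 'NO_ZERO_IN_DATE' |
        paste -s -d ',' -
}

# Example value only; read the real one with: SELECT @@GLOBAL.sql_mode;
current_mode='ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,NO_ENGINE_SUBSTITUTION'

# Print the statement to run as an administrative user before re-importing.
printf "SET GLOBAL sql_mode = '%s';\n" "$(relax_sql_mode "$current_mode")"
```

Because the table creation stops at the first failing statement, a fix at that point would also explain why only the tables defined earlier in the file exist.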

rcastelo commented 2 years ago

I think I'm almost there!! :)

I do remove the old homo_sapiens_... database at each new attempt because otherwise I get an error that the database cannot be created because it already exists. What you say is true, only the alt_allele* tables are created:

$ mysql -h localhost --user readonly -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 15
Server version: 8.0.29 Homebrew

Copyright (c) 2000, 2022, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+-------------------------+
| Database                |
+-------------------------+
| homo_sapiens_core_74_37 |
| information_schema      |
| mysql                   |
| performance_schema      |
| sys                     |
+-------------------------+
5 rows in set (0.00 sec)
mysql> use homo_sapiens_core_74_37;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+-----------------------------------+
| Tables_in_homo_sapiens_core_74_37 |
+-----------------------------------+
| alt_allele                        |
| alt_allele_attrib                 |
| alt_allele_group                  |
+-----------------------------------+
3 rows in set (0.00 sec)

so the question is why does it choke at these errors:

ERROR 1067 (42000) at line 21: Invalid default value for 'created'
[...]
mysqlimport: Error: 1146, Table 'homo_sapiens_core_74_37.analysis' doesn't exist, when using table: analysis

I recalled your words above, where you said "I hope it does not have to do something with incompatible mysql server versions - note that the Ensembl Perl API requires a quite old Perl and MySQL. On my system (macOS) I'm using mariadb 10.8.3". So I decided to uninstall MySQL completely and installed MariaDB via Homebrew (I'm also on macOS), and actually got the same version you mention (10.8.3). Besides running brew remove mysql, I had to manually remove /usr/local/etc/my.cnf (otherwise the MariaDB server would not start) and reinstall the Perl module DBD::mysql. I created the user 'readonly' on the MariaDB server and checked that the listening port was still 3306. Then I ran the commands again and the database finally seemed to load all the tables, but a Perl error popped up at the end:

library(ensembldb)
src <- system.file("scripts/generate-EnsDBs.R", package="ensembldb")
source(src)
createEnsDbForSpecies(ens_version=74, species="homo_sapiens", user="readonly", host="localhost", pass="readonly", port=3306)
Going to process 1 species.
Processing species: homo_sapiens (1 of 1)
Downloading 76 files ... OK
WARNING: Forcing protocol to  TCP  due to option specification. Please explicitly state intended protocol.
WARNING: Forcing protocol to  TCP  due to option specification. Please explicitly state intended protocol.
homo_sapiens_core_74_37.alt_allele: Records: 7553  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.alt_allele_attrib: Records: 6564  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.alt_allele_group: Records: 2853  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.analysis: Records: 81  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.analysis_description: Records: 80  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.assembly: Records: 105684  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.assembly_exception: Records: 215  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.associated_group: Records: 0  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.associated_xref: Records: 0  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.attrib_type: Records: 257  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.coord_system: Records: 8  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.data_file: Records: 0  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.density_feature: Records: 21654  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.density_type: Records: 8  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.dependent_xref: Records: 3238970  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.ditag: Records: 3598656  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.ditag_feature: Records: 1221256  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.dna: Records: 27953  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.dna_align_feature: Records: 29511919  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.exon: Records: 745265  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.exon_transcript: Records: 1313912  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.external_db: Records: 580  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.external_synonym: Records: 184399  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.gene: Records: 64078  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.gene_archive: Records: 191840  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.gene_attrib: Records: 116194  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.identity_xref: Records: 314464  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.interpro: Records: 27636  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.intron_supporting_evidence: Records: 0  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.karyotype: Records: 1109  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.map: Records: 12  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.mapping_session: Records: 44  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.mapping_set: Records: 13  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.marker: Records: 299818  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.marker_feature: Records: 300345  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.marker_map_location: Records: 164580  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.marker_synonym: Records: 732428  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.meta: Records: 183  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.meta_coord: Records: 33  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.misc_attrib: Records: 411093  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.misc_feature: Records: 89296  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.misc_feature_misc_set: Records: 89296  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.misc_set: Records: 16  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.object_xref: Records: 5218757  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.ontology_xref: Records: 809324  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.operon: Records: 0  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.operon_transcript: Records: 0  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.operon_transcript_gene: Records: 0  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.peptide_archive: Records: 152887  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.prediction_exon: Records: 345489  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.prediction_transcript: Records: 48597  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.protein_align_feature: Records: 19610111  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.protein_feature: Records: 989043  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.repeat_consensus: Records: 595230  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.repeat_feature: Records: 9153294  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.seq_region: Records: 56888  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.seq_region_attrib: Records: 3001  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.seq_region_mapping: Records: 2922  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.seq_region_synonym: Records: 237  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.simple_feature: Records: 182859  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.splicing_event: Records: 350002  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.splicing_event_feature: Records: 1109960  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.splicing_transcript_pair: Records: 867704  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.stable_id_event: Records: 1507011  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.supporting_feature: Records: 4660307  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.transcript: Records: 215621  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.transcript_attrib: Records: 958421  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.transcript_intron_supporting_evidence: Records: 0  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.transcript_supporting_feature: Records: 97425  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.translation: Records: 105213  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.translation_attrib: Records: 526419  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.unmapped_object: Records: 675796  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.unmapped_reason: Records: 51  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.xref: Records: 2747025  Deleted: 0  Skipped: 0  Warnings: 0
Connecting to localhost at port 3306
# get_gene_transcript_exon_tables.pl version 0.3.7:
Retrieve gene models for Ensembl version 74, species homo_sapiens from Ensembl database at host: localhost
Start fetching data
Can't locate object method "stable_id_version" via package "Bio::EnsEMBL::Gene" at /Library/Frameworks/R.framework/Versions/4.2/Resources/library/ensembldb/perl/get_gene_transcript_exon_tables.pl line 231.
Error in fetchTablesFromEnsembl(version = ens_version, species = species,  : 
  Something went wrong! I'm missing some of the txt files the perl script should have generated.
In addition: Warning message:
In connection_release(conn@ptr) : There is a result object still in use.
The connection will be automatically released when it is closed

So I think I'm very close to making it. Please let me know if the Perl error tells you anything.

jorainer commented 2 years ago

Oh - indeed, that might be something I hadn't anticipated - it could be that some of the variables I'm extracting are not available or defined in older Ensembl versions. I'll update the perl code to avoid calling that method for older Ensembl releases.

Please install the package again from github (BiocManager::install("jorainer/ensembldb")) and try again - could be that we stumble across other Perl methods that were not defined in that old Ensembl version - let's fix them sequentially.

rcastelo commented 2 years ago

Hi, with this new update of the ensembldb package it has finally worked!!

[...]
homo_sapiens_core_74_37.unmapped_object: Records: 675796  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.unmapped_reason: Records: 51  Deleted: 0  Skipped: 0  Warnings: 0
homo_sapiens_core_74_37.xref: Records: 2747025  Deleted: 0  Skipped: 0  Warnings: 0
Connecting to localhost at port 3306
# get_gene_transcript_exon_tables.pl version 0.3.7:
Retrieve gene models for Ensembl version 74, species homo_sapiens from Ensembl database at host: localhost
Start fetching data
processed 2000 genes
processed 4000 genes
processed 6000 genes
processed 8000 genes
processed 10000 genes
processed 12000 genes
processed 14000 genes
processed 16000 genes
processed 18000 genes
processed 20000 genes
processed 22000 genes
processed 24000 genes
processed 26000 genes
processed 28000 genes
processed 30000 genes
processed 32000 genes
processed 34000 genes
processed 36000 genes
processed 38000 genes
processed 40000 genes
processed 42000 genes
processed 44000 genes
processed 46000 genes
processed 48000 genes
processed 50000 genes
processed 52000 genes
processed 54000 genes
processed 56000 genes
processed 58000 genes
processed 60000 genes
processed 62000 genes
processed 64000 genes
Processing 'chromosome' table ... OK
Processing 'gene' table ... OK
Processing 'entrezgene' table ... OK
Processing 'trancript' table ... OK
Processing 'exon' table ... OK
Processing 'tx2exon' table ... OK
Processing 'protein' table ... OK
Processing 'uniprot' table ... OK
Processing 'protein_domain' table ... OK
Creating indices ... OK
Checking validity of the database ... OK
Done with species: homo_sapiens, 0 left.
Warning messages:
1: In connection_release(conn@ptr) :
  There is a result object still in use.
The connection will be automatically released when it is closed
2: In connection_release(conn@ptr) :
  There is a result object still in use.
The connection will be automatically released when it is closed
> makeEnsembldbPackage(ensdb="EnsDb.Hsapiens.v74.sqlite", version="1.0.0", maintainer="Robert Castelo <robert.castelo@upf.edu>", author="Robert Castelo")
Creating package in ./EnsDb.Hsapiens.v74 
[1] TRUE

By the way, do you know of any website where one can see the different number of human Ensembl genes per release version of the Ensembl database? (this would allow one to anticipate what version of Ensembl was used to generate a table of counts based on Ensembl gene identifiers, as long as the number of rows/ensembl genes is specific to an Ensembl version).

Thanks again!

jorainer commented 2 years ago

Wow! Congratulations!!!

Regarding your question - no, unfortunately I don't know if there exists such a page. Worst case, get EnsDb databases for each Ensembl release from AnnotationHub and get the gene counts from there ;)