bridgedb / datasources

Repository with the BridgeDb data source.
Creative Commons Zero v1.0 Universal

[WIP] Transition to UniProtKB #43

Open egonw opened 1 year ago

egonw commented 11 months ago

I tested this code base with an old Ensembl 105 ID mapping database, and it loads fine with the QC (the database still carries the old full name), and the output reports UniProtKB as the full name:

INFO: old database is EnsemblGenomes 49 (build: 20220621)
INFO: new database is EnsemblGenomes 49 (build: 20220621)
INFO: Number of ids in T (GeneOntology): 18841 (unchanged)
INFO: Number of ids in En (Ensembl): 61487 (unchanged)
INFO: Number of ids in Om (OMIM): 16002 (unchanged)
INFO: Number of ids in X (Affy): 946224 (unchanged)
INFO: Number of ids in H (HGNC): 39506 (unchanged)
INFO: Number of ids in Wg (WikiGenes): 25645 (unchanged)
INFO: Number of ids in Q (RefSeq): 248485 (unchanged)
INFO: Number of ids in Il (Illumina): 73313 (unchanged)
INFO: Number of ids in Uc (UCSC Genome Browser): 227239 (unchanged)
INFO: Number of ids in Pd (PDB): 47662 (unchanged)
INFO: Number of ids in L (Entrez Gene): 25645 (unchanged)
INFO: Number of ids in S (UniProtKB): 77949 (unchanged)
INFO: Number of ids in Hac (HGNC Accession number): 39506 (unchanged)
INFO: Number of ids in Mb (miRBase Sequence): 3692 (unchanged)
INFO: Number of ids in Ag (Agilent): 117826 (unchanged)
INFO: Number of ids in Rf (Rfam): 58 (unchanged)
INFO: Attribute provided: Type
INFO: Attribute provided: Description
INFO: Attribute provided: Symbol
INFO: Attribute provided: Chromosome
INFO: new size is 798 Mb (changed +0.0%)

@tabbassidaloii, when you have an Ensembl 109 release built with this code base, I would like to compare it with the above file too. We would then be comparing two Derby files with different 'full names'. It should work, but I would love to confirm that experimentally.
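For checking the 'full names' in two such QC reports side by side, here is a minimal sketch in Python. It assumes only the `INFO: Number of ids in ...` line format shown in the log above; the function names `parse_qc_log` and `renamed_sources` are hypothetical helpers for illustration and are not part of the BridgeDb QC tool.

```python
import re

# Matches QC report lines of the form shown above, e.g.:
#   INFO: Number of ids in S (UniProtKB): 77949 (unchanged)
ID_LINE = re.compile(r"^INFO: Number of ids in (\S+) \((.+)\): (\d+)")


def parse_qc_log(text):
    """Return {system_code: (full_name, id_count)} from QC log text."""
    result = {}
    for line in text.splitlines():
        match = ID_LINE.match(line.strip())
        if match:
            code, name, count = match.groups()
            result[code] = (name, int(count))
    return result


def renamed_sources(old, new):
    """Return {code: (old_name, new_name)} where the full name differs."""
    return {
        code: (old[code][0], new[code][0])
        for code in old.keys() & new.keys()
        if old[code][0] != new[code][0]
    }


# Three lines copied from the QC output above.
sample = """\
INFO: Number of ids in L (Entrez Gene): 25645 (unchanged)
INFO: Number of ids in S (UniProtKB): 77949 (unchanged)
INFO: Number of ids in Hac (HGNC Accession number): 39506 (unchanged)
"""
parsed = parse_qc_log(sample)
print(parsed["S"])  # ('UniProtKB', 77949)
```

Running `parse_qc_log` on the reports for both Derby files and passing the results to `renamed_sources` would list only the system codes whose full name differs between the two.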

tabbassidaloii commented 11 months ago

Great, I will check that. However, the file you checked is not from Ensembl itself but from Ensembl Plants or Fungi, so Ensembl v109 would not include it. I should compare it with Ensembl Plants or Fungi v56 instead. Which species did you run for this test?