Author Name: Matthew LaFave (Matthew LaFave)
Original Redmine Issue: 3431, https://redmine.open-bio.org/issues/3431
Original Date: 2013-04-29
Original Assignee: Bioperl Guts
I’ve been using the BioPerl module Bio::DB::EntrezGene to retrieve gene names based on Entrez gene IDs. It has worked fine for several months (and at least as recently as April 5th), but it hasn’t worked for the last few days. From what I can tell, nothing has changed about the module, so my impression is that NCBI may have changed the formatting of their records, and the module may need to be updated.
For example, here’s the sample code from the documentation for the module; it should work every time:
#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::EntrezGene;
my $db = Bio::DB::EntrezGene->new;
my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
while ( my $seq = $seqio->next_seq ) {
    print "id is ", $seq->display_id, "\n";
}
exit;
…but recently, instead of returning a brief list of genes, it returns this:
Replacement list is longer than search list at /Library/Perl/5.12/Bio/Range.pm line 251.
UNIVERSAL->import is deprecated and will be removed in a future perl at /Library/Perl/5.12/Bio/Tree/TreeFunctionsI.pm line 94
Data Error: none conforming data found on line 1 in /var/folders/2f/55z0d46n3l10bq650j6svgw89rmqw1/T/mkguvw1MOO/VR86iPUDSJ!
first 20 (or till end of input) characters including the non-conforming data:
::= {
{
track-
at /Library/Perl/5.12/Bio/SeqIO/entrezgene.pm line 171
An individual in the UK was able to reproduce the issue, so it’s unlikely that it’s something about my situation that’s causing this. His assessment on Stackoverflow (http://stackoverflow.com/questions/16199037/bioperl-module-biodbentrezgene-no-longer-working) was the following:
“Looking further it looks like the problem is that the data starts with Entrezgene-Set ::= and includes three items. BioPerl is expecting only Entrezgene ::=, and will not cope with sets. I guess BioPerl won’t handle this aspect of Entrez Gene data. If you look at the Bio::ASN1::EntrezGene module, the next_seq() subroutine insists on Entrezgene ::= at the start of the data. BioPerl won’t handle Entrez Gene sets.”
I’ve contacted NCBI to see if anything had changed, but I haven’t heard back yet. If you need any additional information, please let me know. Thanks!
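In case it helps with triage, here is a minimal sketch of the header mismatch described in the quoted assessment. It only classifies the leading ASN.1 type tag of a text record as a set or a single record. The helper name entrezgene_header_kind and the sample strings are hypothetical and made up for illustration; this is not BioPerl code and not how Bio::ASN1::EntrezGene actually parses its input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): report whether an Entrez Gene
# ASN.1 text record opens with "Entrezgene-Set ::=" or "Entrezgene ::=".
sub entrezgene_header_kind {
    my ($text) = @_;
    return 'set'    if $text =~ /\A\s*Entrezgene-Set\s*::=/;
    return 'single' if $text =~ /\A\s*Entrezgene\s*::=/;
    return 'unknown';
}

# Made-up sample headers, shaped like the ones discussed above.
my $set_record    = "Entrezgene-Set ::= {\n{\ntrack-info {";
my $single_record = "Entrezgene ::= {\n  track-info {";

print entrezgene_header_kind($set_record),    "\n";   # prints "set"
print entrezgene_header_kind($single_record), "\n";   # prints "single"
```

A check along these lines, run against the temporary file named in the error message, might confirm whether the downloaded data is now wrapped in a set.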