naturalis / supersmart

Self-Updating Platform for the Estimation of Rates of Speciation, Migration And Relationships of Taxa
MIT License

dangerous NCBI queries when retrieving marker names #78

Closed hettling closed 9 years ago

hettling commented 9 years ago

It seems that when the NCBI servers have a busy day, querying for marker names (in BBmerge and Clademerge) can create quite a mess.

Even if BioPerl's Bio::DB::GenBank::get_Seq_by_acc fails and sets $@, it seems to keep retrying, which can lead to some zombie threads. Even the output supermatrix can be affected. I haven't found any retry or timeout option for this function, but the documentation says get_Seq_by_id is safer (although it could still lead to odd results).

get_Seq_by_id is used as of commit bd53a946023f0a5eded0a85017cfc8f47ed6abd9.

Queries are working fine again, so no immediate action is required. However, we should keep this in mind when we see strange behaviour in BBmerge and Clademerge, and it should be documented.
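Since no retry or timeout option was found, one way to bound a lookup from the calling side is the standard alarm/eval idiom. The following is only a sketch under assumptions: fetch_with_timeout, its arguments and the stub fetcher are hypothetical names made up for illustration, not part of BioPerl or SUPERSMART, and alarm may not interrupt every kind of blocking call.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl or SUPERSMART): call $fetch->($acc),
# abandon each attempt after $timeout seconds, and stop after $max_tries attempts.
sub fetch_with_timeout {
    my ( $fetch, $acc, $timeout, $max_tries ) = @_;
    for my $try ( 1 .. $max_tries ) {
        my $seq = eval {
            local $SIG{ALRM} = sub { die "timeout\n" };
            alarm $timeout;
            my $result = $fetch->($acc);
            alarm 0;
            $result;
        };
        my $err = $@;
        alarm 0;    # clear a pending alarm if the fetch died for another reason
        return $seq if defined $seq;
        warn "attempt $try for $acc failed: " . ( $err || "no result\n" );
    }
    return;         # bounded number of attempts, then give up
}

# Stand-in fetcher so the sketch runs without network access; with BioPerl
# one would pass something like sub { $gb->get_Seq_by_acc( $_[0] ) } instead.
my $stub = sub { my ($acc) = @_; return "record for $acc" };
my $seq = fetch_with_timeout( $stub, 'AF252983', 30, 3 );
print defined $seq ? "Success: $seq\n" : "Giving up\n";
```

The point of the design is that the caller, not the library, decides how long and how often to try, so a busy server costs at most $timeout times $max_tries seconds per accession.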

hettling commented 9 years ago

Just some simple code to test if the servers seem to work ok:

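use strict;
use warnings;
use Bio::DB::GenBank;

my $gb = Bio::DB::GenBank->new();

# a few accessions to probe the NCBI servers with
my @accessions = ('AF252983', 'JQ040924', 'DQ899931');

for my $acc (@accessions) {
        # eval traps the exception BioPerl throws when the lookup fails
        my $seq = eval { $gb->get_Seq_by_acc($acc) };
        if ( $@ || !$seq ) {
                print $@ || "No sequence returned for $acc\n";
        }
        else {
                print "Success  : " . $seq->id . "\n";
        }
}

rvosa commented 9 years ago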

Ah, I noticed this as well (at least, a lot of child processes). They get reaped when the parent terminates, though, right?

hettling commented 9 years ago

Yes, so they are not actually zombies.
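For reference, a zombie is a child that has exited but has not yet been waited for by its parent; children that are still running when the parent exits are adopted by init and reaped there. A minimal sketch of explicit reaping, with trivial stand-in workers rather than the actual SUPERSMART code:

```perl
use strict;
use warnings;

my @pids;
for my $i ( 1 .. 3 ) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {
        exit 0;    # child: stand-in for a worker that has finished its job
    }
    push @pids, $pid;
}

# Parent: wait for every child so that none lingers as a zombie.
my $reaped = 0;
for my $pid (@pids) {
    $reaped++ if waitpid( $pid, 0 ) == $pid;
}
print "reaped $reaped children\n";
```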

hettling commented 9 years ago

I think this was fixed in commit 1fc052a4c2cee5097954b4d5a6e13c0450d8ea50. At least I have not encountered uncontrollable retries and forking when the NCBI server is down.