Open GossypiumH opened 1 year ago
Does the error always happen on the same taxon, or is it somewhat random? For such a large query, I recommend using taxizedb, which supports offline queries of downloaded databases.
The error is totally random. It can happen after 30 minutes of running or after 2 hours. I have never gotten past the two-hour mark, though; it always fails before then.
My problem is that I can't use taxizedb because it only accepts taxon IDs as input, and I only have names.
Would this work for your purposes?
library(taxizedb)
classification(name2taxid(c('Arabidopsis thaliana', 'pig')))
#> $`3702`
#> name rank id
#> 1 cellular organisms no rank 131567
#> 2 Eukaryota superkingdom 2759
#> 3 Viridiplantae kingdom 33090
#> 4 Streptophyta phylum 35493
#> 5 Streptophytina subphylum 131221
#> 6 Embryophyta clade 3193
#> 7 Tracheophyta clade 58023
#> 8 Euphyllophyta clade 78536
#> 9 Spermatophyta clade 58024
#> 10 Magnoliopsida class 3398
#> 11 Mesangiospermae clade 1437183
#> 12 eudicotyledons clade 71240
#> 13 Gunneridae clade 91827
#> 14 Pentapetalae clade 1437201
#> 15 rosids clade 71275
#> 16 malvids clade 91836
#> 17 Brassicales order 3699
#> 18 Brassicaceae family 3700
#> 19 Camelineae tribe 980083
#> 20 Arabidopsis genus 3701
#> 21 Arabidopsis thaliana species 3702
#>
#> $`9823`
#> name rank id
#> 1 cellular organisms no rank 131567
#> 2 Eukaryota superkingdom 2759
#> 3 Opisthokonta clade 33154
#> 4 Metazoa kingdom 33208
#> 5 Eumetazoa clade 6072
#> 6 Bilateria clade 33213
#> 7 Deuterostomia clade 33511
#> 8 Chordata phylum 7711
#> 9 Craniata subphylum 89593
#> 10 Vertebrata clade 7742
#> 11 Gnathostomata clade 7776
#> 12 Teleostomi clade 117570
#> 13 Euteleostomi clade 117571
#> 14 Sarcopterygii superclass 8287
#> 15 Dipnotetrapodomorpha clade 1338369
#> 16 Tetrapoda clade 32523
#> 17 Amniota clade 32524
#> 18 Mammalia class 40674
#> 19 Theria clade 32525
#> 20 Eutheria clade 9347
#> 21 Boreoeutheria clade 1437010
#> 22 Laurasiatheria superorder 314145
#> 23 Artiodactyla order 91561
#> 24 Suina suborder 35497
#> 25 Suidae family 9821
#> 26 Sus genus 9822
#> 27 Sus scrofa species 9823
#>
#> attr(,"class")
#> [1] "classification"
#> attr(,"db")
#> [1] "ncbi"
Created on 2023-01-12 with reprex v2.0.2
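Since the original question asks for ranks from kingdom to order, here is a minimal sketch of how each lineage table in the result above could be reduced to those ranks. The helper name `extract_ranks`, the `wanted_ranks` vector, and the small `pig` stand-in table are hypothetical and only for illustration; the sketch assumes each element of the result is either a data frame with `name`, `rank`, and `id` columns (as printed above) or `NA` for a name that was not matched.

```r
# Ranks to keep. Assumption: these labels match the rank column in your data;
# bacterial lineages may label some levels differently.
wanted_ranks <- c("kingdom", "phylum", "class", "order")

# Hypothetical helper: reduce one lineage table to a named character vector.
extract_ranks <- function(lineage, ranks = wanted_ranks) {
  # Unmatched names are assumed to come back as NA rather than a data frame
  if (!is.data.frame(lineage)) {
    return(setNames(rep(NA_character_, length(ranks)), ranks))
  }
  # match() gives the row of each wanted rank, or NA if the rank is absent
  hits <- as.character(lineage$name[match(ranks, lineage$rank)])
  setNames(hits, ranks)
}

# Small stand-in for one element of the classification() result above
pig <- data.frame(
  name = c("Metazoa", "Chordata", "Mammalia", "Artiodactyla", "Sus scrofa"),
  rank = c("kingdom", "phylum", "class", "order", "species"),
  id   = c("33208", "7711", "40674", "91561", "9823"),
  stringsAsFactors = FALSE
)

extract_ranks(pig)
```

Applied to a full result, something like `do.call(rbind, lapply(result, extract_ranks))` should give one row per taxon, where `result` is whatever name you assigned the `classification()` output to.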
Hmm! Thank you, that should probably work!
Hi,
I have an issue with taxize. I am trying to retrieve the full taxonomy (from kingdom to order) for a dataset of 10k+ bacteria (10182 to be exact).
My input is a data frame with a single column of species names (e.g. Xenorhabdus sp.), so my script is very simple, as follows:
I tried changing the value in "taxize_options(ncbi_sleep = 1.5)", but it makes no difference: I always end up with an API error like this:
Retrieving data for taxon 'Janthinobacterium sp.'
Error: {"error":"error forwarding request","api-key":"192.108.190.140","type":"ip", "status":"ok"}
It happens at random after 1 or 2 hours of NCBI requests. I would very much like to understand what is going on and whether I did something wrong.
Thank you in advance,
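
If the online route is still needed, one common way to cope with intermittent errors like the one above is to retry the failing call a few times instead of letting a single error stop the whole run. The sketch below uses base R only; `with_retry` and `flaky` are hypothetical names, and `flaky` merely simulates a request that fails twice before succeeding. This does not address the cause of the NCBI error, it only keeps a long job from aborting.

```r
# Hypothetical helper: call f(x), retrying on error up to max_tries times.
# Returns NA if every attempt fails, so the caller can record the failure.
with_retry <- function(f, x, max_tries = 3, wait = 5) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(x), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message("Attempt ", attempt, " failed for '", x, "': ",
            conditionMessage(result))
    # Pause before the next attempt, but not after the last one
    if (attempt < max_tries) Sys.sleep(wait)
  }
  NA
}

# Simulated flaky request: fails on the first two calls, then succeeds
calls <- 0
flaky <- function(x) {
  calls <<- calls + 1
  if (calls < 3) stop("error forwarding request")
  toupper(x)
}

out <- with_retry(flaky, "Janthinobacterium sp.", wait = 0)
print(out)
```

In a real run, `f` could be a small wrapper such as `function(nm) taxize::classification(nm, db = "ncbi")`, applied to one name at a time with `lapply()`, so that results collected before a failure are kept. Note that taxize may prompt interactively when a name is ambiguous, which this sketch does not handle.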