ropensci / rfishbase

R interface to the fishbase.org database
https://docs.ropensci.org/rfishbase

Updating sealifebase table with load_taxa #79

Closed Philipp-Neubauer closed 8 years ago

Philipp-Neubauer commented 8 years ago

Hi there,

thanks for an incredibly useful package. I'm trying to update the cached version of the sealifebase table (I'm wondering if there's more data beyond the 100k cached lines, which don't contain some of the species I'm looking for). But load_taxa(update=T) fails for me with FISHBASE_API set to sealifebase (it works with fishbase). Is there something I need to do beyond changing the FISHBASE_API?

thanks!

Philipp-Neubauer commented 8 years ago

PS here's the output:

> load_taxa(update = T)
NULL
Warning message:
In error_checks(parsed, resp = resp) :
  Mysql2::Error: Unknown column 'species.GenCode' in 'field list': SELECT species.SpecCode, species.Genus, species.Species, species.SpeciesRefNo, species.Author, species.FBname, species.SubFamily, species.FamCode, species.GenCode, species.SubGenCode, species.Remark, families.Family, families.Order, families.Class FROM species INNER JOIN families on species.FamCode = families.FamCode INNER JOIN genera on species.GenCode = genera.GenCode LIMIT 400000 for query http://fishbase.ropensci.org/sealifebase/taxa?family=&limit=400000

rBatt commented 8 years ago

I get something similar:

> load_taxa(update=T, server="http://fishbase.ropensci.org/sealifebase")
NULL
Warning message:
In error_checks(parsed, resp = resp) :
  Mysql2::Error: Unknown column 'species.GenCode' in 'field list': SELECT  species.SpecCode, species.Genus, species.Species, species.SpeciesRefNo, species.Author, species.FBname, species.SubFamily, species.FamCode, species.GenCode, species.SubGenCode, species.Remark, families.Family, families.Order, families.Class FROM `species` INNER JOIN families on species.FamCode = families.FamCode INNER JOIN genera on species.GenCode = genera.GenCode LIMIT 400000 for query http://fishbase.ropensci.org/sealifebase/taxa?family=&limit=400000
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.9.5 (Mavericks)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rfishbase_2.0.3

loaded via a namespace (and not attached):
 [1] httr_1.0.0      lazyeval_0.1.10 magrittr_1.5    R6_2.1.1        assertthat_0.1  parallel_3.2.2  DBI_0.3.1       tools_3.2.2     dplyr_0.4.3    
[10] curl_0.9.3      Rcpp_0.12.2     stringi_1.0-1   jsonlite_0.9.17 stringr_1.0.0   tidyr_0.3.1 

Although I'm not sure whether it's actually failing: it's only a warning, so I can't tell whether anything was updated. Because it's not an error, traceback() isn't useful.
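
Since the problem surfaces only as a warning, one way to tell whether the call did anything is to capture the warnings alongside the return value. The following is a minimal base R sketch, not rfishbase code: `with_warnings()` is a hypothetical helper, and `fake_load()` is a stand-in that only mimics the behaviour reported above (it warns and returns NULL).

```r
# Hypothetical helper: evaluate an expression and collect any warnings
# it raises, instead of letting them print after the call returns.
with_warnings <- function(expr) {
  msgs <- character(0)
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")  # continue, but remember the message
    }
  )
  list(value = value, warnings = msgs)
}

# Stand-in for the reported behaviour of load_taxa(update = TRUE):
# emits a warning, then returns NULL.
fake_load <- function() {
  warning("Mysql2::Error: Unknown column 'species.GenCode' in 'field list'")
  NULL
}

res <- with_warnings(fake_load())

# A NULL result together with at least one warning suggests nothing was updated.
failed <- is.null(res$value) && length(res$warnings) > 0
```

Alternatively, setting `options(warn = 2)` before the call turns warnings into errors, at which point traceback() becomes informative.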

cboettig commented 8 years ago

Thanks for the bug reports. @sckott I'm guessing this is another issue from the API rewrite; it looks like the taxa endpoint throws an error, e.g. http://fishbase.ropensci.org/sealifebase/taxa

sckott commented 8 years ago

@cboettig right, I'll try to have a look soon

sckott commented 8 years ago

I'm pretty sure the large default limit in the load_taxa() function was causing a 502 error, which most likely means the request was too large. That function hits the /taxa route on the API, which didn't have an enforced max limit value but should have had one, at 5000. That has been fixed as of now.

see new issue #84
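
With a 5000-row cap on /taxa, a client that wants more rows would have to page through the results. The following is a rough sketch of how the pages could be planned, not how rfishbase does it: `page_plan()` is a hypothetical helper, and the `offset` query parameter in the commented-out request is an assumption about the API.

```r
# Hypothetical helper: split a request for `total` rows into pages that
# each stay within the server-side cap (5000, per the comment above).
page_plan <- function(total, max_limit = 5000) {
  stopifnot(total >= 1, max_limit >= 1)
  offsets <- seq(0, total - 1, by = max_limit)
  data.frame(
    offset = offsets,
    limit  = pmin(max_limit, total - offsets)  # last page may be shorter
  )
}

plan <- page_plan(12000)

# Each row of `plan` would then become one request, for example with httr
# (not run here; `offset` is an assumed parameter name):
# httr::GET("http://fishbase.ropensci.org/sealifebase/taxa",
#           query = list(limit = plan$limit[i], offset = plan$offset[i]))
```

Keeping the planning step separate from the HTTP call makes it easy to check the page boundaries without touching the server.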