ropensci / rfishbase

R interface to the fishbase.org database
https://docs.ropensci.org/rfishbase
Increasing data accessibility #133

Closed: dbarneche closed this issue 6 years ago

dbarneche commented 6 years ago

Hello all,

I was wondering if the existing package API offers access to all information contained within FishBase.

For example, I couldn't access the Relative Gill Area Studies, nor the trophic level (TL) and aspect ratio (AR) data. I couldn't find that information via httr::content(heartbeat()) either. These are important traits that many people would benefit from having easy access to.

The latter two (TL and AR) are found on the species-specific pages: TL at the bottom of the page, and AR under the morphometrics tab. Examples for Abramis brama can be found here for TL and here for AR.

I guess my question is: Is this something that FishBase will have to make available from their end? Or is there a way to enhance the capacity of the existing API from rfishbase's end?

We're starting the #ozunconf tomorrow here in Melbourne, and I'd be keen to fork the repo and expand its current capabilities (as per my questions above) if possible and if that's of general interest to others. Just need some pointers.

Cheers

sckott commented 6 years ago

👋 @dbarneche sorry about the delay. We do not expose all tables in the database dump we get from FishBase. I'm having a look at those variables.

sckott commented 6 years ago

if you do

curl 'https://fishbase.ropensci.org/listfields/' | jq '.data[] | select(.column_name == "trophic_level")'

you can see

{
  "table_name": "matrix",
  "column_name": "trophic_level"
}

Then you can do

curl 'https://fishbase.ropensci.org/matrix?SpecCode=268' | jq '.data[].trophic_level'

Found aspect ratio with

curl 'https://fishbase.ropensci.org/morphmet?SpecCode=268' | jq '.data[].AspectRatio'
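For anyone scripting this outside R, the two steps above (find the table that holds a column, then query that table's route by SpecCode) can be sketched roughly as follows. This is only an illustration: the helper names `tables_for_column` and `route_url` are hypothetical, and the payload is a hand-written sample shaped like the listfields output in this thread, not a live API reply. The sketch builds the URLs but does not send any requests.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://fishbase.ropensci.org"  # API host used in the curl calls above

# Hand-written sample shaped like the /listfields/ response in this thread
# (an assumption for illustration, not a live API reply).
SAMPLE_LISTFIELDS = json.loads("""
{"data": [
  {"table_name": "matrix",   "column_name": "trophic_level"},
  {"table_name": "morphmet", "column_name": "AspectRatio"},
  {"table_name": "swimming", "column_name": "AspectRatio"}
]}
""")


def tables_for_column(payload, column_name):
    """Return the tables that contain `column_name`, in payload order."""
    return [
        row["table_name"]
        for row in payload["data"]
        if row["column_name"] == column_name
    ]


def route_url(table, spec_code):
    """Build the per-table query URL for one species code (hypothetical helper)."""
    return f"{BASE_URL}/{table}?{urlencode({'SpecCode': spec_code})}"


# Print one candidate URL per (column, table) pair for SpecCode 268.
for column in ("trophic_level", "AspectRatio"):
    for table in tables_for_column(SAMPLE_LISTFIELDS, column):
        print(column, "->", route_url(table, 268))
```

Note that a column such as AspectRatio can live in more than one table, so the lookup returns a list rather than a single table name.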

cboettig commented 6 years ago

@sckott love the examples with jq, but this suggests maybe we should improve how the list_fields R function returns this data, to make it easier to discover?

sckott commented 6 years ago

using the package

list_fields(fields="trophic_level", implemented_only=FALSE)
#> # A tibble: 1 x 2
#>   table_name   column_name
#>        <chr>         <chr>
#> 1     matrix trophic_level

list_fields(fields="AspectRatio", implemented_only=FALSE)
#> # A tibble: 2 x 2
#>   table_name column_name
#> *      <chr>       <chr>
#> 1   morphmet AspectRatio
#> 2   swimming AspectRatio

should we improve on that?
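One possible direction, sketched here only to make the question concrete: a case-insensitive substring search over the field listing, grouped by column, so that a user who types "aspect" still finds AspectRatio and sees every table that holds it. The function name `find_fields` is hypothetical, the rows simply mirror the list_fields() output above, and nothing here reflects how rfishbase is actually implemented.

```python
from collections import defaultdict

# Rows mirroring the list_fields() output shown above (sample data only).
FIELDS = [
    ("matrix", "trophic_level"),
    ("morphmet", "AspectRatio"),
    ("swimming", "AspectRatio"),
]


def find_fields(rows, pattern):
    """Map each column whose name contains `pattern` (ignoring case)
    to the list of tables that hold it."""
    needle = pattern.lower()
    matches = defaultdict(list)
    for table, column in rows:
        if needle in column.lower():
            matches[column].append(table)
    return dict(matches)


print(find_fields(FIELDS, "aspect"))
print(find_fields(FIELDS, "troph"))
```

Grouping by column rather than returning flat rows makes it obvious when the same trait is stored in several tables, which is the case for AspectRatio here.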

dbarneche commented 6 years ago

Hi @sckott, thanks very much for this. Is there a way, though, to extract a full table with either TL or aspect ratio for all species at once? As with the oxygen table via rfishbase::oxygen()?

sckott commented 6 years ago

@dbarneche I think what you are after is the fields parameter, e.g., rfishbase::swimming(fields="AspectRatio", limit = 500)

Note that the matrix route on the API is not exposed as a function in rfishbase; we may consider adding it.

dbarneche commented 6 years ago

Hi @sckott, thanks very much for your reply.

I guess that's almost there. The only catch, though, is that for some species there are multiple measurements of aspect ratio (e.g. for Abramis brama here), and this is not reflected in rfishbase::swimming(species = 'Abramis brama', fields="AspectRatio", limit = 500) output. In fact, it returns no matches, which is odd?

sckott commented 6 years ago

It may be a database version issue. We're hoping to get an updated database version very soon, hopefully early next week.

dbarneche commented 6 years ago

Awesome, thanks @sckott

dbarneche commented 6 years ago

Hi @sckott, is the new database version already available through the package? I tried the commands above, but still got the same results.

sckott commented 6 years ago

@dbarneche yes, we are serving the most up-to-date data by default now. The most recent data is from March 2017. You can see the database versions at https://fishbase.ropensci.org/versions/. We are waiting on a new database dump from FishBase that we hope to get soon. We'll add the ability to toggle between DB versions to this package soon.

What about rfishbase::morphometrics(species = 'Abramis brama', limit = 500)?

dbarneche commented 6 years ago

Perfect! That works! Thanks very much!