ropensci / taxize

A taxonomic toolbelt for R
https://docs.ropensci.org/taxize

gbif downstream #869

Open Andreas-Bio opened 3 years ago

Andreas-Bio commented 3 years ago

I am having trouble removing extinct plant families from this list: gbif_downstream(id=7707728, downto="family"). I had to loop through the list using rgbif::name_lookup to get the desired information. I am not sure this is how it is intended to be used. Maybe I am just not familiar with additional curl options I can pass to gbif_downstream. Would it be possible to exclude extinct plant taxa by default, or via an option? I don't think the vast majority of users are interested in fossil phylogeny.

sckott commented 3 years ago

Thanks for the issue. It's a complicated one and will take a bit of time to drill down into. Most likely the answer will be that it's not possible to simply pass a parameter, and that the only option will be to do what you're already doing. But we'll see.

sckott commented 3 years ago

@andzandz11 Looked at this again and there's no way I can see to get extinct status while also getting taxonomic children in one step. We could think about adding a parameter to gbif_downstream to filter by extinct status. However, this would add time for sure. We would have to get all children as we do now, then after each step filter out any extinct taxa, which would require a call to name_lookup each time.
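The "get all children, then filter" idea can be sketched in user code. This is only an illustration, not something taxize provides: `drop_extinct` and `gbif_is_extinct` are hypothetical names, and the sketch assumes that GBIF's speciesProfiles route reports an `extinct` flag for the taxon (it is often missing) and that the input data.frame has a `key` column. It makes one HTTP request per taxon, so it is slow for large lists.

```r
# Hypothetical helper, not part of taxize: drop rows flagged as extinct.
# `is_extinct` is a function(key) returning TRUE, FALSE or NA. It is passed in
# so the filtering logic can be tried without network access.
drop_extinct <- function(taxa, is_extinct) {
  flags <- vapply(taxa$key, is_extinct, logical(1))
  # Keep taxa with unknown status (NA) rather than silently dropping them
  taxa[is.na(flags) | !flags, , drop = FALSE]
}

# One possible `is_extinct`. Assumption: the speciesProfiles route returns
# an `extinct` field for the taxon; if it does not, the status is NA.
gbif_is_extinct <- function(key) {
  url <- sprintf("https://api.gbif.org/v1/species/%s/speciesProfiles", key)
  res <- jsonlite::fromJSON(url)$results
  if (is.null(res) || length(res) == 0 || is.null(res$extinct)) return(NA)
  if (all(is.na(res$extinct))) return(NA)
  any(res$extinct, na.rm = TRUE)
}

# Usage (needs network access and the taxize and jsonlite packages):
# fam <- taxize::gbif_downstream(id = 7707728, downto = "family")
# extant <- drop_extinct(fam, gbif_is_extinct)
```

Taxa with no recorded status are kept here; whether that is the right default depends on how complete the extinct flags are for the group in question.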

Andreas-Bio commented 3 years ago

Thank you for looking into this.

I cannot imagine why this should be the default behaviour of GBIF. Maybe I am not aware of the overall situation, but as far as I know the vast majority of users are not interested in extinct taxa. But maybe it's just my scientific bubble.

It would be easier for the user to have this directly implemented, but on the other hand all the tools needed are already in this package. Maybe for now the best solution would be to add a warning to the ?help page, because I was totally caught by surprise by this.

I will open a ticket in the GBIF portal feedback system and ask for details, but given the number of open issues this could take a month or two.

Edit: On the other hand, I have no idea what the actual API call is.

sckott commented 3 years ago

You can also try opening a discussion here https://discourse.gbif.org/ if you haven't yet.

It would be nice to have the option to filter by extinction status for sure. I agree that extant taxa only seems like a good default setting.

Andreas-Bio commented 3 years ago

Thank you for the suggestion, I will try that. What is the API call you are using for gbif_downstream?

sckott commented 3 years ago

There are two different places where we make an HTTP request to the GBIF API. Note that the function uses a while loop, so although there are only two lines of code where HTTP requests happen, each can happen many, many times.

As an example, take gbif_downstream(id = 198, downto="genus"):

  1. https://github.com/ropensci/taxize/blob/master/R/gbif_downstream.R#L89 here we do https://api.gbif.org/v1/species/198?limit=20
  2. https://github.com/ropensci/taxize/blob/master/R/gbif_downstream.R#L91 here we do https://api.gbif.org/v1/species/198/children?limit=100

The /children route is where we get the taxonomic children - AND where we can't limit to extant taxa
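For anyone who wants to look at the /children responses directly, here is a rough sketch of paging through that route from R. It is not the taxize implementation: `children_url` and `gbif_children` are hypothetical helpers, and the sketch assumes the response carries `results` and `endOfRecords` fields and accepts an `offset` query parameter, as GBIF's paged responses generally do.

```r
# Hypothetical helper: build the /children URL for one page of results.
children_url <- function(id, limit = 100, offset = 0) {
  sprintf("https://api.gbif.org/v1/species/%d/children?limit=%d&offset=%d",
          as.integer(id), as.integer(limit), as.integer(offset))
}

# Page through all children of a taxon and return one element per page.
# `fetch` defaults to jsonlite::fromJSON and can be swapped for a stub.
gbif_children <- function(id, limit = 100, fetch = jsonlite::fromJSON) {
  pages <- list()
  offset <- 0
  repeat {
    page <- fetch(children_url(id, limit, offset))
    pages[[length(pages) + 1]] <- page$results
    # Stop when GBIF says there is nothing more, or the page came back empty
    if (isTRUE(page$endOfRecords) || length(page$results) == 0) break
    offset <- offset + limit
  }
  pages
}

# Usage (needs network access and the jsonlite package):
# pages <- gbif_children(198)
```

Nothing in this request filters by extinction status, so any such filtering has to be done on the results afterwards.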