Closed snubian closed 7 years ago
Very odd. The species Wijkia extenuata is one from your example. In the output of specieslist(), and indeed the output of species_info() or taxinfo_download(), it has no kingdom:
> species_info(guid="e851274b-d043-4cef-ba01-2a7eb5abc80f")$classification$kingdom
NULL
> taxinfo_download("e851274b-d043-4cef-ba01-2a7eb5abc80f")$kingdom
[1] NA
But its page on the ALA web site (http://bie.ala.org.au/species/e851274b-d043-4cef-ba01-2a7eb5abc80f#classification) puts it in the kingdom Plantae. So I presume that it is being included in the returned result set because some part of the server database thinks it's in Plantae, but it comes back with empty kingdom because another part doesn't. I'll check in with the ALA devs.
Thank you once again for a prompt response.
I've been using the ALA's web services off and on for several years and still I have no precise understanding of what is happening under the hood. I've more or less accepted that this will remain one of life's mysteries. At least your package makes it much simpler :)
I'll pass this on to Doug, who is working on the names processing. I have a feeling the kingdom is being inferred from the fact that the source is "AusMoss". The species page calls a separate webservice to build the taxonomy, so the smarts are probably coming from that service. It looks like a bug that the species is not being correctly placed in our taxonomy by the normal species service.
@raymondben can you look up the actual webservice the plugin is calling for the ALA4R::specieslist command, please? I have a feeling that fq=kingdom:Plantae might be deprecated, depending on which service it's hitting... It's worth trying this instead: fq=rk_kingdom:Plantae. The species names fields changed a bit last year; the full list of fields is available at http://biocache.ala.org.au/ws/index/fields. EDIT - looks like it's hitting biocache.ala.org.au, not bie.ala.org.au as I thought. In which case fq looks OK.
The blank kingdom appearing in the species list CSV output is due to the missing kingdom data in the BIE (that Ben noted), combined with the fact that some occurrence records provide kingdom in their original Darwin Core data. Thus fq=kingdom:Plantae returns records, but the subsequent lookup against the BIE for each unique species in the occurrence facet results provides an empty "kingdom" column. This should be fixed with better smarts for populating higher taxa in the BIE, which is an ongoing "improvement" we're working on.
Thanks @nickdos for tracking it down. Looks like we can safely assume that any returned record does satisfy the fq filter (if one has been given)? Until those BIE improvements are done, I don't think it's possible to build a general workaround at the R end, but users can repopulate missing fields themselves if needed. I'll make a note in the function help.
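For readers who want to repopulate the missing field themselves, here is a minimal sketch in base R. It assumes, as suggested above but not confirmed in this thread, that every returned record satisfies the fq filter. The data frame is a toy stand-in for a specieslist() result, and fill_missing() is a hypothetical helper, not part of ALA4R.

```r
# Toy stand-in for a specieslist() result. The column names and the
# placeholder species names are illustrative assumptions, not real ALA output.
x <- data.frame(
  species = c("Wijkia extenuata", "Species A", "Species B"),
  kingdom = c(NA, "", "Plantae"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace NA or empty-string values in `field` with `value`.
fill_missing <- function(df, field, value) {
  missing <- is.na(df[[field]]) | df[[field]] == ""
  df[[field]][missing] <- value
  df
}

# The query was made with fq = "kingdom:Plantae", so (under the assumption
# above) every row can be labelled Plantae.
x <- fill_missing(x, "kingdom", "Plantae")
```

This only makes sense for the field named in the filter; other empty taxonomic ranks cannot be inferred this way.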
@nickdos what does the q parameter actually get matched against with that service? The API docs say "Query of the form field:value e.g. q=genus:Macropus or a free text search e.g. q=Macropus" but I think the free-text part of that is no longer correct. Previously this worked:
http://biocache.ala.org.au/ws/occurrences/facets/download?q=Macropus&facets=taxon_concept_lsid&lookup=true&count=true
but now gives no matches. A "genus:Macropus" style query still works:
http://biocache.ala.org.au/ws/occurrences/facets/download?q=genus%3AMacropus&facets=taxon_concept_lsid&lookup=true&count=true
Has something changed or have I misunderstood the usage?
@raymondben just yesterday we discovered that biocache searches are not working without a field specified - this is a bug that slipped into the last full re-index. We use SOLR and it allows you to set a default field, which is "text", so q=Macropus is effectively q=text:Macropus.
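Until the default field works again, a query can name its field explicitly. A minimal sketch in base R that builds the same download URL as quoted earlier in the thread; facet_url() is a hypothetical helper, and whether the server accepts q=text:Macropus while the bug is present is not confirmed here.

```r
# Base URL copied from the requests quoted in this thread.
base_url <- "http://biocache.ala.org.au/ws/occurrences/facets/download"

# Hypothetical helper: build a facet download URL with an explicit field in q.
facet_url <- function(field, value, facets = "taxon_concept_lsid") {
  # reserved = TRUE percent-encodes the ":" separator as %3A
  q <- utils::URLencode(paste0(field, ":", value), reserved = TRUE)
  paste0(base_url, "?q=", q, "&facets=", facets, "&lookup=true&count=true")
}

genus_url <- facet_url("genus", "Macropus")  # the form reported as still working
text_url <- facet_url("text", "Macropus")    # explicit version of the default field
```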
The original problem here (empty taxonomic fields) is an issue with the underlying ALA service, and is being addressed in https://github.com/AtlasOfLivingAustralia/bie-index/issues/134. Closing this one.
Thanks guys for chasing this up, much appreciated.
Just noticed this when using specieslist(), e.g.:
So having specified fq = "kingdom:Plantae" we have 156 records with empty string for kingdom.
In some ways I can see why it is informative to include these records with missing values, so I'm not sure if this behaviour is by design. But perhaps an option in the style of na.rm could be included?
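As a rough illustration of what such an option might do, here is a sketch in base R. drop_missing() and its na.rm argument are hypothetical and not part of ALA4R, and the data frame is a toy stand-in for real output.

```r
# Hypothetical post-processing step, loosely modelled on na.rm.
# With na.rm = FALSE (the default) the input is returned unchanged.
drop_missing <- function(df, field, na.rm = FALSE) {
  if (!na.rm) {
    return(df)
  }
  # Keep only rows where `field` is neither NA nor an empty string.
  keep <- !is.na(df[[field]]) & df[[field]] != ""
  df[keep, , drop = FALSE]
}

# Toy data: placeholder species names, not real ALA records.
sp <- data.frame(
  species = c("Species A", "Species B", "Species C"),
  kingdom = c("Plantae", "", NA),
  stringsAsFactors = FALSE
)

all_rows <- drop_missing(sp, "kingdom")                  # keeps all rows
complete_rows <- drop_missing(sp, "kingdom", na.rm = TRUE)  # drops empty and NA
```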