AtlasOfLivingAustralia / biocache-service

Occurrence & mapping webservices
https://biocache-ws.ala.org.au/ws/

Investigate how `species_list_uid:dr1234` handles unmatched names in a list #721

Closed nickdos closed 2 years ago

nickdos commented 2 years ago

Species lists that are marked as "authoritative" get indexed in SOLR so that the query syntax q=species_list_uid:dr1234 returns records for species in the list at https://lists.ala.org.au/speciesListItem/list/dr1234 (made-up list).

I suspect that under the hood biocache is using the matched name to link records to the list. Many lists have unmatched taxa in them, so I'm wondering whether the code is smart enough to index against the raw_taxon_name for those unmatched names, or whether records are only found for the matched names (with unmatched names ignored).

Ideally, unmatched names will be checked against the raw_taxon_name.
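
To make the desired behaviour concrete, here is a rough sketch of the linking rule I have in mind. It is not the pipeline code: the `ListItem` and `Record` classes and their field names (`matched_name`, `raw_name`, `taxon_name`) are made up for illustration, and the case-insensitive comparison on raw names is just one possible choice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ListItem:
    """One row of a species list (hypothetical structure)."""
    raw_name: str                 # name as supplied in the list
    matched_name: Optional[str]   # None when name matching failed


@dataclass
class Record:
    """One occurrence record (hypothetical structure)."""
    raw_taxon_name: str           # name as supplied with the record
    taxon_name: Optional[str]     # matched name, if any


def records_for_list(items: List[ListItem], records: List[Record]) -> List[Record]:
    """Link records to a list by matched name, falling back to the raw
    name for list items that could not be matched."""
    matched = {i.matched_name for i in items if i.matched_name}
    # Only unmatched list items contribute to the raw-name fallback.
    unmatched_raw = {i.raw_name.strip().lower() for i in items if not i.matched_name}
    return [
        r for r in records
        if r.taxon_name in matched
        or r.raw_taxon_name.strip().lower() in unmatched_raw
    ]
```

With this rule, a record supplied as an unmatched list name would still be returned even if the record itself only matched to a higher rank such as the genus.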

nickdos commented 2 years ago

https://github.com/gbif/pipelines/blob/dev/livingatlas/pipelines/src/main/java/au/org/ala/pipelines/beam/SpeciesListPipeline.java

nickdos commented 2 years ago

Dave thinks unmatched taxa are ignored.

nickdos commented 2 years ago

https://lists.ala.org.au/speciesListItem/list/dr914 contains a single unmatched taxon: Helvella chinensis.

This name is found via a raw_name search in biocache:

https://biocache.ala.org.au/occurrences/search?q=raw_name%3A%22Helvella+chinensis%22

This returns 19 records, all of which are matched only to the genus Helvella.

Clicking on the "View records" button for this list:

https://biocache.ala.org.au/occurrences/search?q=species_list_uid:dr914

does not appear to include any records for either Helvella as a matched name or Helvella chinensis as a "Scientific name (unmatched)".

I think this proves that unmatched names are not handled.
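
For anyone wanting to repeat the comparison against the web service rather than the UI, something like the following might work. Treat it as an untested sketch with assumptions: that `occurrences/search` under https://biocache-ws.ala.org.au/ws/ accepts the same `q` values as the UI plus a `pageSize` parameter, and that the JSON response carries a `totalRecords` field. The helper names are my own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the biocache web services.
WS_BASE = "https://biocache-ws.ala.org.au/ws/"


def search_url(query: str, page_size: int = 0) -> str:
    """Build an occurrence search URL; pageSize=0 asks for counts only
    (assumed parameter name)."""
    params = urlencode({"q": query, "pageSize": page_size})
    return WS_BASE + "occurrences/search?" + params


def total_records(query: str) -> int:
    """Fetch the record count for a query (assumes a 'totalRecords'
    field in the JSON response)."""
    with urlopen(search_url(query)) as resp:
        return json.load(resp)["totalRecords"]


# Example usage (needs network access):
#   total_records('raw_name:"Helvella chinensis"')
#   total_records("species_list_uid:dr914")
```

If unmatched names were handled, the second count should be at least as large as the first for a list like dr914 that contains only that one name.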