AtlasOfLivingAustralia / la-pipelines

Living Atlas Pipelines extensions

`webportal/legend` values not consistent with `facets` output #462

Closed nickdos closed 3 years ago

nickdos commented 3 years ago

Reported by a user https://support.ehelp.edu.au/a/tickets/110958 (see #460 for screenshots).

Over the past week or so (maybe since the upgrade?), if one goes to the Spatial Analyst and selects:

  • Add to Map > Species
  • Hakea
  • Facet > Scientific Name

A lot of species (especially the most common ones, like H. sericea and H. decurrens) get dumped into "Other scientificName". This was not the case a couple of weeks ago.

Looking at the output of the legend endpoint https://biocache-ws.ala.org.au/ws/webportal/legend?cm=scientificName&q=lsid%3Ahttps%3A%2F%2Fid.biodiversity.org.au%2Ftaxon%2Fapni%2F51300181&type=application/json

The first entry is Hakea arborescens with 3724 results. The last entry is Other scientificName with 81745 results.

{
  "name":"Other taxon_name",
  "i18nCode":"taxon_name.other",
  "facetValue":"",
  "count":81745,
  "colour":0,
  "fq":"-(-taxon_name:\"Embothrium coccineum\" AND -taxon_name:\"Hakea\" AND -taxon_name:\"Hakea 'Burrendong Beauty'\" AND -taxon_name:\"Hakea 'Kincora'\" AND -taxon_name:\"Hakea actites\" AND -taxon_name:\"Hakea aculeata\" AND -taxon_name:\"Hakea acuminata\" AND -taxon_name:\"Hakea adnata\" AND -taxon_name:\"Hakea aenigma\" AND -taxon_name:\"Hakea ambigua\" AND -taxon_name:\"Hakea amplexicaulis\" AND -taxon_name:\"Hakea anadenia\" AND -taxon_name:\"Hakea arborescens\" AND -taxon_name:\"Hakea archaeoides\" AND -taxon_name:\"Hakea asperma\" AND -taxon_name:\"Hakea auriculata\" AND -taxon_name:\"Hakea bakerana\" AND -taxon_name:\"Hakea bakeriana\" AND -taxon_name:\"Hakea baxteri\" AND -taxon_name:\"Hakea benthamii\" AND -taxon_name:\"Hakea bicornata\" AND -taxon_name:\"Hakea brachyptera\" AND -taxon_name:\"Hakea brownii\" AND -taxon_name:\"Hakea bucculenta\" AND -taxon_name:\"Hakea candolleana\" AND -taxon_name:\"Hakea carinata\" AND -taxon_name:\"Hakea ceratophylla\" AND -taxon_name:\"Hakea chordophylla\" AND -taxon_name:\"Hakea chromatropa\" AND -taxon_name:\"Hakea cinerea\")",
  "red":116,
  "blue":17,
  "green":52,
  "remainder":true
}
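To see at a glance which names the legend did treat as named entries, the negated clauses in the remainder `fq` can be pulled apart. This is only a rough sketch: the helper name `excluded_values` is made up for illustration, and it assumes the `fq` has already been JSON-decoded (so `\"` has become `"`) and that every clause has the `-field:"value"` shape seen above.

```python
import re

# One negated clause, e.g. -taxon_name:"Hakea actites"; captures field and quoted value.
NEGATED_CLAUSE = re.compile(r'-(\w+):"([^"]*)"')


def excluded_values(fq: str) -> list[str]:
    """Return the quoted values of every negated clause in a remainder fq (hypothetical helper)."""
    return [value for _field, value in NEGATED_CLAUSE.findall(fq)]


# Shortened copy of the fq above, after JSON decoding.
sample_fq = (
    '-(-taxon_name:"Embothrium coccineum" AND -taxon_name:"Hakea" '
    'AND -taxon_name:"Hakea \'Kincora\'")'
)
names = excluded_values(sample_fq)
print(len(names), names)
```

Running the same helper over the full `fq` gives the list of names the legend kept out of the "Other" bucket, which can then be checked against the facet output.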

Compare this with a facets search: https://biocache-ws.ala.org.au/ws/occurrence/facets?q=lsid%3Ahttps%3A%2F%2Fid.biodiversity.org.au%2Ftaxon%2Fapni%2F51300181&facets=taxon_name

Hakea arborescens is the 5th entry (same count). The first 4 names I would expect to be present in the legend output but are missing.
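The check described here could be scripted roughly as follows. It is a sketch under assumptions, not how biocache-service does it: the legend entries use the `name`, `count` and `remainder` keys shown in the JSON above, but the `label` and `count` keys for facet results are an assumption about the facets response, and `missing_from_legend` is a hypothetical helper. All sample counts other than 3724 and 81745 are placeholders, not real data.

```python
def missing_from_legend(legend, facet_results, top_n=4):
    """Return the top_n facet labels (by count) that have no named legend entry."""
    # Ignore the "Other ..." remainder bucket when collecting legend names.
    legend_names = {item["name"] for item in legend if not item.get("remainder")}
    top = sorted(facet_results, key=lambda r: r["count"], reverse=True)[:top_n]
    return [r["label"] for r in top if r["label"] not in legend_names]


# Placeholder data shaped like the two responses (facet key names are assumed).
legend = [
    {"name": "Hakea arborescens", "count": 3724},
    {"name": "Other taxon_name", "count": 81745, "remainder": True},
]
facets = [
    {"label": "Hakea sericea", "count": 9000},     # placeholder count
    {"label": "Hakea decurrens", "count": 8000},   # placeholder count
    {"label": "Hakea arborescens", "count": 3724},
]
print(missing_from_legend(legend, facets, top_n=3))
```

An empty result would mean every top facet value also appears as a named legend entry, which is the consistent behaviour expected here.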

adam-collins commented 3 years ago

https://github.com/AtlasOfLivingAustralia/biocache-service/pull/662

RobinaSanderson commented 3 years ago

I'm not sure if I've tested this correctly but I'm getting different results between the search results page and the map view with the scientific name facet selected.

Steps:

  1. Enter Hakea as the search term
  2. open the Taxon group of facets
  3. Show more for the scientific name facet

image

Note the numbers found and that there are more Hakea arborescens than Hakea teretifolia

  1. Close the scientific name facet pop up
  2. Click on the map option (numbers stay the same)
  3. Click on the "View in spatial portal" option
  4. Select the facet Scientific name [image]. The numbers are higher. It looks like the quality profile being applied is ALA.

There are now more Hakea teretifolia occurrences being found than Hakea arborescens.

javier-molina commented 3 years ago

@adam-collins this is probably more related to the test environment setup. During the stand-up, many of us were surprised to see an instance of spatial hub sitting on the biocache-dq server. We are assuming this is now outdated. Could you please point to a more up-to-date spatial hub/service so we can confirm Robina's findings are no longer an issue?

adam-collins commented 3 years ago

Changed config of biocache-dq-test to open spatial-test. Changed config of spatial-test to use nci3-biocache-service. Still incorrect.

Remaining counts issue is in spatial-hub https://github.com/AtlasOfLivingAustralia/spatial-hub/issues/411

nickdos commented 3 years ago

@javier-molina - I don't think this is blocking the release of biocache-service or hubs, so should it be moved to To do/on hold column?

javier-molina commented 3 years ago

Legend and facet outputs are now consistent

Screen Shot 2021-07-20 at 12 06 35 pm

Other scientific name count for legend call has substantially dropped as a result

Screen Shot 2021-07-20 at 12 07 31 pm