AtlasOfLivingAustralia / la-pipelines

Living Atlas Pipelines extensions

Species list indexing of authoritative lists only showing a single taxon in results #523

Closed nickdos closed 2 years ago

nickdos commented 2 years ago

Some authoritative species lists, which are indexed via Pipelines, are not showing the expected results in biocache. For example, the GRIIS list (dr9884) has 3549 distinct taxa, but the SOLR search on that data resource (DR) returns records for only a single taxon:

https://biocache.ala.org.au/occurrences/search?q=species_list_uid:dr9884

It returns only 101 records, all for a single taxon, Acentrogobius pflaumii.

Investigate why it is not returning records for all 3549 matched taxa.

nickdos commented 2 years ago

CC: @peggynewman @djtfmartin @adam-collins

peggynewman commented 2 years ago

Another species on this list that should return records for that query: https://biocache.ala.org.au/occurrences/search?q=lsid:https://id.biodiversity.org.au/node/apni/2904941#tab_recordsView

Example record: https://biocache.ala.org.au/occurrences/e347f854-c42f-43b6-97c0-30e6edf7beb9

djtfmartin commented 2 years ago

The fix for this is on the pipelines side. The issue is line breaks in the CSVs that are downloaded from the species lists tool.

PR for review here: https://github.com/gbif/pipelines/pull/691
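
To illustrate why line breaks in a CSV can cause rows to be lost or misread, here is a minimal sketch in Python. It is not the change made in the PR above. The sample data, the `urn:example:*` identifiers, the column names, and the assumption that the line break sits inside a quoted field are all hypothetical, since the thread does not show the actual export. It compares a naive line-by-line split with a quote-aware CSV reader.

```python
import csv
import io

# Hypothetical export (not real species list data): the second data row
# has a line break inside a quoted "notes" field.
SAMPLE_CSV = (
    "guid,scientificName,notes\n"
    "urn:example:1,Acentrogobius pflaumii,first entry\n"
    'urn:example:2,Example species,"a note that\ncontinues on a second line"\n'
    "urn:example:3,Another species,last entry\n"
)


def parse_naive(text):
    """Split on newlines, then on commas. An embedded line break splits one record in two."""
    lines = text.splitlines()
    return [line.split(",") for line in lines[1:] if line]


def parse_quoted(text):
    """Use a quote-aware CSV reader, which keeps a quoted multi-line field in one record."""
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader)  # skip the header row
    return [row for row in reader if row]


naive_rows = parse_naive(SAMPLE_CSV)
quoted_rows = parse_quoted(SAMPLE_CSV)

# Rows whose field count does not match the three-column header.
malformed = [row for row in naive_rows if len(row) != 3]

print("naive rows:", len(naive_rows), "malformed:", len(malformed))
print("quote-aware rows:", len(quoted_rows))
```

In this sketch the naive parser yields an extra, malformed row for the continuation line, so any downstream step that maps columns to taxon identifiers would read the wrong values for it. The quote-aware reader returns one record per taxon.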

djtfmartin commented 2 years ago

Fixed with https://github.com/gbif/pipelines/pull/691