Open M-Nicholls opened 2 years ago
Initially reported in https://support.ehelp.edu.au/a/tickets/101299
Ran a quick test. The filter is fq=-basisOfRecord:"FOSSIL_SPECIMEN" AND -(basisOfRecord:"MATERIAL_SAMPLE" AND contentTypes:"EnvironmentalDNA")
We invert it and run a query to get all the excluded records. The query is https://biocache.ala.org.au/occurrence/search?q=lsid:966799e0-b36c-445f-9e46-69a7085ed00d&qualityProfile=ALA&disableAllQualityFilters=true&fq=basisOfRecord:"FOSSIL_SPECIMEN"+(+basisOfRecord:MATERIAL_SAMPLE++contentTypes:EnvironmentalDNA)
which I think is actually incorrect.
I see, the record-type filter fq=-basisOfRecord:"FOSSIL_SPECIMEN" AND -(basisOfRecord:"MATERIAL_SAMPLE" AND contentTypes:"EnvironmentalDNA")
was constructed with the latest dq-service, which allows users to enter arbitrary filters; note the -( AND ) group.
The backend probably has a problem inverting it.
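For reference, a minimal sketch of what the inverse should be logically. The three booleans stand in for the three conditions in the filter, and the function names are made up for illustration; this is not dq-service code. By De Morgan, -A AND -(B AND C) inverts to A OR (B AND C), and the check below confirms every combination lands in exactly one of the two sets.

```python
from itertools import product


def kept(fossil: bool, material: bool, edna: bool) -> bool:
    """Original filter: -FOSSIL_SPECIMEN AND -(MATERIAL_SAMPLE AND EnvironmentalDNA)."""
    return (not fossil) and not (material and edna)


def excluded(fossil: bool, material: bool, edna: bool) -> bool:
    """Expected inverse: FOSSIL_SPECIMEN OR (MATERIAL_SAMPLE AND EnvironmentalDNA)."""
    return fossil or (material and edna)


# Every combination of the three conditions must be in exactly one set.
inverse_is_exact = all(
    kept(*flags) != excluded(*flags)
    for flags in product([False, True], repeat=3)
)
print(inverse_is_exact)  # True
```

So the shape of the inverted filter (one term plus a bracketed group of two required terms) looks right logically; any problem would be in how it is serialised.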
I ran biocache-service locally with an SSH connection to the prod Solr. When I ran http://localhost:8078/occurrences/search?q=lsid:966799e0-b36c-445f-9e46-69a7085ed00d&qualityProfile=ALA&disableAllQualityFilters=true&fq=basisOfRecord:"FOSSIL_SPECIMEN" (+basisOfRecord:"MATERIAL_SAMPLE" +contentTypes:"EnvironmentalDNA")
the result was correct (744 records returned).
The excluded records url generated by ala-hub is http://dev.ala.org.au:8081/ala-hub/occurrence/search?q=lsid:966799e0-b36c-445f-9e46-69a7085ed00d&qualityProfile=ALA&disableAllQualityFilters=true&fq=basisOfRecord:"FOSSIL_SPECIMEN"+(+basisOfRecord:MATERIAL_SAMPLE++contentTypes:EnvironmentalDNA)
which returns 0 records.
Notice there are no "" around MATERIAL_SAMPLE and EnvironmentalDNA.
Adding the "" back changes the URL to http://dev.ala.org.au:8081/ala-hub/occurrence/search?q=lsid:966799e0-b36c-445f-9e46-69a7085ed00d&qualityProfile=ALA&disableAllQualityFilters=true&fq=basisOfRecord:"FOSSIL_SPECIMEN"+(+basisOfRecord:"MATERIAL_SAMPLE"++contentTypes:"EnvironmentalDNA")
which returns 744 records, which is just what we want.
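A rough sketch of the "add the quotes back" step, in case it helps when testing. quote_term_values is a hypothetical helper, not something that exists in ala-hub or dq-service, and the regex only handles plain field:value terms. It would wrongly quote ranges and wildcards, so treat it as an illustration only.

```python
import re

# Matches field:value where the value is not already quoted.
# Assumption: values are simple tokens with no spaces, brackets or quotes.
UNQUOTED_TERM = re.compile(r'(\w+):(?!")([^\s()"]+)')


def quote_term_values(fq: str) -> str:
    """Wrap every unquoted term value in double quotes."""
    return UNQUOTED_TERM.sub(r'\1:"\2"', fq)


# The unquoted inverse filter from the ala-hub URL, decoded.
inverse = (
    'basisOfRecord:"FOSSIL_SPECIMEN" '
    "(+basisOfRecord:MATERIAL_SAMPLE +contentTypes:EnvironmentalDNA)"
)
fixed = quote_term_values(inverse)
print(fixed)
```

Terms that are already quoted are left alone, so running it twice gives the same string.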
If you call https://data-quality-service.ala.org.au/api/v1/quality/getAllInverseCategoryFiltersForProfile?qualityProfileId=92 you can see
"record-type": "basisOfRecord:\"FOSSIL_SPECIMEN\" (+basisOfRecord:MATERIAL_SAMPLE +contentTypes:EnvironmentalDNA)",
The problem occurs when a search includes records that are Material Samples, which are excluded by default. Amanita roseolamellata, for example: https://biocache.ala.org.au/occurrences/search?q=lsid:966799e0-b36c-445f-9e46-69a7085ed00d
If you click on the "744 records excluded" link to show those, that search finds no records: https://biocache.ala.org.au/occurrence/search?q=lsid%3A966799e0-b36c-445f-9e46-69a7085[…]ord%3AMATERIAL_SAMPLE+%2BcontentTypes%3AEnvironmentalDNA%29
It does not seem to affect searches for any other type of excluded record, including cases where records are excluded due to record type but are not of type Material Sample. For example, these 5 Osphranter rufus records are excluded because they are Fossil Specimens: https://biocache.ala.org.au/occurrence/search?q=lsid%3Aurn%3Alsid%3Abiodiversity.org.a[…]ord%3AMATERIAL_SAMPLE+%2BcontentTypes%3AEnvironmentalDNA%29
Other species that the user has identified as having this problem are: Hypholoma fasciculare, Lentinus arcularius, Amanita xanthocephala
There's something odd going on with the bracketing in the "show excluded records" link. For example, https://biocache.ala.org.au/occurrence/search?q=lsid%3A966799e0-b36c-445f-9e46-69a7085[…]basisOfRecord:MATERIAL_SAMPLE+contentTypes:EnvironmentalDNA works fine, but the query used in the interface, https://biocache.ala.org.au/occurrence/search?q=lsid%3A966799e0-b36c-445f-9e46-69a7085[…]ord%3AMATERIAL_SAMPLE+%2BcontentTypes%3AEnvironmentalDNA%29 has brackets around the second set of parameters: +(%2BbasisOfRecord%3AMATERIAL_SAMPLE+%2BcontentTypes%3AEnvironmentalDNA). It's hard to see because of the percent-encoding.
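To make the encoded fragment easier to read, something like this decodes it using Python's standard urllib.parse. Only the fragment quoted above is decoded here, since the full URL is truncated in the ticket.

```python
from urllib.parse import unquote_plus

# Fragment of the interface-generated query string, as quoted above.
encoded = "+(%2BbasisOfRecord%3AMATERIAL_SAMPLE+%2BcontentTypes%3AEnvironmentalDNA)"

# unquote_plus turns "+" into a space and decodes %XX escapes,
# so "%2B" becomes a literal "+" and "%3A" becomes ":".
decoded = unquote_plus(encoded)
print(repr(decoded))
# ' (+basisOfRecord:MATERIAL_SAMPLE +contentTypes:EnvironmentalDNA)'
```

Decoded, it is a bracketed group of two required terms, with no quotes around either value.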