Closed by charvolant 3 months ago
It has been fine like this for a long time and I do worry such a change will break something in biocache-hubs.
subspecies and subspeciesID are only 2 of the fields that need to be changed.
raw: assertions, rowKey, uuid, firstLoaded, attribution, etc.
processed: attribution, location, modified, etc.
Not to mention that if a raw field is absent the non-raw field is used intentionally in the RAW section.
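The fallback described above can be sketched as follows. This is only an illustration, not the biocache-service code: the raw_ prefix is assumed from the raw_subspecies field name mentioned in the issue, and the function name, the fallback flag, and the sample document are hypothetical.

```python
from typing import Any, Dict, Optional

RAW_PREFIX = "raw_"  # assumed naming, generalised from raw_subspecies in the issue


def raw_value(doc: Dict[str, Any], field: str, fallback: bool = True) -> Optional[Any]:
    """Return the raw value for `field` from a flat index document.

    With fallback=True, a missing raw_<field> falls back to the non-raw
    (processed) field, which is how a derived value can end up under raw.
    """
    raw_key = RAW_PREFIX + field
    if raw_key in doc:
        return doc[raw_key]
    return doc.get(field) if fallback else None


# Hypothetical document: the name was supplied, the subspecies was only derived.
doc = {"raw_scientificName": "Supplied name", "subspecies": "Derived subspecies"}

fields = ("scientificName", "subspecies")
with_fallback = {f: raw_value(doc, f) for f in fields}
without_fallback = {f: raw_value(doc, f, fallback=False) for f in fields}
```

With the fallback on, subspecies in the raw view carries the derived value; with it off, the field is simply absent from raw. Turning the fallback off for only some fields is what makes the change risky for consumers that rely on the current behaviour.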
It is more useful to deprecate this output format and version a format consistent with the download format, i.e. a format capable of listing all fields in a flat structure that can reference index/fields for further information.
Of course we could ignore every other inconsistency and just fix these 2 fields, then do all of this again the next time someone raises an issue about any of the other inconsistencies.
Only moving subspecies and subspeciesID. Pull request: https://github.com/AtlasOfLivingAustralia/biocache-service/pull/864
Nefarious processed subspecies content is no longer appearing in raw: https://biocache-ws-test.ala.org.au/ws/occurrence/da76bbe0-0539-4051-bf08-9080a9f12775
It would be nice to review all of the raw/processed data fields, and we are likely to hit on this at some stage soon. Happy to leave it at this. @nielsklazenga, with the Darwin Core Compliance work, we should add that we want to review what comprises raw/processed values.
See for example https://api.ala.org.au/occurrences/occurrences/da76bbe0-0539-4051-bf08-9080a9f12775
This record has an invalid name match caused by misprocessing and difficulty parsing the supplied name. However, it shows another error where the derived subspecies is inserted into the raw data. This seems to be coming from the service, rather than the SOLR index.
The originally supplied data is
The information in the solr index is
There is no raw_subspecies in the solr document.
The data returned by the API call, with assertions removed for brevity, is
raw.classification.subspecies and raw.classification.subspeciesID contain values not in the original data.
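A quick way to spot this kind of leak is to compare the raw classification returned by the API with the raw_ fields in the index document. The sketch below is a hypothetical check, not part of biocache-service: the function name and the sample payloads are made up, and it assumes that every raw index field is named raw_ plus the field name, as raw_subspecies suggests.

```python
from typing import Any, Dict, List


def unexpected_raw_fields(
    api_raw: Dict[str, Any], solr_doc: Dict[str, Any], fields: List[str]
) -> List[str]:
    """List fields with a value in the API's raw section but no raw_<field> in the index document."""
    return [
        f
        for f in fields
        if api_raw.get(f) not in (None, "") and ("raw_" + f) not in solr_doc
    ]


# Hypothetical stand-ins for the two payloads; the values are invented.
api_raw_classification = {
    "scientificName": "Supplied name",
    "subspecies": "Derived subspecies",
    "subspeciesID": "derived-id",
}
solr_doc = {
    "raw_scientificName": "Supplied name",
    "subspecies": "Derived subspecies",
    "subspeciesID": "derived-id",
}

suspect = unexpected_raw_fields(
    api_raw_classification,
    solr_doc,
    ["scientificName", "subspecies", "subspeciesID"],
)
print(suspect)  # ['subspecies', 'subspeciesID']
```

Any field reported here has a raw value in the response that cannot have come from a raw field in the index, which is the symptom described in this report.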