AtlasOfLivingAustralia / biocache-service

Occurrence & mapping webservices
https://biocache-ws.ala.org.au/ws/

Occurrence record - RecordedbyId - not displaying as provided in biocache or biocache-test #898

Closed adam-collins closed 2 months ago

adam-collins commented 2 months ago

recordedByID is not being processed correctly. The input data and the DWCA from preingestion have been double-checked, and there is no further way for us to investigate.

Dataset: turtleSat
Test: dr22433 - https://biocache-test.ala.org.au/occurrences/search?q=data_resource_uid%3Adr22433
Prod: dr26141 - https://biocache.ala.org.au/occurrences/search?q=data_resource_uid%3Adr26141

Example record catalogNumber: TU-SI-14751. recordedByID is provided as an integer/character string, for example: 22075

Test: https://biocache-test.ala.org.au/occurrences/3b6e6e75-3911-427a-9e27-46f8a43a8f0a
Prod: https://biocache.ala.org.au/occurrences/71834157-b9cf-4fac-b192-0d89af358247

After ingestion and loading it is being displayed as:
Test: 25075.0
Prod: 25075|25075

There is also an issue when viewing Original vs Processed values:
Test: Original: 22075.0, Processed: 22075.0
Prod: Original: 22075|22075, Processed: null

I have checked the DWCA created by preingestion in S3, and the occurrence CSV shows the recordedByID field in the correct format. I checked it in both Excel and a text editor to ensure that the source data values are correct.
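For anyone repeating this kind of check, the manual inspection could be scripted. The following is only a minimal sketch, not part of biocache-service or the preingestion tooling: the function names, the `recordedByID` and `catalogNumber` column headers, and the sample rows other than TU-SI-14751 are assumptions for illustration. It flags values that look float-coerced (such as 22075.0) or pipe-duplicated (such as 22075|22075).

```python
import csv
import io
import re

# Matches an integer that has picked up a trailing ".0", e.g. "22075.0".
FLOAT_COERCED = re.compile(r"^\d+\.0+$")


def classify_recorded_by_id(value):
    """Return 'ok', 'empty', 'float-coerced' or 'duplicated' for one value."""
    value = value.strip()
    if not value:
        return "empty"
    parts = value.split("|")
    # The same value repeated with a pipe separator, e.g. "22075|22075".
    if len(parts) > 1 and len(set(parts)) == 1:
        return "duplicated"
    if FLOAT_COERCED.match(value):
        return "float-coerced"
    return "ok"


def find_suspect_rows(csv_text, column="recordedByID", key="catalogNumber"):
    """List (key, value, status) for rows whose identifier looks mangled.

    The column names are assumptions; adjust them to the actual CSV header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    suspects = []
    for row in reader:
        status = classify_recorded_by_id(row.get(column) or "")
        if status not in ("ok", "empty"):
            suspects.append((row.get(key), row.get(column), status))
    return suspects


if __name__ == "__main__":
    # Hypothetical sample; only the first row mirrors the example record.
    sample = (
        "catalogNumber,recordedByID\n"
        "TU-SI-14751,22075\n"
        "EXAMPLE-1,22075.0\n"
        "EXAMPLE-2,22075|22075\n"
    )
    for suspect in find_suspect_rows(sample):
        print(suspect)
```

Running the same check against both the source CSV and the occurrence file inside the DWCA would show at which step the value changes. A pipe-separated list of distinct identifiers is treated as valid here, since only an exact repeat is flagged.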

See screenshots below.

Input data

Image

Prod Occurrence records

Image Image

Test Occurrence records

Image Image

adam-collins commented 2 months ago

This was identified as a pre-pipelines CSV to DWCA conversion issue.