AtlasOfLivingAustralia / biocache-store

Occurrence processing, indexing and batch processing

Remove support for non-replicable data loads when identifying terms are not defined #294

Closed ansell closed 5 years ago

ansell commented 5 years ago

It is currently possible to load a dataset whose collectory entry does not define a list of identifying terms (the composite primary key). Such data loads are not replicable: each run creates new UUIDs, so the records are added to the previous set rather than updating it.

The correct behaviour in this case is to throw an error and not attempt to create a UUID from an ill-formed data resource description, rather than leaving the issue for future data analysts to discover.

https://github.com/AtlasOfLivingAustralia/biocache-store/blob/40a6ddf6fe518238df5a913071edc99a04e5555e/src/main/scala/au/org/ala/biocache/load/DwCALoader.scala#L241

https://github.com/AtlasOfLivingAustralia/biocache-store/blob/40a6ddf6fe518238df5a913071edc99a04e5555e/src/main/scala/au/org/ala/biocache/load/DataLoader.scala#L225
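
A minimal sketch of the fail-fast behaviour proposed above, for illustration only. The names `UniqueTermsCheck`, `MissingUniqueTermsException`, `requireUniqueTerms` and `uniqueKey` are hypothetical and are not taken from biocache-store, and the `|` separator and the rejection of records with a blank identifying value are assumptions rather than something this issue specifies.

```scala
// Hypothetical sketch: none of these names come from biocache-store.
object UniqueTermsCheck {

  /** Raised when a data resource defines no identifying terms. */
  final class MissingUniqueTermsException(dataResourceUid: String)
    extends IllegalArgumentException(
      s"Data resource $dataResourceUid defines no identifying terms; refusing to load")

  /** Returns the trimmed, non-blank identifying terms, or throws if none remain. */
  def requireUniqueTerms(dataResourceUid: String, uniqueTerms: Seq[String]): Seq[String] = {
    val cleaned = uniqueTerms.map(_.trim).filter(_.nonEmpty)
    if (cleaned.isEmpty) throw new MissingUniqueTermsException(dataResourceUid)
    cleaned
  }

  /** Builds a deterministic key from the resource UID and the record's identifying values. */
  def uniqueKey(dataResourceUid: String,
                uniqueTerms: Seq[String],
                record: Map[String, String]): String = {
    val terms = requireUniqueTerms(dataResourceUid, uniqueTerms)
    val values = terms.map { term =>
      // Assumption: a record with a blank identifying value is also rejected.
      record.get(term).map(_.trim).filter(_.nonEmpty).getOrElse(
        throw new IllegalArgumentException(
          s"Record is missing a value for identifying term '$term'"))
    }
    // Assumption: "|" is used as the separator between key parts.
    (dataResourceUid +: values).mkString("|")
  }
}
```

Because the key depends only on the data resource UID and the identifying values, reloading the same record yields the same key, which is what would allow an existing UUID to be looked up and the record updated instead of duplicated. A load with an empty term list fails before any UUID is created.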