Open canon-cmre-kym-eden opened 3 months ago
Yep, great idea. I always thought about ingested packages as something uncontrolled that comes from the outside world.
But a combination of local data with ingested information definitely a common usecase, so patch instead of update is a must-have feature at least:
The current behavior with unconditional reset to some specific state of the dataset is also sensible in certain scenarios(in my projects, at least:)), so I'll keep this option.
I'll probably add the following change for now:
action = "package_" + (
"update" if pkg and self.options.get("update_existing") else "create"
)
# will be changed to =>
update_strategy = self.options.get("update_strategy", "update")
action = "package_" + (
update_strategy if pkg and self.options.get("update_existing") else "create"
)
To keep it compatible with existing code. Everything remains unchanged by default, and you can set update_strategy: "patch"
record option to use package/resource patch
For the v2 of the extension, I'll think of a better interface.
I had unexpected loss of all resources from a project while building an ingestion pipeline that could update the dataset metadata. Turns out there is a
patch
variant of package and resource update: https://docs.ckan.org/en/2.9/api/#ckan.logic.action.update.package_updateMight be worth changing the update to be non-destructive, or introduce a multi-state (enum) parameter: https://github.com/DataShades/ckanext-ingest/blob/e6df1ff5df0e672280c9e1b18009aee38f3863d2/ckanext/ingest/record.py#L38-L40
https://github.com/DataShades/ckanext-ingest/blob/e6df1ff5df0e672280c9e1b18009aee38f3863d2/ckanext/ingest/record.py#L96-L100
Thanks