Open drmalex07 opened 9 years ago
This behaviour is observed in an instance running
Consider the case when a new resource is uploaded. I think that when archiver's `download` task updates the metadata for the given resource (https://github.com/ckan/ckanext-archiver/blob/master/ckanext/archiver/tasks.py#L451), it causes a new `IDomainObjectModification` event to be fired. Thus, datastorer is notified again (because of this else clause: https://github.com/ckan/ckanext-datastorer/blob/master/ckanext/datastorer/plugin.py#L34) and a new task is sent to the queue.
I suppose that since the arrival time of the second event is random (and of course the queue can run many parallel workers), this can lead to undesirable races if two parallel tasks are sending groups of records to the datastore table (?).
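To make the suspected sequence concrete, here is a minimal, self-contained sketch of the double notification. It does not use the real CKAN, archiver, or datastorer code: every name in it (`notify`, `update_resource`, `archiver_download`, `upload_resource`, the `size` field) is a hypothetical stand-in, and it only models the assumption that any resource update fires a modification event.

```python
from collections import deque

# Hypothetical stand-ins only; none of this is the real CKAN plugin API.
queue = deque()   # models the task queue
enqueued = []     # log of every task that was sent to the queue


def notify(resource, operation):
    """Models datastorer's notify hook: enqueue a task for new AND changed resources."""
    task = ("datastorer_upload", resource["id"], operation)
    queue.append(task)
    enqueued.append(task)


def update_resource(resource, **changes):
    """Models a resource update: assumed to always fire a modification event."""
    resource.update(changes)
    notify(resource, "changed")


def archiver_download(resource):
    """Models archiver's download task writing metadata back to the resource."""
    update_resource(resource, size=1024)  # illustrative metadata field


def upload_resource(resource_id):
    resource = {"id": resource_id}
    notify(resource, "new")      # first event: the upload itself
    archiver_download(resource)  # second event: archiver's metadata update
    return resource


upload_resource("res-1")
print(enqueued)
```

Under these assumptions a single upload leaves two datastorer tasks for the same resource in the queue, one for the "new" event and one for the "changed" event, which is the situation where two parallel workers could write to the datastore table at the same time.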