Ingesting multiple files concurrently (e.g., files coming from a zip archive) can easily lead to submissions not being associated with the same dataset. This happens when no transaction has reserved a dataset_id yet, so two files from the same dataset each reserve a new dataset_id at the same time. We can remedy this in the following ways...
1) Temporary workaround: disable parallel uploads from:
2) Split metadata ingestion from content ingestion; this would narrow the window in which transactions can run concurrently. Additional brainstorming and ingestion-logic work is required to support multi-file parallel ingestion.
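To illustrate option 2, here is a minimal sketch of what a metadata-first split could look like. All names here (reserve_dataset_id, ingest_content, the in-memory store) are hypothetical stand-ins, not the real ingestion code: the point is that phase 1 reserves the dataset_id in a single serialized step before any content is touched, so phase 2 can run file ingestion in parallel without racing on the id.

```python
import concurrent.futures
import itertools
import threading

# Hypothetical in-memory stand-ins for the real ingestion store.
_ids = itertools.count(1)
_datasets = {}          # dataset_key -> dataset_id
_lock = threading.Lock()

def reserve_dataset_id(dataset_key, metadata):
    """Phase 1: a single, serialized step reserves the dataset_id from the
    submission's metadata before any file content is ingested."""
    with _lock:
        if dataset_key not in _datasets:
            _datasets[dataset_key] = next(_ids)
        return _datasets[dataset_key]

def ingest_content(dataset_id, filename):
    """Phase 2: content ingestion; safe to parallelize because the
    dataset_id is already fixed and merely referenced here."""
    return (dataset_id, filename)

files = ["track1.wav", "track2.wav", "track3.wav"]
ds_id = reserve_dataset_id("zip-upload", metadata={"title": "demo"})
with concurrent.futures.ThreadPoolExecutor() as pool:
    rows = list(pool.map(lambda f: ingest_content(ds_id, f), files))

print(len({row[0] for row in rows}))  # every file landed in one dataset -> 1
```

In a real database-backed version, the same effect could come from making the reservation idempotent (e.g., an upsert against a unique dataset key) rather than an in-process lock.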
This issue came to light when we began to accommodate multi-track submissions in parallel.