Open lubitchv opened 2 years ago
@lubitchv one proposal I remember was to have Dataverse only allow n-1 concurrent ingests, where n equals the number of cores available on the node. I don't find that in an open issue, though.
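For illustration, the n-1 proposal could be sketched roughly like this. This is a minimal sketch, not Dataverse's actual ingest code: the class name `IngestLimiter` and its methods are hypothetical, and it assumes each ingest can be wrapped in a `Callable`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Hypothetical helper (not part of Dataverse): caps concurrent ingests at n-1,
// where n is the number of cores, so one core stays free for other work.
public class IngestLimiter {
    private final int maxConcurrent;
    private final Semaphore permits;

    public IngestLimiter(int cores) {
        // Always allow at least one ingest, even on a single-core node.
        this.maxConcurrent = Math.max(1, cores - 1);
        this.permits = new Semaphore(maxConcurrent);
    }

    // Convenience factory that sizes the limiter from the cores the JVM sees.
    public static IngestLimiter forThisMachine() {
        return new IngestLimiter(Runtime.getRuntime().availableProcessors());
    }

    public int maxConcurrent() {
        return maxConcurrent;
    }

    // Blocks until a slot is free, runs the job, then releases the slot
    // even if the job throws.
    public <T> T runIngest(Callable<T> job) throws Exception {
        permits.acquire();
        try {
            return job.call();
        } finally {
            permits.release();
        }
    }
}
```

On an 8-core node this would let 7 ingests run at once and make the 8th wait. It only bounds the load; it does not make any single ingest faster.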
Limiting the number of concurrent ingests would resolve the security issue, but it would not solve our problem: uploading a relatively large number of datasets to Dataverse, with ingest, in a reasonable timeframe.
For relatively big SPSS files (150-400 MB), ingest is very slow. It usually takes 1-3 hours, and some files can be stuck in the ingest process for 12 hours. An ingest usually takes 100% of a CPU, so the number of simultaneous ingests can be at most the number of CPUs on the server. We are in the process of migrating from Nesstar and have thousands of datasets with relatively large SPSS files. With ingest this slow, the transition is difficult for us. So it would be useful to optimize the ingest code to speed it up.
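To put rough numbers on this, here is a back-of-envelope sketch of the total wall-clock time. The class `IngestEstimate` is hypothetical, and the inputs in the usage note (1000 files, 2 hours each, 8 slots) are illustrative assumptions, not measurements. It assumes each ingest occupies one slot for the whole run and that files are processed in full batches.

```java
// Hypothetical estimator: how long would a bulk migration take if each ingest
// pins one CPU slot and files are processed in batches of `slots` at a time?
public class IngestEstimate {
    public static double wallClockHours(int files, double hoursPerFile, int slots) {
        if (files < 0 || hoursPerFile < 0 || slots < 1) {
            throw new IllegalArgumentException("files/hours must be >= 0 and slots >= 1");
        }
        // Number of batches, rounded up so a partial last batch still counts.
        long batches = (files + (long) slots - 1) / slots;
        return batches * hoursPerFile;
    }
}
```

Under those assumed inputs, `IngestEstimate.wallClockHours(1000, 2.0, 8)` gives 250 hours, i.e. more than ten days of continuous ingest. Adding concurrency helps only up to the CPU count, which is why faster ingest per file matters more here.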