metaspace2020 / metaspace

Cloud engine and platform for metabolite annotation for imaging mass spectrometry
https://metaspace2020.eu/
Apache License 2.0

Size limitation for the PutObject operation #1469

Open sergii-mamedov opened 5 months ago

sergii-mamedov commented 5 months ago

After migrating to AWS, an exception is observed for large datasets when saving the files related to the imzML browser. This happens because AWS enforces different size limits than IBM Cloud did.

From the AWS documentation:

Depending on the size of the data that you're uploading, Amazon S3 offers the following options:

Upload an object in a single operation by using the AWS SDKs, REST API, or AWS CLI – With a single PUT operation, you can upload a single object up to 5 GB in size.

Upload an object in parts by using the AWS SDKs, REST API, or AWS CLI – Using the multipart upload API operation, you can upload a single large object, up to 5 TB in size.

This part needs to be rewritten to use multipart upload. We already have a similar implementation for the python-client.
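A minimal sketch of what the multipart path could look like with boto3's low-level API (the bucket, key, and part size below are illustrative, and the real fix would presumably go through the Lithops storage backend rather than a raw boto3 client):

```python
import boto3
from botocore.exceptions import ClientError

MB = 1024 ** 2
PART_SIZE = 64 * MB  # every part except the last must be at least 5 MB

def multipart_upload(s3_client, bucket, key, fileobj, part_size=PART_SIZE):
    """Upload a file-like object of (almost) arbitrary size via S3 multipart upload."""
    mpu = s3_client.create_multipart_upload(Bucket=bucket, Key=key)
    upload_id = mpu['UploadId']
    parts = []
    try:
        part_number = 1  # S3 part numbers start at 1, with a cap of 10,000 parts
        while True:
            chunk = fileobj.read(part_size)
            if not chunk:
                break
            resp = s3_client.upload_part(
                Bucket=bucket, Key=key, UploadId=upload_id,
                PartNumber=part_number, Body=chunk,
            )
            parts.append({'PartNumber': part_number, 'ETag': resp['ETag']})
            part_number += 1
        s3_client.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={'Parts': parts},
        )
    except ClientError:
        # abort so incomplete parts don't keep accruing storage costs
        s3_client.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
```

With 64 MB parts and the 10,000-part cap this covers objects up to 640 GB, well past the 5 GB PutObject limit; the part size can be raised further toward the 5 TB maximum if needed.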

Traceback (most recent call last):
  File "/opt/dev/metaspace/metaspace/engine/sm/engine/daemons/lithops.py", line 91, in _callback
    self._manager.annotate_lithops(
  File "/opt/dev/metaspace/metaspace/engine/sm/engine/daemons/dataset_manager.py", line 116, in annotate_lithops
    ServerAnnotationJob(executor, ds, perf, perform_enrichment=perform_enrichment).run()
  File "/opt/dev/metaspace/metaspace/engine/sm/engine/annotation_lithops/annotation_job.py", line 347, in run
    self.results_dfs, self.png_cobjs, self.enrichment_data = self.pipe.run_pipeline(
  File "/opt/dev/metaspace/metaspace/engine/sm/engine/annotation_lithops/pipeline.py", line 104, in run_pipeline
    self.load_ds(use_cache=use_cache)
  File "/opt/dev/metaspace/metaspace/engine/sm/engine/annotation_lithops/cache.py", line 81, in wrapper
    return f(self, *args, **kwargs)
  File "/opt/dev/metaspace/metaspace/engine/sm/engine/annotation_lithops/pipeline.py", line 139, in load_ds
    ) = load_ds(
  File "/opt/dev/metaspace/metaspace/engine/sm/engine/annotation_lithops/load_ds.py", line 216, in load_ds
    (imzml_reader, ds_segments_bounds, ds_segms_cobjs, ds_segm_lens,) = executor.call(
  File "/opt/dev/metaspace/metaspace/engine/sm/engine/annotation_lithops/executor.py", line 390, in call
    return self.map(
  File "/opt/dev/metaspace/metaspace/engine/sm/engine/annotation_lithops/executor.py", line 295, in map
    raise exc
  File "/opt/dev/metaspace/metaspace/engine/sm/engine/annotation_lithops/executor.py", line 331, in run
    return_vals = executor.get_result(futures)
botocore.exceptions.ClientError: An error occurred (EntityTooLarge) when calling the PutObject operation: Your proposed upload exceeds the maximum allowed size
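For reference, boto3's managed transfer API sidesteps EntityTooLarge automatically by switching to multipart upload above a configurable threshold; a sketch with hypothetical bucket and key names:

```python
import boto3
from boto3.s3.transfer import TransferConfig

MB = 1024 ** 2
# switch to multipart for anything over 64 MB (threshold and chunk size are illustrative)
config = TransferConfig(multipart_threshold=64 * MB, multipart_chunksize=64 * MB)

s3 = boto3.client('s3')
with open('segments.bin', 'rb') as f:
    # upload_fileobj transparently splits the upload into parts above the
    # threshold, so objects larger than the 5 GB PutObject cap succeed
    s3.upload_fileobj(f, 'sm-engine-upload', 'imzml_browser/segments.bin', Config=config)
```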

Related issues:

  1. https://github.com/lithops-cloud/lithops/issues/1239