nodestream-proj / nodestream

A Declarative framework for Building, Maintaining, and Analyzing Graph Data
https://nodestream-proj.github.io/docs/
Apache License 2.0

Use Multi-Part Copy when Archiving S3 Objects from Processing #245

Closed: angelosantos4 closed this 9 months ago

angelosantos4 commented 9 months ago

Some pipelines have been failing because of the following error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/nodestream/pipeline/extractors/stores/aws/s3_extractor.py", line 90, in extract_records
    self.archive_s3_object(key)
  File "/usr/local/lib/python3.11/site-packages/nodestream/pipeline/extractors/stores/aws/s3_extractor.py", line 51, in archive_s3_object
    self.s3_client.copy_object(
  File "/usr/local/lib/python3.11/site-packages/botocore/client.py", line 553, in _api_call
    return self._make_api_call(operation_name, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/botocore/client.py", line 1009, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidRequest) when calling the CopyObject operation: The specified copy source is larger than the maximum allowable size for a copy source: 5368709120

Googled the error and this may be a fix: https://stackoverflow.com/questions/52879356/boto3-copy-object-failing-on-size-5gb
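
For context, 5368709120 bytes is 5 GiB, the largest source a single CopyObject request accepts; anything bigger has to be copied in parts. A minimal sketch of that idea, assuming `s3_client` is a boto3 S3 client: boto3's managed `copy()` method performs a multipart copy for large objects, whereas `copy_object()` issues a single request. The function signature and the `archive_prefix` parameter are hypothetical and are not taken from the nodestream source.

```python
from typing import Any

# 5 GiB: the single-request CopyObject limit quoted in the error above.
COPY_OBJECT_MAX_BYTES = 5 * 1024**3


def archive_s3_object(
    s3_client: Any, bucket: str, key: str, archive_prefix: str = "archive"
) -> str:
    """Copy `key` under `archive_prefix`, then delete the original.

    `s3_client` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    The signature and `archive_prefix` are illustrative, not nodestream's.
    """
    archive_key = f"{archive_prefix.rstrip('/')}/{key}"
    # Client.copy() is boto3's managed transfer: it splits large objects
    # into a multipart copy, unlike the single-request copy_object().
    s3_client.copy(
        CopySource={"Bucket": bucket, "Key": key},
        Bucket=bucket,
        Key=archive_key,
    )
    # Remove the source only after the copy call has returned without raising.
    s3_client.delete_object(Bucket=bucket, Key=key)
    return archive_key
```

Because `copy()` raises on failure, the delete only runs once the archive copy has completed, so a failed copy leaves the original object in place.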

codecov[bot] commented 9 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (7042ce2) 95.90% compared to head (0c45043) 95.90%.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main     #245   +/-   ##
=======================================
  Coverage   95.90%   95.90%
=======================================
  Files         132      132
  Lines        3930     3930
=======================================
  Hits         3769     3769
  Misses        161      161
```

| [Flag](https://app.codecov.io/gh/nodestream-proj/nodestream/pull/245/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=nodestream-proj) | Coverage Δ | |
|---|---|---|
| [3.10-macos-latest](https://app.codecov.io/gh/nodestream-proj/nodestream/pull/245/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=nodestream-proj) | `95.90% <100.00%> (ø)` | |
| [3.10-ubuntu-latest](https://app.codecov.io/gh/nodestream-proj/nodestream/pull/245/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=nodestream-proj) | `95.90% <100.00%> (ø)` | |
| [3.10-windows-latest](https://app.codecov.io/gh/nodestream-proj/nodestream/pull/245/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=nodestream-proj) | `95.80% <100.00%> (ø)` | |
| [3.11-macos-latest](https://app.codecov.io/gh/nodestream-proj/nodestream/pull/245/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=nodestream-proj) | `95.90% <100.00%> (ø)` | |
| [3.11-ubuntu-latest](https://app.codecov.io/gh/nodestream-proj/nodestream/pull/245/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=nodestream-proj) | `95.90% <100.00%> (ø)` | |
| [3.11-windows-latest](https://app.codecov.io/gh/nodestream-proj/nodestream/pull/245/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=nodestream-proj) | `95.80% <100.00%> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=nodestream-proj#carryforward-flags-in-the-pull-request-comment) to find out more.
