fox-it / flow.record

Recordization library
GNU Affero General Public License v3.0

Add Google Cloud Storage Adapter #84

Open MaxGroot opened 11 months ago

MaxGroot commented 11 months ago

Depends on https://github.com/fox-it/flow.record/pull/83.

This pull request adds a Google Cloud Storage adapter for flow.record. It supports both reading from and writing to Google Cloud Storage in a streaming manner.

codecov[bot] commented 11 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (676d61c) 80.06% compared to head (c3747b6) 80.38%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main      #84      +/-   ##
==========================================
+ Coverage   80.06%   80.38%   +0.32%
==========================================
  Files          33       34       +1
  Lines        3110     3161      +51
==========================================
+ Hits         2490     2541      +51
  Misses        620      620
```

| [Flag](https://app.codecov.io/gh/fox-it/flow.record/pull/84/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=fox-it) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/fox-it/flow.record/pull/84/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=fox-it) | `80.38% <100.00%> (+0.32%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=fox-it#carryforward-flags-in-the-pull-request-comment) to find out more.


codecov-commenter commented 5 months ago

Codecov Report

Attention: Patch coverage is 96.34146%, with 3 lines in your changes missing coverage. Please review.

Project coverage is 81.94%. Comparing base (4a47670) to head (873ba91).

| Files | Patch % | Lines |
|---|---|---|
| flow/record/base.py | 89.65% | 3 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main      #84      +/-   ##
==========================================
+ Coverage   81.64%   81.94%   +0.30%
==========================================
  Files          34       35       +1
  Lines        3307     3362      +55
==========================================
+ Hits         2700     2755      +55
  Misses        607      607
```

| [Flag](https://app.codecov.io/gh/fox-it/flow.record/pull/84/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=fox-it) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/fox-it/flow.record/pull/84/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=fox-it) | `81.94% <96.34%> (+0.30%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=fox-it#carryforward-flags-in-the-pull-request-comment) to find out more.


MaxGroot commented 5 months ago

@yunzheng To support transparent compression when writing to file-like objects, I've added `wrap_in_compression` to `base.py`. It wraps `fp` in a compressor when the given path suggests that is sensible. The logic is copied from `open_path`, but changed to always instantiate the compressor using a file-like object. This should solve the problem of writing a gzipped file to a GCS bucket, though I can't test it for real.
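To illustrate the idea, here is a minimal sketch of wrapping a file-like object in a compressor chosen from a path's extension. It is not the code from this PR: the function name `wrap_in_compression_sketch`, the set of extensions, and the example `gs://` path are assumptions made for illustration, and it uses only the standard library compressors.

```python
import bz2
import gzip
import io
import lzma
from typing import BinaryIO


def wrap_in_compression_sketch(fp: BinaryIO, path: str) -> BinaryIO:
    """Return fp wrapped in a compressing writer picked from path's extension.

    Hypothetical helper: the real wrap_in_compression in base.py may differ.
    """
    lowered = path.lower()
    if lowered.endswith(".gz"):
        # GzipFile writes compressed bytes into the existing file-like object.
        return gzip.GzipFile(fileobj=fp, mode="wb")
    if lowered.endswith(".bz2"):
        return bz2.BZ2File(fp, mode="wb")
    if lowered.endswith(".xz"):
        return lzma.LZMAFile(fp, mode="wb")
    # No known compression extension: hand back the object unchanged.
    return fp


# Usage: an in-memory buffer stands in for a streaming upload target.
buffer = io.BytesIO()
writer = wrap_in_compression_sketch(buffer, "gs://example-bucket/records.gz")
writer.write(b"hello records")
writer.close()  # flushes the gzip trailer; the underlying buffer stays open
roundtrip = gzip.decompress(buffer.getvalue())
```

Because the compressor is always given a file-like object rather than a filename, the same wrapper works for local files, in-memory buffers, and streaming uploads alike.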

An unfortunate consequence is that the RecordAdapter method has become a bit more complex than it already was. We've discussed refactoring it in the past, but I think it will remain complex because of how flexible it is expected to be. It has to account for a URL, a file-like object, and whether the adapter should be a writer or a reader, with some extra complexity for handling stdio. If you have ideas for refactoring it properly, I'm all ears, though that may be something for a different PR.
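For discussion, one possible direction is to isolate the input normalization (URL, file-like object, reader or writer, stdio) in a small pure function that runs before any adapter is chosen. The sketch below is only an assumption about how that could look: `AdapterRequest`, `resolve_request`, and the `"stream"` fallback name are hypothetical and do not reflect the current RecordAdapter code.

```python
from dataclasses import dataclass
from typing import BinaryIO, Optional


@dataclass
class AdapterRequest:
    scheme: str                  # adapter name, e.g. "gcs" (hypothetical naming)
    location: str                # remainder of the URL after "scheme://"
    fileobj: Optional[BinaryIO]  # caller-supplied file-like object, if any
    writing: bool                # True for a writer, False for a reader


def resolve_request(
    url: Optional[str],
    fileobj: Optional[BinaryIO] = None,
    out: bool = False,
) -> AdapterRequest:
    """Normalize the accepted input combinations into one description."""
    if fileobj is not None:
        # An explicit file-like object wins; the URL only serves as a hint.
        return AdapterRequest("stream", url or "", fileobj, out)
    if url in (None, "", "-"):
        # stdio case: the caller substitutes stdin or stdout based on `writing`.
        return AdapterRequest("stream", "", None, out)
    scheme, sep, rest = url.partition("://")
    if not sep:
        # A bare path without a scheme is treated as a plain stream.
        return AdapterRequest("stream", url, None, out)
    return AdapterRequest(scheme, rest, None, out)


request = resolve_request("gcs://example-bucket/records.gz", out=True)
```

Keeping this step free of side effects would make each input combination testable on its own, leaving the adapter lookup as a separate, simpler step.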