Closed pnuu closed 1 year ago
This is what I have in the Trollmoves S3Mover docstring:
All the connection configurations and such are done using the `fsspec` configuration system:
https://filesystem-spec.readthedocs.io/en/latest/features.html#configuration
An example configuration could be placed in `~/.config/fsspec/s3.json`::
    {
        "s3": {
            "client_kwargs": {"endpoint_url": "https://s3.server.foo.com"},
            "secret": "VERYBIGSECRET",
            "key": "ACCESSKEY"
        }
    }
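As a rough illustration of how such a file is structured (fsspec itself does the actual discovery and merging of these files), the top-level keys are protocol names and the nested dicts become keyword arguments for the matching filesystem class. A stdlib-only sketch, using the placeholder values from the docstring above:

```python
import json

# The same example configuration as above, as it would be read from
# ~/.config/fsspec/s3.json (placeholder endpoint and credentials).
config_text = """
{
    "s3": {
        "client_kwargs": {"endpoint_url": "https://s3.server.foo.com"},
        "secret": "VERYBIGSECRET",
        "key": "ACCESSKEY"
    }
}
"""

conf = json.loads(config_text)
# fsspec passes the per-protocol dict as keyword arguments when creating
# the filesystem, roughly equivalent to s3fs.S3FileSystem(**conf["s3"]).
s3_kwargs = conf["s3"]
```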
Merging #117 (b0b7698) into main (62a1d71) will decrease coverage by 0.07%. The diff coverage is 100.00%.
@@ Coverage Diff @@
## main #117 +/- ##
==========================================
- Coverage 90.52% 90.46% -0.07%
==========================================
Files 22 25 +3
Lines 3577 3848 +271
==========================================
+ Hits 3238 3481 +243
- Misses 339 367 +28
Impacted Files | Coverage Δ
---|---
pytroll_collectors/segments.py | 92.22% <100.00%> (+0.10%) :arrow_up:
pytroll_collectors/tests/test_segments.py | 100.00% <100.00%> (+0.72%) :arrow_up:
pytroll_collectors/__init__.py | 100.00% <0.00%> (ø)
pytroll_collectors/geographic_gatherer.py | 88.14% <0.00%> (ø)
pytroll_collectors/s3stalker.py | 73.01% <0.00%> (ø)
I created a new issue on documenting S3 configuration. It depends on completion of #110 and #114.
It seems that when checking for the segments, the scheme part is not included when adding the segment to the message dataset. So they would be `bucket-name/file1.seg` instead of `s3://bucket-name/file1.seg`. I'll have a look at this tomorrow.
Hngh, I have somehow messed up parts of the PR and don't know where some of the code came from.
Ok, the results from `s3fs.S3FileSystem.glob()` don't include `'s3://'` in them. Now the collection works after a restart with usable URIs, and I consider this completed (until review, at least :grin:).
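A minimal sketch of the kind of fix this implies (the helper name here is hypothetical, not the actual code in the PR): prepend the scheme to the bare `bucket/key` paths returned by `glob()` so the message dataset carries usable URIs.

```python
def add_scheme(paths, scheme="s3"):
    """Prepend a URI scheme to bare bucket/key paths.

    Hypothetical helper for illustration; s3fs.S3FileSystem.glob() returns
    paths like "bucket-name/file1.seg" without the scheme, so something
    equivalent is needed when building the message dataset.
    """
    return [f"{scheme}://{path}" for path in paths]


# A bare glob result becomes a usable URI:
uris = add_scheme(["bucket-name/file1.seg"])
```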
As for the mysteriously appeared code fragment (https://github.com/pytroll/pytroll-collectors/pull/117/commits/53a8f74a038ff3297ae31814372c13134f7f2567#diff-76bd104754afbf31db6d4cf2bfa4f2df0dcb2fd57bbf7b9ca62b67be7cbeea8dL383), I have no idea where it came from. I just reverted to the original (the version in `main`) for simplicity.
Finally remembered to document the S3 configuration. @mraspaud ok to merge?
With https://github.com/pytroll/pytroll-collectors/pull/117/commits/9af66a82b9aef392d5a4c3d8e2ba0636a0515a3c all `glob()` calls are done using `fsspec`.
@mraspaud Do you have time to look at this one again?
@mraspaud one final look, please?
This PR adds the possibility to check for existing file segments on S3 storage when the segment gatherer is started. This is already possible with local file systems by defining `check_existing_files_after_start = True` in the configuration file. Not sure where to explain how the S3 configuration (endpoint, credentials) is done.
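A sketch of where the option goes in a segment gatherer configuration file (the section name and pattern key below are hypothetical placeholders; only `check_existing_files_after_start` is the option discussed here):

```ini
# Hypothetical segment gatherer section for illustration only.
[some-s3-collection]
pattern = s3://bucket-name/{start_time:%Y%m%d_%H%M}_{segment}.seg
# Check for already-existing segments when the gatherer starts.
check_existing_files_after_start = True
```

The S3 endpoint and credentials themselves are not set here but via the fsspec configuration system described above.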