laminlabs / lamindb-setup

Setup & configure LaminDB.
Apache License 2.0
4 stars 1 forks source link

Enable to pass storage region outside of `create-s3` #768

Open falexwolf opened 1 month ago

falexwolf commented 1 month ago

We also need to be able to pass a storage region outside the storage param because otherwise during init_storage_hub we'll try to infer it while not yet having access:

https://github.com/laminlabs/lamindb-setup/blob/0501bb320d392a4e5ebf6c493ff10663c474697e/lamindb_setup/core/_hub_core.py#L158

I'll make a PR that adds this ability.

Koncopd commented 1 month ago

I am not sure what do you mean here. Do we want to force passing the region on storage init? I think we need to make sure that the region is resolved even without access rights instead of forcing manual passing. https://github.com/fsspec/s3fs/blob/a28863f084a91ee78d9cd65bd6767b2f44e81b33/s3fs/utils.py#L40 This is how it is done in s3fs, it should be able to retrieve region even when the access is denied because error response also carries the region if i understand corretly.

falexwolf commented 1 month ago

I mean that we get a "HeadBucket operation: Forbidden" error because it's calling the below before the storage row is in the hub and access-aws can be called:

https://github.com/laminlabs/lamindb-setup/blob/0501bb320d392a4e5ebf6c493ff10663c474697e/lamindb_setup/core/_settings_storage.py#L52

If we can get region without calling head_bucket: great!

Koncopd commented 1 month ago

We still call the bucket and but we check the error header in the response and it should contain the region (see the code in s3fs above), i can add this to get_storage_region tmr.

falexwolf commented 1 month ago

That'd be a very elegant solution! Anytime this or early next week is fine! We worked around this issue where needed it so far and it's nothing that a user will face.