Closed wang2bo2 closed 2 weeks ago
Whilst this appears to do the right thing, I wonder if you see any way that this can be isolated in s3fs rather than here?
ping @wang2bo2
Yes, we could remove the use of infer_storage_options
and duplicate the extraction of url_query
here
https://github.com/fsspec/s3fs/blob/f3f63cbfbfe71a4355abd63cafd8c678c4a5a0af/s3fs/core.py#L394
In the current infer_storage_options
implementation, anything after the last :
would be excluded from getting into options["host"]
https://github.com/fsspec/filesystem_spec/blob/9a161714f0bbfe44ee769f259420f2f7db975471/fsspec/utils.py#L97 even if the non-int port error is caught and handled. So I thought it's good to fix that too.
I think it would make sense to look for "accesspoint" in s3fs, but still use to infer_storage_options in the common case. Perhaps the accesspoint branch could simply modify the URL into something that makes sense. An ARN description isn't very much like a URL anyway...
Closing in favour of https://github.com/fsspec/s3fs/pull/912
Parsing S3 access point URI, like s3://arn:aws:s3:us-west-2:1234:accesspoint/abc/123.jpg, causes problems in
infer_storage_options
for the:accesspoint
was parsed as port number. This PR addresses that.