awspring / spring-cloud-aws

The New Home for Spring Cloud AWS
http://awspring.io
Apache License 2.0
892 stars 303 forks source link

S3Resource getUri() #521

Open fjmacagno opened 2 years ago

fjmacagno commented 2 years ago

Type: Feature

Is your feature request related to a problem? Please describe. I have found that several people i work with have assumed that S3Resource.getUri() would return the s3 url for the resource, aka s3://bucket/key, and run into unexpected trouble when it comes back as an http url. This in particular causes trouble when interacting with other libraries like Parquet, which cannot use an InputStream to read and which ideally we should be able to pass the uri to directly.

Describe the solution you'd like Why is it this way? Couldn't it return the s3 path? Is that even changeable at this point, if it is desirable?

Describe alternatives you've considered Im currently telling people to string-parse the uri host for the bucket and use the uri path for the key and build the s3 url manually.

Additional context Looking at the code, it seems like the problem may be that the default implementation of getUri() in AbstractResource parses the result of getURL(), and URLs cant have filesystems that Java can't recognize. It seems like just overriding getUri() would fix it, but then you do have the URL and the URI being meaningfully different, which is its own disadvantage.

maciejwalkowiak commented 2 years ago

@fjmacagno we can add getS3Url() method to S3Resource. Would it do the job for you?

fjmacagno commented 2 years ago

Better than nothing certainly, but not exactly, since what i want to maintain is that the user of the resource doesnt have to know the protocol; having to conditionally cast to S3Resource would defeat the purpose a bit. The problem is just that the http s3 url is not usable by anything. An option might be to add S3 as a valid url protocol.

An option to maintain backward compat would be to add a new bean "S3ResourceUriGenerator" which defaults to the current implementation, and which i assume can be wired into the S3Resource when it is created. This seems like a clunky answer though, and since backwards compat isn't technically necessary for 3.0.0, actually changing the return seems better.

maciejwalkowiak commented 1 year ago

Apologies for late reply @fjmacagno. As clunky as it is, I think it is acceptable. In this specific case i prefer we stay backward compatible.

maciejwalkowiak commented 1 year ago

Perhaps related to https://github.com/awspring/spring-cloud-aws/issues/166