james-willis opened this issue 8 months ago
I am assuming you want to use the loaders in the browser against a private S3 bucket? Normally, signing of URLs happens on the backend: the browser must request signed URLs from the backend via an API. So we need an async callback where you can do that lookup.
> I am assuming you want to use the loaders in the browser against a private S3 bucket?
Yes. I've edited my original post to say "signed URLs" rather than "presigned URLs" for clarity.
Signing URLs is just a mechanism through which this feature could be implemented.
Signed URLs can be generated anywhere that AWS credentials with read access are available. I don't believe there is any need to call back to do a lookup; signed URL calculation is a pure function of the URL and the credentials.
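To illustrate the "pure function" point, here is a deliberately simplified sketch. This is not real AWS SigV4 signing (the actual algorithm is considerably more involved); it only shows that the same URL and credentials always produce the same signed URL, with no server round trip required:

```typescript
import { createHmac } from "node:crypto";

// Hypothetical simplified signer -- NOT real AWS SigV4, just a sketch of
// the idea that signing is a deterministic function of (url, credentials).
interface Credentials {
  accessKeyId: string;
  secretAccessKey: string;
}

function signUrl(url: string, creds: Credentials, expiresSeconds = 3600): string {
  // HMAC over the URL and expiry, keyed by the secret: same inputs, same output.
  const signature = createHmac("sha256", creds.secretAccessKey)
    .update(`${url}:${expiresSeconds}`)
    .digest("hex");
  return `${url}?X-Key=${creds.accessKeyId}&X-Expires=${expiresSeconds}&X-Signature=${signature}`;
}
```

Because the function is pure, any environment holding the credentials (backend or, if the app accepts the risk, frontend) can produce valid signed URLs without a lookup.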
> signed URL calculation is a pure function of the URL and the credentials
Yes, but normally you don't want to send credentials to the front-end. If you are willing to send the credentials to the front-end, then someone can intercept them and start signing URLs to anything in your bucket (perhaps another customer's data, if you store data from multiple customers in the same bucket).
My question is basically: are we designing for the harder case, where the app developer is not willing to send credentials to the client and is standing up a custom URL signing endpoint on their backend? Or is this for the smaller subset of apps that are willing to send signing credentials to their front-end?
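If the harder case is in scope, the async callback could be as small as a function that maps a raw URL to a fetchable (signed) one. A minimal sketch, with hypothetical names; in a real app the callback would call a backend endpoint (e.g. something like `POST /api/sign-url`) rather than signing locally:

```typescript
// Hypothetical callback type: the app supplies a function that turns a raw
// URL into a signed URL, typically by asking its own backend.
type SignUrlCallback = (url: string) => Promise<string>;

// Fetch wrapper that signs before fetching. `fetchImpl` is injectable so
// the behavior can be exercised without a network.
async function fetchSigned(
  url: string,
  sign: SignUrlCallback,
  fetchImpl: (url: string) => Promise<Response> = (u) => fetch(u)
): Promise<Response> {
  const signedUrl = await sign(url);
  return fetchImpl(signedUrl);
}
```

An async callback like this covers both designs: the backend-endpoint case (the callback does an HTTP round trip) and the local-credentials case (the callback resolves immediately).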
My perspective comes from someone who is primarily using loaders.gl to back pydeck or jupyter-keplergl. In those cases the front end usually has access to the credentials that are already available on the backend.
For these use cases I would prefer that those credentials be leveraged. I think calling the backend for each request adds unneeded complexity and latency.
Generally, I am imagining cases where the end user has some credentials to provide to the application in order to access the private data.
> My perspective comes from someone who is primarily using loaders.gl to back pydeck or jupyter-keplergl
That is helpful: if you are working from Python, it is harder to override JS callbacks. Your design seems to indicate that you have the option to install extra JS packages.
My biggest objection to the current proposal is that it builds "knowledge" about S3 into loaders.gl/core (i.e. there is a function named `fetchS3`). It needs to be done through a more abstract, pluggable model...
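One way such a pluggable model could look, as a hedged sketch (hypothetical API, not the actual loaders.gl interface): a registry keyed by URL scheme, so S3 support becomes one registered handler among many rather than knowledge baked into core:

```typescript
// Hypothetical pluggable protocol model: core knows nothing about S3;
// apps register a fetch-like handler per URL scheme.
type FetchLike = (url: string) => Promise<Response>;

const protocolHandlers = new Map<string, FetchLike>();

function registerProtocolHandler(scheme: string, handler: FetchLike): void {
  protocolHandlers.set(scheme, handler);
}

// Resolve the handler for a URL by its scheme, falling back to plain fetch
// for ordinary http(s) URLs.
function resolveFetch(url: string): FetchLike {
  const scheme = new URL(url).protocol.replace(":", "");
  return protocolHandlers.get(scheme) ?? ((u) => fetch(u));
}
```

Under this model, an S3 plugin (with whatever credential handling the app prefers) would call `registerProtocolHandler("s3", ...)`, and core would stay S3-agnostic.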
Loading data from S3 is a common use case today. With loaders.gl it is simple enough against public data, since you can provide an HTTPS URL to an S3 object. For single-object datasets, a signed URL can be generated and passed to the loader.
However, for tile datasets, where a URL template is provided, the signed URL varies with each tile. I'd like to request a feature in loaders.gl that adds broad support for S3 URLs, covering both public and private objects.
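To make the tile problem concrete: a template expands to a different URL per tile, so a single pre-signed URL cannot cover the dataset; every expanded URL would need its own signature. A small sketch (hypothetical helper; `{z}/{x}/{y}` placeholders assumed):

```typescript
// Expand a tile URL template for one tile. Each distinct (z, x, y) yields a
// distinct URL -- which is why per-tile signing is needed for private data.
function expandTileUrl(template: string, z: number, x: number, y: number): string {
  return template
    .replace("{z}", String(z))
    .replace("{x}", String(x))
    .replace("{y}", String(y));
}
```

This is why a per-request hook (rather than a one-time signed URL) is the natural shape for the feature.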
Users would pass the S3 URL, potentially along with an S3ClientConfig for private objects.
I have mocked up a potential implementation approach, but I'm unsure how passing credentials should work and whether interface changes should be made.
Related request in deck.gl: https://github.com/visgl/deck.gl/issues/8590