qoomon / aws-s3-bucket-browser

Single page application to browse AWS S3 bucket content
https://qoomon.github.io/aws-s3-bucket-browser/index.html?bucket=https://s3.amazonaws.com/spacenet-dataset#
MIT License
246 stars, 85 forks

Support for Virtual Style Paths on Open Hosted Implementation over HTTPS #2

Closed: qkflies closed this issue 4 years ago

qkflies commented 4 years ago

First, thanks for developing this project. I spent hours searching for something that does what yours does, and it seems to be the only currently working solution. The slick implementation only increases my awe. 🥇

I do have an edge concern: Amazon will be deprecating path-style requests come this October, which seems to at least partially break the open hosted implementation for your tool.

Their preferred request format, virtual-hosted-style, breaks HTTPS for some buckets. A wildcard SSL certificate covers only one subdomain level, so it cannot match bucket names that contain dots (such as alpha.bravo.s3.us-east-1.amazonaws.com). The certificate warning can be bypassed on an internally hosted copy (by adding an exception), but the open hosted version throws a CORS error (which I believe to be symptomatic of the SSL certificate mismatch).

The open hosted implementation also can't work around this by falling back to plain HTTP, as that triggers a mixed content block that can't be overridden.
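To illustrate the wildcard limitation described above, here is a minimal Python sketch of single-label wildcard matching. It is a simplification for illustration, not a full hostname verification routine; the function name `wildcard_matches` and the certificate name `*.s3.us-east-1.amazonaws.com` are assumptions, not taken from the thread.

```python
def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Return True if a certificate name such as '*.example.com' covers hostname.

    Simplified rule: a wildcard is only accepted as the entire left-most
    label, and it stands for exactly one label.
    """
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    # A wildcard never spans a dot, so the label counts must be equal.
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        return bool(h_labels[0]) and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


# Assumed certificate name, for illustration only.
CERT_NAME = "*.s3.us-east-1.amazonaws.com"

print(wildcard_matches(CERT_NAME, "alpha.s3.us-east-1.amazonaws.com"))        # True
print(wildcard_matches(CERT_NAME, "alpha.bravo.s3.us-east-1.amazonaws.com"))  # False
```

The second hostname has one label more than the pattern, which is why a bucket name containing a dot fails the certificate check under this rule.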

From previous attempts to find a solution, I found that tunneling through CloudFront (the AWS CDN) restores the ability to request via HTTPS, with seemingly the same XML file returned. However, once the response is parsed, the folders disappear, leaving only the top-level files displayed.

I've linked examples utilizing your demo to show the difference. If you can shed any light on what's happening, that would be awesome.

Path-Style Request (working properly)

Virtual Hosted-Style (missing all directories)

Apologies in advance for this novel of an issue report, and thanks in advance for your consideration.

qoomon commented 4 years ago

Hi, glad to hear you're enjoying this project. I had a first look at the response of https://dogow7tkf0owu.cloudfront.net/?list-type=2&delimiter=/&prefix=&max-keys=50 and it looks like your "folders" are actual bucket objects. I explicitly filter out those old-fashioned folder placeholder objects by not displaying any object whose key has a trailing / . Otherwise those folder objects would be displayed as files, because that is what they are :-D.

https://github.com/qoomon/aws-s3-bucket-browser/blob/a9222a15d66443ec3a8c4a1a61bbeb3f838e4dbe/index.html#L344
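As a rough illustration of the filtering described in this comment, the Python sketch below splits a ListObjectsV2-style XML response into folders (taken from `CommonPrefixes`) and files (taken from `Contents`), skipping placeholder objects whose key ends in `/`. This is not the project's actual code (which lives in index.html); the function name `split_listing` and the sample XML are made up for the example.

```python
import xml.etree.ElementTree as ET

# XML namespace used by S3 list responses.
NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}

# Hypothetical response for prefix "demo1/" with delimiter "/".
SAMPLE = """<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Prefix>demo1/</Prefix>
  <Delimiter>/</Delimiter>
  <Contents><Key>demo1/</Key><Size>0</Size></Contents>
  <Contents><Key>demo1/a.txt</Key><Size>12</Size></Contents>
  <CommonPrefixes><Prefix>demo1/demo3/</Prefix></CommonPrefixes>
</ListBucketResult>"""


def split_listing(xml_text):
    """Return (folders, files) from a list response, hiding placeholder objects."""
    root = ET.fromstring(xml_text)
    folders = [
        cp.findtext("s3:Prefix", namespaces=NS)
        for cp in root.findall("s3:CommonPrefixes", NS)
    ]
    files = []
    for item in root.findall("s3:Contents", NS):
        key = item.findtext("s3:Key", namespaces=NS)
        if key.endswith("/"):
            continue  # folder placeholder object: not shown as a file
        files.append(key)
    return folders, files


folders, files = split_listing(SAMPLE)
print(folders)  # ['demo1/demo3/']
print(files)    # ['demo1/a.txt']
```

With this approach, folders only appear when the response carries `CommonPrefixes` entries, which in turn requires the request to include a delimiter.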

To see folders, just put some objects into your bucket with object keys like

Now it should work as expected. If you don't like that behaviour just remove the mentioned line.

qoomon commented 4 years ago

If you need those dummy folder objects to be displayed, I could add a feature for that.

qoomon commented 4 years ago

WDYT?

qkflies commented 4 years ago

Taking your suggestions in order, removing line 344 comes with an odd side effect: the folder items are all "listed" at the top level. The lines are blank (as you can see in the screenshot below), but I'm almost certain the underlying entries resemble a tree display (i.e. demo1, demo1/demo3, demo2). image

Additionally, adding dummy files doesn't quite fix it. All the files in the bucket are returned at the top level (see here for line 344 enabled, and here for line 344 disabled). The good news is that the links still work, so access to the files can still be gained. However, hierarchy is still lost.

If manually recognizing bucket objects with zero size would rectify the issue, that would probably be ideal. Since the push to HTTPS for everything has pretty much reached full saturation, I'd imagine the use case of using CloudFront to ensure serving through HTTPS will only increase. Ensuring that the same behavior can be replicated directly through S3 and tunneled through CloudFront would be reassuring.

I could be wrong, but it unfortunately appears that the problem is more fundamental than that. For what it's worth, I'll note that both https://dogow7tkf0owu.cloudfront.net and https://s3.us-east-2.amazonaws.com/resources.seminolepointe.church are the same bucket, and (as far as I can tell) return identical XML files when viewed in the browser. They just return different results when plugged into your tool.

qoomon commented 4 years ago

I see. I'll investigate. Thx for reporting.

qoomon commented 4 years ago

The problem seems to be that the two endpoints somehow expose different APIs.

qoomon commented 4 years ago

At least the handling of virtual folder objects should work now.

qoomon commented 4 years ago

Everything seems to work now.

qkflies commented 4 years ago

So, I feel stupid now. CloudFront by default does not forward query strings.

Changing that option fixed the issue entirely. Which, I think, obviates this entire ticket.
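To illustrate why a dropped query string flattens the listing, the Python sketch below builds the kind of listing URL the browser sends and then strips the query, as a proxy that does not forward query strings effectively would. The endpoint `https://example.cloudfront.net` and both helper names are hypothetical. The working assumption is that a bare request to the bucket root is answered as a plain listing with no delimiter, so the response has no `CommonPrefixes` and every key appears at the top level.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def listing_url(endpoint, prefix=""):
    """Build a ListObjectsV2-style URL that asks for folder grouping via delimiter."""
    query = urlencode(
        {"list-type": "2", "delimiter": "/", "prefix": prefix, "max-keys": "50"}
    )
    return f"{endpoint}/?{query}"


def without_query(url):
    """Return the URL as the origin would see it if the query string were dropped."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Hypothetical distribution domain, for illustration only.
sent = listing_url("https://example.cloudfront.net")
print(sent)                 # https://example.cloudfront.net/?list-type=2&delimiter=%2F&prefix=&max-keys=50
print(without_query(sent))  # https://example.cloudfront.net/
```

The second URL carries neither `delimiter` nor `prefix`, which would be consistent with the symptom reported earlier in the thread: the files are listed, but the hierarchy is lost.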

In case anyone else stumbles on this issue, here are the configuration settings required for CloudFront to work:

I'm not certain if caching any of the parameters is advisable or not, but the forwarding is the important part.

Apologies for dragging you into this, but additionally, thanks for the help in identifying the issue!

qoomon commented 4 years ago

Glad to hear that. In my opinion you can cache the parameters as well if your bucket content is quite static. I will add your hint to the readme.