itchio / itch.io

:bug: Public itch.io issues tracker and documentation - use support instead for private information!
https://itch.io/support
235 stars 30 forks source link

Permission issue when uploading brotli compressed files. #1625

Closed LimaneGaya closed 6 days ago

LimaneGaya commented 3 weeks ago

I have a godot web game that i wanted to precompress with brotli (50mb -> 7mb). I found in the documentation that itch works with brotli but when i try the game i get a lot of 404 back.

My folder structure is:

index.apple-touch-icon.png
index.audio.worklet.js
index.html
index.icon.png
index.js
index.pck.br
index.png
index.wasm.br
index.worker.js

The game works when it's not ending with .br but send a 403 Forbidden when it does.

leafo commented 2 weeks ago

https://github.com/itchio/zipserver/blob/master/zipserver/archive.go#L326

When you provide a .br file, the suffix is trimmed for the uploaded file, so index.wasm.br becomes index.wasm with the content encoding set accordingly.

The 403 error is a catch all for file doesn't exist at that path: https://itch.io/docs/creators/html5#common-pitfalls See also: https://itch.io/docs/creators/html5#compression

This means you are trying to request a file by name that is not included in your archive. if you have a brotli encoded file with the path index.wasm.br make sure your game engine requests that full path.

Note that uploading a pre-compressed brotli file without the .br extension will result in a file with an incorrect content-encoding header. If you want to have a brotli compressed resource that will automatically be decompressed by the browser on request you must have that resource's path end in .br.

If you upload a brotli compressed file without the extension .br then the game engine must be able to decode the file in JavaScript to read the file (which is not ideal, but I believe Unity has this fallback).

LimaneGaya commented 2 weeks ago

After trying again it found out that it's not the issue, it does recognize the suffix and remove it, however there seems to be a permission issue. I made a mistake, the error code i get is 403 Forbidden.

I uploaded the same game one with brotli compression and the other without.

https://html-classic.itch.zone/html/10867486/mygame/mygame.wasm This is the wasm file with brotli and result in:

<Error>
<Code>AccessDenied</Code>
<Message>Access denied.</Message>
<Details>Anonymous caller does not have storage.objects.get access to the Google Cloud Storage object. Permission 'storage.objects.get' denied on resource (or it may not exist).</Details>
</Error>

https://html-classic.itch.zone/html/10867446/mygame/mygame.wasm The one without brotli works normally

leafo commented 2 weeks ago

Sorry, I typed the wrong information. The br suffix is not removed from the URL. From your example, here's the working URL: https://html-classic.itch.zone/html/10867486/mygame/mygame.wasm.br

Note that a 403 error is a catch all for file doesn't exist at that path: https://itch.io/docs/creators/html5#common-pitfalls See also: https://itch.io/docs/creators/html5#compression

LimaneGaya commented 2 weeks ago

Thanks this does seem to work, but shouldn't the .br be removed?

Correct me if i'm mistaken but normally when brotli is setup in server to compress on the fly that extension isn't added. It only compress the file and set contentencoding to "br".

Precompressing it would give better compression and avoid processing from itch servers but should work as if it's on the fly ( Requesting game.wasm would send game.wasm that was uploaded as game.wasm.br but only add contentencoding="br"

Is there a usecase for how it's setup right now?

jlowry commented 1 week ago

@leafo The documentation you linked to states the br extension is removed. Quote from https://itch.io/docs/creators/html5#compression emphasis mine:

If the filename ends with the extension .br then we'll assume that the content is Brotli compressed and the content-encoding will bet set to br. The The content-type header of the file will then be detected and set by the extension, after removing the .br.

I think not removing the extension is a bug. The .br should be removed by itch.io's webserver when it serves the files, thus allowing us to provide versions of the same file with different compression methods and allowing the itch.io webserver to choose the best from the available based on the Accept-Encoding header.

leafo commented 1 week ago

@jlowry @LimaneGaya

The documentation you linked to states the br extension is removed

It's confusingly worded but not incorrect: the mime-type detection works on the filename after the .br extension is stripped, but the actual paths of the files uploaded remain exactly as provided.

The .br should be removed by itch.io's webserver when it serves the files, thus allowing us to provide versions of the same file with different compression methods

We serve files directly from a CDN pointing to a storage bucket, there is no server side logic for content-encoding negotiation going on. Additionally, even for people who set up their own host to serve files, dealing with pre-compressed files is not a trivial configuration (mostly because the brotli stream does not have anything in it that suggests it is brotli encoded, and the filesystem doesn't have any metadata about the content encoding, hence the .br suffix is commonly used to identify a pre-compressed file, similar to how we use a .gz extension for gzip compressed files. Since these are big files, on-the-fly compression is a bad idea, even though that typically is the standard for text based web resources). The game engines I investigated at time of implementation that generate brotli files request the full path with .br suffix.

I also want to note that if you check https://itch.io/docs/creators/html5#compression/compressed-file-types you can see that we transparently apply gzip compression through the CDN to bare .wasm files, so although you may not get the best compression out of it, it's going to be a lot better than nothing.

jlowry commented 1 week ago

There is an action here.

You should remove the section quoted below from the documentation as it is implementation detail that is not visible to users.

after removing the .br.

leafo commented 1 week ago

There is an action here.

You should remove the section quoted below from the documentation as it is implementation detail that is not visible to users.

after removing the .br.

It is relevant because that is how the content-type header is set: mime-type detection via the extension of the file

jlowry commented 1 week ago

There is an action here. You should remove the section quoted below from the documentation as it is implementation detail that is not visible to users.

after removing the .br.

It is relevant because that is how the content-type header is set: mime-type detection via the extension of the file

Its seems there is a misunderstanding somewhere. I have some clariify questions:

  1. What mechanism is used to detect if the file is brotli?
  2. Is the file saved in the bucket without the extension?
  3. Is there anyway to download the file without the .br extension?
  4. If there is no way for us to access the file without the extension why should the documentation even tell us it is removed?
  5. Please could you show us the code that handles the detection so that we can help improve the documentation?
leafo commented 1 week ago

There is an action here. You should remove the section quoted below from the documentation as it is implementation detail that is not visible to users.

after removing the .br.

It is relevant because that is how the content-type header is set: mime-type detection via the extension of the file

Its seems there is a misunderstanding somewhere. I have some clariify questions:

  1. What mechanism is used to detect if the file is brotli?
  2. Is the file saved in the bucket without the extension?
  3. Is there anyway to download the file without the .br extension?
  4. If there is no way for us to access the file without the extension why should the documentation even tell us it is removed?
  5. Please could you show us the code that handles the detection so that we can help improve the documentation?

I believe all these questions have been answered already in this thread but I'll recap:

  1. If a file name ends with .br then the content encoding header is set to brotli
  2. No files are renamed during the unziping process, all files are put into storage exactly as they are named in the archive you upload
  3. no
  4. Because that is how the content-type header is being determined, mime type detection via the extension after the .br extension is stripped
  5. https://github.com/itchio/zipserver/blob/master/zipserver/archive.go#L326