Open TobiX opened 1 year ago
I'm thinking it could be a new boolean option called precompressed_allow_no_base
for a lack of a better name.
How about prefer_precompressed
?
I'm not really sure what all is needed for this and might need help. But I wouldn't mind taking a stab at this.
@shade34321 take a look at the fileserver code and at the docs. Look for precompressed
in the codebase for the key places to look at.
I've taken a look and I think this is going to take me a bit longer to understand before making a fix. So if someone else wants to work on it they can but I'll keep chugging along until I see a fix. Just wanted to update everybody.
Would like to take this up !
Sounds good! Let us know if you have any questions.
I have a question. What approach does caddy currently take, for clients that do not support compression ?
would be better if i can connect on call with a person who can explain about this. I'm not sure if this is common in open-source communities. But i feel it is essential.
We follow the Accept-Encoding
header. If a client doesn't declare it can handle compressed content, then Caddy will not compress it.
For file_server
's precompressed files, that's why it's useful to have an uncompressed copy on file (most of the time).
This feature request would require decompressing the content before sending it if the client doesn't support compression, if the "no base" option is used.
I have a question. What approach does caddy currently take, for clients that do not support compression ?
I suggest you just read the code and maybe run through it with a debugger, starting here: https://github.com/caddyserver/caddy/blob/b166b9008344786e7cfe5715b7e8a4cd3ec1bd76/modules/caddyhttp/fileserver/staticfiles.go#L233
More specifically look at this part: https://github.com/caddyserver/caddy/blob/b166b9008344786e7cfe5715b7e8a4cd3ec1bd76/modules/caddyhttp/fileserver/staticfiles.go#L357-L393
The call to encode.AcceptedEncodings
normally returns an ordered list of accepted encodings in the request (Accept-Encoding
) and then checks if they are present in the file system. Since you ask for the case of no compression, this loop will be skipped entirely. That means the variable file
will still be nil
, which makes you enter this case and open the actual file to serve:
https://github.com/caddyserver/caddy/blob/b166b9008344786e7cfe5715b7e8a4cd3ec1bd76/modules/caddyhttp/fileserver/staticfiles.go#L395-L411
Sure. Thank you all for adding more information ! Will look at it.
But i don't understand why we need a new boolean option called prefer_precompressed
for this. If caddy finds an sidecar/precompressed file that does not have it's corresponding uncompressed file, we can decompress it on the fly (for clients that do not support compression) without the need for a boolean option right ?
@titanventura did you read the discussion in the community forum thread? The reasoning is covered there. And reading the code, it should be clear why we do it that way currently.
I feel this feature can be implemented without prefer_precompressed
option. Because, we can implement it in a way so that, if the uncompressed file is NOT present and the client does NOT accept the encoding, we can decide with decompressing the file on the fly since we have no other option to serve the client right ? Please correct me if i'm wrong, im a newbie to the open source realm. I would love to get on call to discuss things in more detail (if that works with you).
As I wrote before, currently the code is set up such that if the original file does not exist (checked first), a 404 is immediately returned and no additional system calls are made. This is optimal to "fail fast" and assumes that no compressed version of the file will exist.
If we change the code (without a new option) then additional system calls would be made on all requests to missing files which can be seen as a performance regression, especially if someone is making tons of requests to brute force trying to find some files (but admittedly in that case you should probably have some rate limiting in place to stop that from happening).
My argument is that most users of precompressed will probably have both versions on disk, as is commonly done when using JS build tooling to precompress assets. So I think that performance regression to allow for not having the uncompressed copy should be opt-in.
Ahh I get it now. This has made things much more clear. Thank you so much @francislavoie . Will try and work on it now that i've understood the issue !
Hi @francislavoie @mholt I would love to spend some time solving this. Sharing my understanding below of the above discussion before I start to work on it,
Issue: If there's only a compressed copy of file on server and and the client doesn't accept the compressed file, it returns 404.
Solution:
We can allow decompressing of the compressed files on the fly if the precompressed file doesn't exist and the client doesn't accept the compressed files.
Issue: It can lead to performance regression issues.
We can make it as an opt-in feature so that the server administrator can specify that he wants to decompress and serve files maybe for specific routes (if the precompressed file doesn't exist) and keep returning 404 if not opted for assuming that the prescompressed file must exist.
Please correct me if I have some flaws in understanding.
Sounds correct @singhalkarun.
Sounds correct @singhalkarun.
Thanks for confirming @francislavoie . I am starting up on this. I will keep you posted.
I'd love to get this one done. I already started working on it.
As discussed in https://caddy.community/t/precompressed-files-without-base-files/17330, I was thinking about an extension to the precompressed feature: It would be nice to only have compressed files on disk and not have to keep the uncompressed files around...
Pro: Most clients support compression nowadays, so nothing changes for most clients Contra: If some client not supporting compression comes around, Caddy need to decompress the file before delivering it to the client
@francislavoie suggested in the community thread that supporting this feature might impact performance due to the additional stat calls, so we probably need to keep that aspect in mind, too...
PS: While thinking about it a bit more, I could probably implement it as a custom file system which "synthesizes" the uncompressed files :laughing: