JoeGandy / ShareX-Custom-Upload

A little PHP script created for uploading custom sharex files to your own webserver
MIT License
166 stars 50 forks source link

Upload caching #100

Closed davwheat closed 4 years ago

davwheat commented 4 years ago

I think it'd be very beneficial to enable caching with an ETag.

https://github.com/JoeGandy/ShareX-Custom-Upload/blob/6571ae901bb1c006a0bc082ec7792d3835302b66/src/rewrite.php#L60-L62

Adding these lines below between L60 and L62 perform wonderfully. Can't do a PR right now, sorry!

This way, if a file changed, the updated file would be served as the ETag (SHA256 hash of the file)

header('Cache-Control: public, max-age=360, stale-while-revalidate=120');

$etag = "test";
$etag_algo_used = "";

$time_start = microtime(true); 

if (PHP_INT_SIZE === 8) {
    # 64-bit, so we use SHA512 for faster performance
    $etag = hash_file("sha512", $file_path);
    $etag_algo_used = "SHA-512";
} else {
    # 32-bit (or other?), so we use SHA256 for faster performance
    $etag = hash_file("sha256", $file_path);
    $etag_algo_used = "SHA-256";
}

$time_end = microtime(true);

header('ETag: "'.$etag.'"');

# get time difference in ms
$time_diff = round(($time_end - $time_start) * 1000, 4);
header('X-ETag-Generation-Milliseconds: '.$time_diff);
header('X-ETag-Algorithm: '.$etag_algo_used);
theaquarium commented 4 years ago

I really like this idea! Thanks for suggesting it and for writing the code. I think it's worth adding as an optional config option, but it's definitely worth having! I'll open a PR to add this.

theaquarium commented 4 years ago

Hey, quick question. What's the point of providing the ETag generation algorithm and generation time. Those aren't standard properties and ETags should really be used as opaque identifiers—just a random meaningless string.

davwheat commented 4 years ago

What's the point of providing the ETag generation algorithm and generation time.

Debugging. If someone has a very low power server, like a Pi 2, they might wonder why the server takes a while to respond. This could help diagnosis of long TTFBs.

It'd be a good idea for this to be togglable in settings. I like it because I'm curious and like providing extra info when I can, just because... why not?

Those aren't standard properties

Any header starting X- is considered a custom header. X-Forwarded-For isn't standard, but large companies like Cloudflare use them.

ETags should really be used as opaque identifiers—just a random meaningless string.

Not at all... Check MDN. https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag

The method by which ETag values are generated is not specified. Often, a hash of the content, a hash of the last modification timestamp, or just a revision number is used.

Bearing in mind that the user is using the site to get the hosted content, providing a hash of the content isn't bad at all. They could get a hash by downloading it themselves anyway.

theaquarium commented 4 years ago

From the MDN page on HTTP caching:

The ETag response header is an opaque-to-the-useragent value that can be used as a strong validator. That means that a HTTP user-agent, such as the browser, does not know what this string represents and can't predict what its value would be.

But I'm not here to argue, you make a fair point. By non-standard I meant that I can't find anything that uses it (like a caching server), but yeah I guess it's nice to have more information. I'll send generation algo by default, but I think the generation time could be in something like a debug mode, as you said. Debug mode could potentially be used in the future for adding other verbose information, but sending hash generation time by default seems like a bit of a waste of bandwidth.

davwheat commented 4 years ago

I mean the browser doesn't know what the ETag represents, unless it starts doing hash cracking lol. That's the way which I interpreted it: the browser doesn't know what it means, not that we shouldn't send something that isn't arbitrary.

I agree about the statements about debug mode!