jquery / codeorigin.jquery.com

jQuery CDN
https://releases.jquery.com

Dockerfile: Various bug fixes, clean ups, and an integration test #79

Closed Krinkle closed 2 years ago

Krinkle commented 2 years ago

TLDR:

Differences in file headers

In summary:

Status quo:

krinkle@wp-03:~$ curl -H 'Host: codeorigin.jquery.com' -i 'http://localhost/jquery-3.0.0.js' | head -n20

Server: nginx
Date: Mon, 16 Aug 2021 23:12:52 GMT
Connection: keep-alive
Access-Control-Allow-Origin: *
Cache-Control: max-age=315360000
Cache-Control: public
Content-Length: 263268
Content-Type: application/javascript; charset=utf-8
ETag: "28feccc0-40464"
Expires: Thu, 31 Dec 2037 23:55:55 GMT
Last-Modified: Fri, 18 Oct 1991 12:00:00 GMT
Vary: Accept-Encoding
Accept-Ranges: bytes

Previously, without this patch:

dockerize$ curl -i 'http://localhost:4000/jquery-3.0.0.js' | head -n20

Server: nginx/1.21.1
Date: Mon, 16 Aug 2021 23:13:30 GMT
Connection: keep-alive
Content-Length: 263268
Content-Type: application/javascript
ETag: "5fdbec22-40464"
Last-Modified: Thu, 17 Dec 2020 23:39:14 GMT
Accept-Ranges: bytes

Match existing Nginx config

Most of the above is addressed by reusing the Nginx config from the private infrastructure.git repo that provisions the legacy codeorigin server (copied from wordpress-header.conf.erb and wp/jquery.pp).

However, the last two points (charset and unstable ETag) require additional changes.

Nginx: charset

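The default charset from Nginx differs between Debian and Alpine, possibly due to changes in some of the shared libraries. Fix this by explicitly setting charset utf-8. Note that this does (and should) only apply to files with a MIME type in the charset_types list, which by default includes text/html, text/xml, text/plain, and application/javascript, among others. Note that text/css is not in Nginx's default list, so CSS files only get a charset if that type is added explicitly.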

More importantly, it does not (and should not) cause PNG files and other binary assets to be served with a charset, which would be wrong.
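
One way to check the charset behaviour from the command line is sketched below. has_utf8_charset is a hypothetical helper (it is not part of the patch or its test suite); it reads response headers on stdin and succeeds only if Content-Type declares charset=utf-8. The two sample header lines are the Content-Type values shown in the responses in this description.

```shell
# has_utf8_charset is a hypothetical helper for illustration only.
# It reads HTTP response headers on stdin and succeeds only if the
# Content-Type header declares charset=utf-8 (case-insensitive).
has_utf8_charset() {
  grep -i '^content-type:' | grep -qi 'charset=utf-8'
}

# Content-Type as served without the patch.
if printf 'Content-Type: application/javascript\r\n' | has_utf8_charset; then
  before=yes
else
  before=no
fi

# Content-Type as served with the patch.
if printf 'Content-Type: application/javascript; charset=utf-8\r\n' | has_utf8_charset; then
  after=yes
else
  after=no
fi

echo "before=$before after=$after"
```

Against a running container, the same helper could be fed from curl, for example: curl -sI 'http://localhost:4000/jquery-3.0.0.js' | has_utf8_charset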

Nginx: ETag and Last-Modified

This was a tricky one. Git does not store file modification timestamps, which means the on-disk file mtimes are somewhat arbitrarily set to when the file was created by the local Git client. For most files, this means the time you cloned the repository, and for files added later, the time you ran git pull.

This was de facto "stable" on the legacy server because it is a persistent server with a persistent git directory (we always use the same clone and just update it to fetch new files after a commit happens). The timestamps themselves don't matter as long as they remain constant for any given file, and they were.

This is important because Nginx computes the ETag from the file's mtime and size. Unlike in Apache, this behaviour is not configurable in Nginx (e.g. to make it compute a content hash instead). Last-Modified naturally uses the mtime as well.
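
To illustrate why the mtime matters, here is a minimal sketch that rebuilds the ETag values seen in this description: the mtime (seconds since the Unix epoch) and the size in bytes, each in lowercase hexadecimal, joined by a hyphen and wrapped in quotes. nginx_etag is a hypothetical helper for illustration; the epoch values are derived from the Last-Modified headers shown here.

```shell
# nginx_etag is a hypothetical helper, not an Nginx command.
# $1 = file mtime in seconds since the Unix epoch
# $2 = file size in bytes
nginx_etag() {
  printf '"%x-%x"\n' "$1" "$2"
}

# 687787200 = Fri, 18 Oct 1991 12:00:00 GMT; 263268 = Content-Length.
etag_fixed=$(nginx_etag 687787200 263268)

# 1608248354 = Thu, 17 Dec 2020 23:39:14 GMT (mtime from a fresh checkout).
etag_checkout=$(nginx_etag 1608248354 263268)

echo "$etag_fixed"     # "28feccc0-40464"
echo "$etag_checkout"  # "5fdbec22-40464"
```

The size part is identical in both cases; only the mtime part changes, which is why a fresh clone alone is enough to produce a different ETag for byte-identical content.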

As a workaround, I've added a touch command that assigns all CDN files a fixed timestamp far in the past. This is safe for us because all files are expected to remain static. Even if we did change a file, the change wouldn't propagate to the CDN without a manual CDN API purge (given the high Cache-Control max-age and Expires values), and even then it would not reach any browser that had previously fetched the file unless the user cleared their cache (Cache-Control tells the browser to reuse the file blindly, which is critical for performance). As such, whether the timestamp stays constant or changes after a file is changed makes no difference to how the change propagates.
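
A rough sketch of that workaround, assuming a POSIX find and touch. The directory and file below are placeholders created only for this demonstration, and the exact command in the Dockerfile may differ. The timestamp matches the Last-Modified value shown in the response headers.

```shell
# Placeholder directory standing in for the CDN file tree (hypothetical).
cdn_dir=$(mktemp -d)
printf 'demo' > "$cdn_dir/example.js"

# Give every regular file the same fixed mtime.
# touch -t takes [[CC]YY]MMDDhhmm[.SS] in the local timezone,
# so TZ=UTC pins it to 1991-10-18 12:00:00 UTC.
TZ=UTC find "$cdn_dir" -type f -exec touch -t 199110181200.00 {} +
```

Because the timestamp is set explicitly, rebuilding the image from a fresh clone yields the same mtime, and therefore the same ETag and Last-Modified, for every file.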

New state

After, with this patch (verified by a simple test suite):

dockerize$ curl -i 'http://localhost:4000/jquery-3.0.0.js' | head -n20

Server: nginx
Date: Mon, 16 Aug 2021 23:49:43 GMT
Connection: keep-alive
Access-Control-Allow-Origin: *
Cache-Control: max-age=315360000
Cache-Control: public
Content-Length: 263268
Content-Type: application/javascript; charset=utf-8
ETag: "28feccc0-40464"
Expires: Thu, 31 Dec 2037 23:55:55 GMT
Last-Modified: Fri, 18 Oct 1991 12:00:00 GMT
Vary: Accept-Encoding
Accept-Ranges: bytes

Misc changes

Ref https://github.com/jquery/infrastructure/issues/554.