The purpose of this document is to form consensus around the technical decisions related to distributing and deploying automatic updates for PHP-based CMS software such as Drupal or Backdrop. There's a large bin of quality ideas on this topic at https://www.drupal.org/node/2367319, but the lack of consensus is probably hampering progress. My hope is that a single document updated via GitHub pull requests will provide a clear picture of the plan and a controlled structure to amend it as better ideas are suggested.
If you just have a quick observation or thought, post an issue here or a comment back at the d.o issue; if you have a revision to suggest, please feel free to submit a pull request. If you would like to help take a leadership role in shaping this document, ask me about commit access. Although there is currently only one committer, this is not an intentional dictatorship. Revisions will be evaluated for merging based on whether they further the objectives outlined in the design concerns. The design concerns are of course themselves up for amending as well; I've tried to gather all the distinct points that were raised in the d.o issue.
The document is written in Markdown. If you're looking for an easy way to edit Markdown for a PR, try dillinger.io.
Automatic updates are envisioned as a tool to mitigate the impact of security vulnerabilities in the CMS software, particularly those where sites would otherwise be compromised via automated exploits.
The objective of this design is to address the below concerns:
The technical design can be divided into Deployment and Distribution components. Distribution refers to the process of getting the update package to all the servers that host CMS websites. Deployment refers to the details of installing the update to a given site once it is received.
In short, the Deployment will be accomplished via a PHAR archive signed with OpenSSL that is fully self-contained and secure, independent of security features built into the Distribution component. The Distribution will additionally utilize its own security features to ensure that messages received are authentic and to combat abuse arising from false messages received in volume.
In order to receive automatic updates, a site must opt in to receive them and register with a web service:
For the endpoint to ensure that messages being received are from the trusted central infrastructure, the Hawk HTTP authentication scheme will be used. Under Hawk, all messages exchanged between the central infrastructure and the site are sent along with the message's SHA-256 hash and an HMAC result computed over the hash using the shared secret key established during registration. Hawk also incorporates features to combat replay attacks.
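
The hash-plus-HMAC idea described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the Hawk protocol itself: real Hawk computes its MAC over a normalized request string that also covers a timestamp, a nonce, and request details, which is where its replay protection comes from. The function names `sign_message` and `verify_message` are hypothetical.

```python
import hashlib
import hmac


def sign_message(body: bytes, shared_key: bytes) -> tuple:
    """Return (payload_hash, mac) for a message body.

    Simplified assumption: the MAC covers only the payload hash.
    Real Hawk also covers a timestamp, nonce, method, URI, host, and port.
    """
    payload_hash = hashlib.sha256(body).hexdigest()
    mac = hmac.new(shared_key, payload_hash.encode("ascii"), hashlib.sha256).hexdigest()
    return payload_hash, mac


def verify_message(body: bytes, payload_hash: str, mac: str, shared_key: bytes) -> bool:
    """Recompute both values from the body and compare in constant time."""
    expected_hash, expected_mac = sign_message(body, shared_key)
    return (hmac.compare_digest(expected_hash, payload_hash)
            and hmac.compare_digest(expected_mac, mac))
```

A receiver that holds the shared key from registration would call `verify_message` on every incoming message and discard anything that fails, before doing any further work.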
When an automatic update event is launched, the following exchanges occur between central infrastructure and each registered site:
Updates will be packaged as self-contained executable phar files signed with OpenSSL. Although they will be pushed by the distribution mechanism and could subsequently be executed by an automatic phar runner (see below), they are not tightly coupled to any specific distribution method and could be used as an alternative and safe way to manually apply an update as well, should site owners choose not to allow automatic updates.
Using Phar archives as the basis for packaging updates has these advantages:
`composer install`, anyone?)
Update packages will run in three distinct phases. Normally the phases will all be executed in order, but the package will also support manual execution, and advanced users may choose to run the phases separately. In the first phase, the package verifies that the update is applicable to the site it is being applied to by comparing version numbers. In the second phase, updates to the code tree on the filesystem occur (file adds, modifications, and/or deletes). In the third phase, a post-update script is run to perform tasks such as invoking database schema updates.
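
To make the three-phase flow concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `UpdatePackage` fields, the phase names, and in particular the applicability rule (an exact match against an expected installed version), which is only one possible reading of "comparing version numbers".

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class UpdatePackage:
    from_version: str                 # version the update expects to find (assumption)
    to_version: str                   # version the site ends up at
    apply_files: Callable[[], None]   # phase 2: file adds, modifications, deletes
    post_update: Callable[[], None]   # phase 3: e.g. database schema updates


def run_update(pkg: UpdatePackage, installed_version: str,
               phases: Sequence[str] = ("verify", "files", "post")) -> List[str]:
    """Run the requested phases in their fixed order and return those completed."""
    done = []
    if "verify" in phases:
        # Phase 1: refuse to continue if the package does not apply to this site.
        if installed_version != pkg.from_version:
            raise ValueError(
                f"update expects {pkg.from_version}, site has {installed_version}")
        done.append("verify")
    if "files" in phases:
        pkg.apply_files()
        done.append("files")
    if "post" in phases:
        pkg.post_update()
        done.append("post")
    return done
```

Passing a subset such as `phases=("verify",)` models the advanced-user case of running a single phase on its own.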
Although the phar packages will, strictly speaking, be directly executable at the command line, this should not be the recommended or documented primary way to run them. The main reason is that it is impractical for users to verify that the particular phar they are about to execute contains an OpenSSL signature at all. When it does not, the phar will be executed by the interpreter without question, so anyone in a position to tamper with the package could easily defeat the digital signature protection by simply removing the signature from the archive. (The extension ".phars" should totally be a thing, but that's another topic altogether.) Secondarily, in order to execute a properly signed phar from the command line, one first needs to copy the public key to a file in the same directory as the archive, named identically to the archive with .pubkey appended, and that's just clunky.
Instead, the recommended way to open and execute phar update packages will be to call a small bit of previously installed trusted code, passing it the phar file. This code could be invoked through drush or other command-line tools, as well as directly by the CMS. It will:
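
As an illustration of the "is there an OpenSSL signature at all?" problem, the following Python sketch inspects the trailer of a phar-format archive. It rests on assumptions that should be checked against the PHP manual's description of the Phar signature format: that the file ends with a 4-byte little-endian flags field followed by the magic `GBMB`, and that the flag value `0x0010` denotes an OpenSSL signature. It does not apply to tar- or zip-based phars, and it only detects the presence of a signature; it does not verify it. The helper names are hypothetical.

```python
import struct

PHAR_SIG_MAGIC = b"GBMB"     # assumed trailer magic marking a signed phar
PHAR_SIG_OPENSSL = 0x0010    # assumed flag value for an OpenSSL signature


def phar_signature_flags(data: bytes):
    """Return the signature flags from the trailer, or None if unsigned."""
    if len(data) < 8 or data[-4:] != PHAR_SIG_MAGIC:
        return None
    (flags,) = struct.unpack("<I", data[-8:-4])
    return flags


def has_openssl_signature(data: bytes) -> bool:
    """True only if the trailer declares an OpenSSL signature."""
    return phar_signature_flags(data) == PHAR_SIG_OPENSSL


def pubkey_path(archive_path: str) -> str:
    """Public key location expected next to the archive, per the text above."""
    return archive_path + ".pubkey"
```

A check like this shows why stripping the signature is an effective attack on direct execution: an archive with no trailer, or with a plain hash trailer, is still a perfectly runnable phar.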
This model makes it as simple as possible to kick off the update code in a variety of ways with necessary OS permissions.
A separate module, the Trusted Installer, will manage execution of phar archives such that they are installed onto a site. Splitting this process into a separate module opens the door to future support for one-click installation of contrib modules by site owners, without requiring their sites to be writable by the webserver and without having to deal with sftp credentials.
Fundamentally, the Trusted Installer will be a public key repository, an API to run a phar on a particular site, and two implementations of the phar runner. One will target setups where the webserver is allowed to update the CMS code directly (WordPress' automatic updates show us this is prevalent among shared hosts), and another will target Unix-like OSes where the webserver does not have sufficient permission to modify the CMS source tree.
The in-webserver runner should be trivial.
The runner for the case where the webserver does not have permission to modify the CMS source will have these components:
Under this model, the server administrator would need to do a one-time setup of the daemon, configuring it to run under a user allowed to edit the CMS source tree(s). One such daemon should be sufficient to update multiple CMS installations if desired by running as a privileged process that forks and makes setuid/setgid calls.
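
The fork-then-drop-privileges pattern mentioned above might look roughly like this in Python on a Unix-like OS. It is a sketch under assumptions: `run_as` is a hypothetical helper, the daemon's request handling is omitted, and a real daemon would need more careful error reporting. Note the ordering: the group is changed before the user, because once the user ID is dropped the process may no longer be allowed to change its group.

```python
import os
from typing import Callable


def run_as(uid: int, gid: int, task: Callable[[], None]) -> int:
    """Fork, switch the child to uid/gid, run task, and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child process: drop privileges, then do the work.
        status = 1
        try:
            if os.getuid() == 0:
                os.setgroups([])   # clear supplementary groups (root only)
            os.setgid(gid)         # group first, while still permitted
            os.setuid(uid)
            task()
            status = 0
        finally:
            os._exit(status)       # never return into the parent's code path
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) if os.WIFEXITED(wait_status) else -1
```

The parent stays privileged and can serve the next site, calling `run_as` once per CMS installation with the owner of that installation's source tree.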
The key observation from a security standpoint is that the process with permission to adjust the site's source tree first performs its own verification that the changes to be made are from a trusted source.
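
That verify-before-execute rule can be expressed as a small gate. The Python sketch below is hypothetical throughout: `run_trusted`, `UntrustedPackageError`, and the shape of the key repository are invented for illustration, and the signature check is passed in as a callable because the actual OpenSSL verification would be done by whatever the runner is built on.

```python
from typing import Callable, Mapping


class UntrustedPackageError(Exception):
    """Raised when no trusted key vouches for a package."""


def run_trusted(package: bytes,
                signature: bytes,
                trusted_keys: Mapping[str, bytes],
                verify: Callable[[bytes, bytes, bytes], bool],
                execute: Callable[[bytes], None]) -> str:
    """Execute the package only if some trusted key verifies its signature.

    Returns the identifier of the key that vouched for the package.
    """
    for key_id, public_key in trusted_keys.items():
        if verify(package, signature, public_key):
            execute(package)
            return key_id
    # Nothing is executed unless verification has succeeded first.
    raise UntrustedPackageError("package signature not recognized")
```

The important property is that `execute` is unreachable for a package that fails verification, regardless of how the package arrived or what the distribution layer already checked.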