Liz315 opened this issue 7 years ago
From discussion with @meejah and @crwood, here's a rough outline of the steps:
Failures can happen, though.
Spiffy extras:
More failure modes. In addition to the signup or wormhole servers crashing, they might also lose their connection to each other, which amounts to the same thing. The connection must remain open until the exchange is complete, or the exchange will most likely fail to convey the configuration to GridSync.
Revised signup flow:
At this point a few things can happen.
If time passes and the wormhole code expires, the only possible interaction which leads to any state change is the user reloading the page displaying the wormhole code.
If the user reloads the page displaying the wormhole code,
At this point, the user is back to the same state they were in after they initially completed signup.
If the user connects to the wormhole and picks up the configuration data:
The subscription is now considered active and further signup-related interactions are not expected to take place (and none are allowed which make further state changes).
If an attacker connects to the wormhole, mis-guesses the secret, and trashes it:
At this point, the user can re-load their page to get a new wormhole code and try again.
If an attacker connects to the wormhole, correctly guesses the secret, and picks up the configuration data:
At this point, the user is back to the same state they would be in if they lost their wormhole code or let it expire. They can reload the page to get a new wormhole code and try again.
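To make the walkthrough above concrete, here is a minimal sketch of the states and transitions as Python. The names and structure are illustrative only, not the actual implementation:

```python
from enum import Enum, auto

class SignupState(Enum):
    # States a subscription's wormhole exchange can be in, per the flow above.
    CODE_ISSUED = auto()       # signup complete, wormhole code displayed
    CODE_EXPIRED = auto()      # code timed out before anyone connected
    CODE_TRASHED = auto()      # attacker mis-guessed the secret, code burned
    CONFIG_DELIVERED = auto()  # someone (user or attacker) picked up the config
    ACTIVE = auto()            # subscription active, no further signup changes

def reload_page(state):
    """Reloading the code page re-issues a fresh code from any recoverable state.

    This covers expiry, a trashed code, and an attacker having stolen the
    config; once the subscription is ACTIVE, no signup-related state changes
    are allowed.
    """
    if state in (SignupState.CODE_ISSUED, SignupState.CODE_EXPIRED,
                 SignupState.CODE_TRASHED, SignupState.CONFIG_DELIVERED):
        return SignupState.CODE_ISSUED
    return state

def pickup(state):
    """A wormhole client correctly guesses the secret and completes the exchange.

    The server cannot tell the legitimate user from an attacker here; if it was
    the user, the subscription is effectively active, otherwise the user
    reloads the page for a new code (see reload_page above).
    """
    if state is SignupState.CODE_ISSUED:
        return SignupState.CONFIG_DELIVERED
    return state
```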
Practically speaking, the bits of the above flow that involve creating a wormhole should actually move out of the signup webserver. It effectively needs its own convergence loop so that it can react to database state changes and recover from restarts. The database will end up being used as an RPC mechanism so that the signup webserver can spit a wormhole code out at the user.
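For illustration, that convergence loop might look roughly like this (the database interface and field names are invented for the sketch; the real store would be the subscription manager's database):

```python
import time

def converge_once(db, allocate_wormhole_code):
    """One pass: find subscriptions needing a wormhole code and allocate one.

    `db` is any mapping-like store of subscription records, and
    `allocate_wormhole_code` opens a wormhole and returns its code. Writing
    the code back to the database is what lets the signup web server (reading
    the same database) show it to the user -- the database-as-RPC part -- and
    what makes restarts recoverable: on startup the agent just re-runs this
    pass and allocates codes for any record still lacking one.
    """
    for sub_id, record in db.items():
        if record.get("status") == "pending" and "wormhole_code" not in record:
            record["wormhole_code"] = allocate_wormhole_code(sub_id)

def convergence_loop(db, allocate_wormhole_code, interval=5.0):
    # Run forever, reacting to database state changes.
    while True:
        converge_once(db, allocate_wormhole_code)
        time.sleep(interval)
```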
Going into some detail about that whole "move out of the signup webserver" comment:
The interactions here where the subscription manager database is used as RPC between the wormhole invite agent and the signup web server do not make me particularly happy. Probably the thing to do instead is something like:
This is in addition to keeping the wormhole code in the database and polling that database (because we still want to be able to recover from process restarts). It will significantly reduce the polling interval required to provide a good experience. It is more complexity and it is basically an optimization ... but it seems like a necessary optimization. We don't want the user waiting for minutes at the signup web form. However, if we go back to using email then potentially this is less of a concern since waiting a minute or two for the signup email is a more familiar experience. But that does require that we start sending emails again...
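For illustration, the notification layered on top of the database might look like this in-process sketch (names invented; across separate processes it would be an HTTP call or message queue instead, with the database still consulted on restart):

```python
import threading

class CodeNotifier:
    """Wakes waiters as soon as a wormhole code lands in the database.

    The database stays the source of truth (so restarts still recover by
    re-reading it); the event just spares the signup web server from polling
    at a tight interval while the user waits at the signup form.
    """
    def __init__(self):
        self._events = {}
        self._lock = threading.Lock()

    def _event_for(self, sub_id):
        with self._lock:
            return self._events.setdefault(sub_id, threading.Event())

    def notify(self, sub_id):
        # Called by the wormhole invite agent after persisting the code.
        self._event_for(sub_id).set()

    def wait_for_code(self, db, sub_id, timeout=60.0):
        # Fast path: the code may already be persisted.
        code = db.get(sub_id, {}).get("wormhole_code")
        if code is not None:
            return code
        self._event_for(sub_id).wait(timeout)
        return db.get(sub_id, {}).get("wormhole_code")
```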
Overall, the whole process here is pretty complicated. There are a lot of moving parts and a lot of steps to get through the whole thing. One simplification would be to keep the wormhole invite agent inside the signup web server, where it is now. We could presumably still have process restart recovery via the exact same mechanism (ask the database) but without all of the extra RPC. This is probably something to consider (it throws a monkey-wrench into the implementation plan for the wormhole invite agent (to use Haskell - because the signup web server is Python - possibly the whole thing should be ported to Haskell?) but the monolith is still probably less complexity than this orchestra of microservices).
What are the other options?
e.g. I imagine if it was "use email" a lot of the above would be substantially the same (except the "notify the user" part becomes "send email" instead of "render a magic-wormhole code on a web page")?
(I guess all I'm saying is: it's not obvious to me what a "way simpler" thing looks like?)
"Use email to deliver the wormhole code" would be a significant simplication, I think. That removes a lot of the interaction between the signup web server and the wormhole invite agent. The signup web server, after a successful POST to the backend, could just say "Okay you signed up check your email" and be done. The wormhole invite agent could send the email when it allocates the code. It still needs to persist it with the subscription manager but at least it doesn't have to get it back to the signup web server.
This isn't the direction I had been thinking but it's definitely worth considering.
Another simplification would be offering the user a download directly from the signup web server. This would go something like:
This avoids any new cross-process interactions on the backend (signup server still has to populate subscription manager but that's done already). If we make the download time-limited (or even single-use) then we retain security properties that at least superficially resemble those of magic-wormhole.
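A time-limited, single-use download could be implemented with little more than an expiring one-shot token, roughly like this (a sketch; TTL and token length are illustrative choices):

```python
import secrets
import time

class DownloadTokens:
    """Single-use, expiring tokens for fetching the configuration directly.

    Unlike a wormhole code there is no PAKE here, so the token itself must
    be long enough to be unguessable; single-use plus a short expiry then
    recovers properties superficially similar to magic-wormhole's.
    """
    def __init__(self, ttl=600.0, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable for testing
        self._tokens = {}    # token -> (config, expiry)

    def issue(self, config):
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (config, self._clock() + self._ttl)
        return token

    def redeem(self, token):
        entry = self._tokens.pop(token, None)  # pop makes it single-use
        if entry is None:
            return None
        config, expiry = entry
        if self._clock() > expiry:
            return None
        return config
```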
I alluded to some of these points in an earlier email, but for the sake of transparency (and since the conversation is now happening here -- which is great!), here are some of the reasons why I think providing "a pre-configured Gridsync" (which I interpret here to mean a downloadable binary distribution with the customer-specific fURL already burned into the package) would be a Bad Idea:
It arguably violates the Principle of Least Authority: Customers would suddenly need to trust their service provider not to ship malicious binaries to them, whereas with the current model, customers can at least choose to build the application on their own from source (or install it, e.g., from a future Debian repo) and sign up for S4 independently. Indeed, requiring the customer to additionally depend on the provider to build and ship "custom" applications for their machines -- as well-intentioned as that may be -- arguably removes one of the primary "selling points" of using Tahoe-LAFS-based storage/S4 in the first place: namely, that the storage provider need only be depended upon to ensure the availability of ciphertext. Consider, also, how violating this principle could make Least Authority (the company) a target for a whole new host of attacks (since pwning your build infrastructure makes it possible to pwn your customers), including, potentially, legally mandated ones.
It removes the additional authenticity/integrity checks afforded by PGP signatures: If every downloaded application is being custom built such that the resultant artifacts differ, there is no quick and easy way, say, for Customer-1 to verify that their downloaded application functions identically to that of Customer-2, or that it hasn't otherwise been tampered with, etc. PGP sucks, yes (and I grant that there are some workarounds here -- like signing hashes of the files that wouldn't change, I suppose) but SSL/TLS and the hot mess of certificate authorities arguably shouldn't be the only way to verify the authenticity/integrity of a downloaded package.
It excludes "vanilla" Tahoe-LAFS users: There are plenty of reasons why S4 customers might prefer to use the standard Tahoe-LAFS CLI over Gridsync (e.g., for `tahoe backup`-centric use-cases). I'd argue that we should strive to support these users (especially considering that the next version of Tahoe-LAFS will seamlessly support grid-invites over magic-wormhole via the `tahoe invite` set of commands -- which might open up new opportunities for new customers); interoperability and giving customers the choice of which client to use is arguably preferable to client lock-in (and might even help to encourage the development of new and better tahoe clients in the future).
It will require maintaining and/or paying for additional hardware: we'll need a Mac to make/ship Apple Disk Image (.dmg) files and, if/when the time comes, a Windows box/VPS to repack MSI/NSIS/Inno Setup installers. These will need to be integrated into the existing horde of microservices in some way that doesn't suck, will further increase attack surface, and will probably be a huge pain to set up and maintain.
All that being said, I, at least, would be strongly in favor of maintaining a wormhole-centric setup flow: setting aside the already-sunk costs, the security properties are great and the overall configuration UX is considerably better than it was before (plus exposing users to a wormhole code on first run importantly helps to familiarize them with using the same mechanism later for adding additional devices, sharing magic-folders, etc.). Beyond that, it's also something that, as far as I know, is wholly unique to S4 and was well-received in past user-testing (after participants got over the initial conceptual hurdles and understood how it actually worked). Nevertheless, I recognize that @exarkun's time is both limited and valuable as-is, such that sending the invite codes over email (rather than relaying back to the signup server) sounds to me like a reasonable sacrifice or trade-off to consider.
Failing that, Gridsync can load the received/downloaded configuration settings as a file: the (unencrypted) "recovery file" format is just a simple JSON dump of the settings received previously through the wormhole (with the addition of an optional "rootcap" field that gets added later and wouldn't apply here). I like this option considerably less than wormhole-over-email, however, as it introduces additional steps for the user, potentially increases the risk of exposure, and results in a clunkier, more confusing UX (since users will be asked to export another file shortly after loading the one they just did and may mistakenly think that the second one -- which actually contains their freshly-generated and very important rootcap -- is unnecessary if they keep the first). There might be other ways around this, however... When I first started hacking on Gridsync, I experimented with a `gridsync://` URI format (that I originally attempted to document here) which, I hoped, would provide "one-click" access to various tahoe resources. It obviously didn't pan out (IIRC, custom URIs required some convoluted Windows Registry changes and/or administrator access to register), but it would be pretty nice if a user could just click a `gridsync://` (or `tahoe://` or `lafs://`) "link" in their browser to have their already-installed tahoe client join that grid (or magic-folder). Perhaps I should revisit this..
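If the `gridsync://` idea were ever revisited, such a link could carry the same fields as the wormhole payload; a sketch of parsing a (purely hypothetical -- no such format is actually specified) join link with the standard library:

```python
from urllib.parse import urlsplit, parse_qs

def parse_grid_uri(uri):
    """Parse a hypothetical gridsync:// (or tahoe:// / lafs://) join link.

    The scheme names and query parameters here are invented for
    illustration; nothing in Gridsync or Tahoe-LAFS defines this format.
    """
    parts = urlsplit(uri)
    if parts.scheme not in ("gridsync", "tahoe", "lafs"):
        raise ValueError("not a grid join link: %r" % uri)
    params = parse_qs(parts.query)
    return {
        "nickname": parts.netloc,
        "introducer": params.get("introducer", [None])[0],
    }
```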
I definitely agree that training users to do anything besides "get the software from The One True Place" should be avoided.
How about something like this? This changes the interactions after Subscription Manager has made the "pending" reply. All lines are some kind of RPC (e.g. could be HTTP requests). The Subscription Manager here is the only thing that modifies the subscription database -- it syncs before sending the "pending" back, and syncs after (and before?) allocating the wormhole from the agent. Ties the "wormhole invite agent" in, and doesn't use the database for pub-sub (because Subscription Manager explicitly calls the Wormhole Agent).
(Hmm, trouble pasting files?)
```
seqdiag {
  "client" -> "web server" [label="GET"];
  "web server" -> "subscription manager" [label="deploy", leftnote="set cookie"];
  "web server" <- "subscription manager" [label="pending: 'id'", rightnote="sync db"];
  "client" <- "web server" [label="doing stuff"];
  === Arbitrary time could pass (e.g. user finally downloads software and clicks 'ready for code' or something) ===
  "client" -> "web server" [label="get code"];
  "web server" -> "subscription manager" [label="get code: 'id'"];
  "subscription manager" -> "wormhole invite agent" [label="alloc", leftnote="sync db"];
  "subscription manager" <- "wormhole invite agent" [label="got code: 1-foo-bar", leftnote="sync db?"];
  "web server" <- "subscription manager" [label="got code: 1-foo-bar"];
  "client" <- "web server" [label="got code: 1-foo-bar"];
  === Client downloads GridSync ===
  "gridsync" -> "wormhole invite agent" [label="open wormhole: 1-foo-bar"];
  "wormhole invite agent" -> "subscription manager" [label="wormhole opened", leftnote="sync db"];
  "wormhole invite agent" <- "subscription manager" [label="JSON"];
  "gridsync" <- "wormhole invite agent" [label="deliver JSON"];
}
```
To build: `pip install seqdiag`, then `seqdiag the_above_file.diag` should spit out a similarly named PNG file. Which GitHub won't let me attach :/
Change S4 sign-up process to use LA's magic wormhole server (not fURL). See: https://github.com/gridsync/gridsync/blob/master/docs/invite.rst
The current signup process involves emailing tahoe configuration parameters to the user, which has several shortcomings.
It should be noted, though, that the cleartext tahoe configuration transferred this way only grants access to use the storage server. It does not convey the ability to read or write any data the legitimate user has uploaded to that storage server (doing so requires the caps for that data - those caps are not part of this exchange).
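For concreteness, the configuration conveyed is on the order of the following (field names modeled loosely on GridSync's settings format; the specific values and fURL are made up for illustration):

```python
import json

# Illustrative settings payload: enough to connect to and use the storage
# grid, but containing no readcaps or writecaps for any stored data --
# those stay with the user.
settings = {
    "nickname": "Least Authority S4",
    "introducer": "pb://exampletubid@example.org:1234/introducer",  # not a real fURL
    "shares-needed": "1",
    "shares-happy": "1",
    "shares-total": "1",
}
payload = json.dumps(settings)
```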
A magic-wormhole-enabled signup, in which the tahoe configuration is conveyed to the user through a wormhole, addresses some of these shortcomings.
However, the use of wormhole codes involves operation of a wormhole rendezvous server. There are some operational concerns with doing so. These are discussed in the wormhole documentation itself.