Closed wthayer closed 4 years ago
Apologies for the delay here Wayne, I'll chase down cr when she is back and we will investigate implementing this change.
TLS Canary relies on mozilla/OneCRL-Tools to fetch revocations updates, so the first step would be an update there to support the new distribution method. I'm in touch with @mozmark to coordinate the changes.
Let me outline how it currently works:
oneCRL2RevocationsTxt
in https://github.com/mozilla/tls-canary/blob/c0ef8c30e404af6088c59c79e3b548f5578ea410/tlscanary/one_crl_downloader.py#L17 with the desired configuration.revocations.txt
output file and puts it in the profile directory.The proposed change is to completely separate OneCRL logic from Canary logic by amending OneCRL-Tools with a tool that directly updates the profile directory instead. TLS Canary's involvement with the process would be reduced to:
oneCRL2Profile
with the desired configuration and a path to the profile directory.And then OneCRL-Tool takes care of updating the profile in whatever way is right, and the Canary wouldn't have to care how it works.
But again, this is just my proposal and it still needs to be coordinated with the OneCRL-Tools dev.
Howdy @cr,
I work with Kathleen and Wayne over at the CCADB where we recently deployed a tool that deserializes revocations.txt
, OneCRL, Kinto, and cert_storage
into a shared intermediate representation for the purpose of diffing. So I think I may have the relevant state to help out here.
First, the broad stroked top level requirements. We would like a tool that:
cert_storage
format into the correct spot in the given FF profile.Is that a fair shake?
In terms of implementation complications, I gave it a quick scoping glance this morning and came across some stumbles. So the FF cert_storage
is an LMDB. The tool I mentioned before that we deployed at the CCADB is written in Rust and does a fine job of reading this data. However, I see that get_list
is cloning, checking out, and building the tool on the fly, hence the TLS Canary's dependency on the Go compiler. This fine, although the Rust compiler would be a fairly hefty additional dependency for the TLS Canary. Now, there is a Golang LMDB binding, however it is just that - a binding to the underlying C libs, which means that this is no better than Rust as using this library would add a C compiler as a dependency (in addition to God knows what other C objects may be necessary, making reproducible client builds difficult).
Long story short - would you folks be amenable to adding a Rust compiler as a dependency? Or perhaps I can work with the OneCRL folks to have tagged released with prebuilt binaries (@jcjones)?
Hi Christopher,
that's a super interesting new angle to the issue. Can you point me to source code for your tooling?
Is that a fair shake?
Yes, at least that's part of it. TLS Canary needs to support Firefox versions all the way down to ESR, so its OneCRL backend tooling needs to handle the profile in the way the specific Firefox version requires, which would include falling back to revocations.txt where necessary.
would you folks be amenable to adding a Rust compiler as a dependency?
Your analysis of how the canary uses golang is precise. As trivial as that Go-on-the-go code in get_list()
may look, it took a major effort to stabilize it across multiple Go versions on multiple platforms. Our main targets are Ubuntu LTE/AWS, MacOS, and Windows 10, and frankly, I am quite surprised at how stable it has been so far. Doing the same for cargo and Rust would likely be a relatively big project as well, so that would require some planning and coordination I can't make on the spot.
Switching to C, by the way, would be way worse than either Go or Rust, as the canary would need to implement building with various compilers and toolchains to retain cross-platform compatibility.
I'm beginning to wonder how hard a re-implementation in python would be for this specific purpose. There's plenty of lmdb bindings to choose from for python, but I don't know how involved the OneCRL parts would be.
Let's also pull @mozmark into this discussion who has snuggled with the OneCRL-Tools code before.
So I can definitely see the pain that may have occurred when trying to wrangle go get
as the TLS Canary is dependent upon Go 1.7, which did not have first class dependency vendoring at the time and required $GOPATH
shenanigans.
I will say that in Rust (specifically cargo) it is far easier to get out-of-box reproducible builds and dependency satisfaction. Really, a single cargo run
should suffice.
WRT to what it would take to do this in Python we need:
cert_storage
requires bit-whacking of raw binary data (see this tomfoolery for a whiff). It's been awhile since I've programmed Python sincerely, however my recollection is that Python does too much "magic" for you when you're trying to deal with raw binary fields (auto conversion to a BigInt type upon overflow, for example). This excessive magic may-or-may-not be a problem. We'd have to see, but it is certainly possible.Irrespective of specific implementation, this is going to need a strategy for testing any changes made as I need to be sure that data inserted into cert_storage
is actually taking effect for local development. @jcjones would it be as simple as inserting the leaf from, say, mozilla.org in my local Nightly and confirming that FF throws a fit when accessing the page?
Howdy @cr, I got some rust code on my local working this morning that can insert into cert_storage and see revocation warnings show up in FF nightly. I'll give LMDB in Python a shot, though, as you are right that would make dependency management easier and eliminate the need for shaky subprocess calls. Worse comes to worse, we do have a chunk of code that works, however.
I don't have strong feelings about introducing Rust dependencies; our tooling will likely depend increasingly on rust crates (for sharing implementation with new code in Firefox) so it's a cost we'd eventually have to swallow (even if dependency issues are a problem in the short term).
Well that certainly simplifies things, especially since my attempt at using Python's LMDB implementation resulted in mysterious internal explosions from the library that I would rather not deal with.
I'll flesh out the rest of the tool over the weekend. Of course, things like clap
, serde
, and reqwest
are great for stuff like this, but it looks like the TLS Canary is pulling a building from a clean state so I'll try to keep compile times low.
In bug 1429796, the use of revocations.txt to store OneCRL revocations is deprecated and replaced with the cert-storage database. This change is blocked until some crashes are fixed. Once it goes to Release, we will need TLS Canary to be updated to do the right thing when a custom list of revocations is sent to it via the -o flag, so that we can continue to test OneCRL updates before they are deployed.