Closed: dlorenc closed this issue 11 months ago.
Let's get a rough plan of the end goal and what it's going to do, then figure out how to get from here to there!
Somewhat stream-of-thought notes based on the community call yesterday:
Terminology note:

- `sget-go` refers to the Go code in this repo (sigstore/cosign), in `cmd/sget` -- we think we want to phase this out.
- `sget-rs` refers to the Rust code currently in sigstore/sget -- this currently focuses on OCI fetching + policy; we think we want to remove OCI functionality from this at least.
- `sget-new` refers to some yet-to-be-built tool that focuses on fetching URLs, verifying signatures + policy if present, checking Rekor, etc. -- focused on replacing `curl | sh` use cases. I think we all agree this is what we want to build.

First, come up with a plan that answers the questions:

- what language should `sget-new` be written in? (Rust vs Go, or something else)
  - why is that language the right choice, practically speaking?
- what does `sget-new` need to do to be considered ready for release into the wild?
  - not "fully functional", just MVP
  - it needs to do something more than just fetch, we have `curl` for that today
- who wants to work on this, and where? sigstore/sget exists, modulo language question above.

Finally, with that plan in place (not yet done, just agreed upon):

- announce `sget-go`'s deprecation
- remove `sget-go` from the repo (or rename it)
- remove `sget-go` from any downstream packages that distribute it (Alpine, maybe others?)
- if `sget-new` is not in Rust, or is easier to build from the ground up in Rust: announce `sget-rs`'s deprecation, and remove the repo or remove the code from the repo
- otherwise: `sget-rs` adds URL fetching, removes OCI fetching, `sget-rs` == `sget-new`, and it eventually becomes the MVP and is released

Anything missing? Anything horribly incorrect?
Oh! Another point:
If `cosign` wants to support fetching a blob from OCI, verifying signatures (+policy?) and printing it to stdout, that seems entirely within its newly staked out scope, different enough from `sget` not to be too controversial. 🤔
So maybe `sget-go` just migrates its code into `cosign` instead of being deleted entirely. The standalone `sget-go` tool could still be removed, or just become a verification-only client for fetching OCI blobs. Its name should probably change though... at that point, anything `cosign` wants to do to organize itself is up to `cosign`.
Just a note: if sget-go is deprecated/deleted and OCI features are removed from sget-rs (or sget-$lang), you won't have any way of retrieving blobs / signing materials from an OCI registry. This might be OK, but keep in mind that the cosign policy code is present as well, to work with the policy validation code that is in sget-rs: https://github.com/sigstore/sget/blob/main/src/policy.rs
I think this is missing the "what should sget-new do" part still.
A skeleton command-line surface or mock showing how a user would interact with it, and what checks it should perform. If we start with that, then we can figure out the best language and repo for the tool to live in.
> Just a note: If sget-go is deprecated/deleted and OCI features are removed from sget-rs (or sget-$lang), you won't have any way of retrieving blobs / signing materials from an OCI registry. This might be OK, but keep in mind the cosign policy code is present as well to work with the policy validation code that is in sget-rs https://github.com/sigstore/sget/blob/main/src/policy.rs
Fetching-verifying-policying blobs in OCI could move to `cosign`: https://github.com/sigstore/cosign/issues/1363#issuecomment-1022324509
> I think this is missing the "what should sget-new do" part still.
"Replace `curl` for piping into `sh`" seems like the consensus, at least to me:

```diff
- curl https://my.site/install.sh | sh
+ sget https://my.site/install.sh | sh
```

What it does behind the scenes to make that a safer alternative to `curl`, and how drop-in it could be in all cases, is still TBD -- ideas welcome! -- but that's the goalpost I'm proposing for now.
Or: if we think we want `sget-new` to include OCI-fetching and npm-fetching and anything-fetching, then it seems like a better path forward in that case is to (1) write that in Go, sharing the wealth of existing code, and (2) make it a fetch-only variant of `cosign`, which could also gain these powers.
In that future, it's not `sget <url>`, it's probably `cosign get-url <url>` or something less catchy -- but that's what `alias` is for 🙂.
What would be the storage medium here if OCI is out of the picture?
I know you like the idea of HTTP, @imjasonh, but it's a sticky one to get working with signing materials (how to map an artifact to a signature; websites get hacked a lot). I honestly love the git method (I even prefer it over OCI), but unfortunately the very meagre rate limits GitHub / GitLab have for API access mean it won't work on any project that sees a decent amount of hits, so that's dead in the water now :(
This is feeling like we need another meeting :) What do we think about trying to set up an hour next week to get in sync around the long-term vision? I think we're close but not on the exact same page.
> ideas welcome!
Here's one idea for what a flow might look like for the MVP.

1. The user copies a command from a project README and runs it:

   ```
   $ sget https://my.site/install.sh | sh
   ```

2. `sget` acts as an HTTP client to download the resource (and follows a normal TLS validation process when using HTTPS, just like `curl` does).
3. Before sending any bytes to stdout, we verify the bytes we just retrieved.
4. We send the entire payload to stdout, such that the user can pipe those bytes to a command like `sh` as needed.
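To make that flow concrete, here's a minimal sketch in Go (chosen only because the thread later converges on Go). Everything here is illustrative: `verifyPayload` is a hypothetical stand-in for the open verification design (Rekor lookup, signature check against trusted identities), not a real sigstore API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"net/http"
	"os"
)

// verifyPayload is a hypothetical stand-in for the real verification step:
// a real implementation would look up the payload's digest in Rekor and
// check the signing identity against the user's trust configuration.
func verifyPayload(payload []byte) error {
	digest := sha256.Sum256(payload)
	_ = hex.EncodeToString(digest[:]) // this digest would drive the Rekor lookup
	return nil                        // sketch: treat everything as trusted
}

// fetchVerifyStream downloads a URL, buffers it fully, verifies it, and
// only then writes it to out -- so nothing unverified ever reaches `sh`.
func fetchVerifyStream(url string, out io.Writer) error {
	resp, err := http.Get(url) // normal TLS validation, just like curl
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	payload, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}
	if err := verifyPayload(payload); err != nil {
		return fmt.Errorf("verification failed: %w", err)
	}
	_, err = out.Write(payload)
	return err
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: sget <url>")
		os.Exit(1)
	}
	if err := fetchVerifyStream(os.Args[1], os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

The key property is that the payload is fully buffered and verified before a single byte reaches stdout, which is what makes `sget ... | sh` safer than `curl ... | sh`.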
IMHO, this is the hardest and most important piece of the user experience. I don't fully understand the TUF-like system that was proposed earlier, and I don't think we should expose something like TUF to users under normal circumstances (if ever), but I do like the idea of using OIDC IDs (e.g. `dan.luhring@anchore.com`) as a way for the user to specify trusted identities.
I'm wondering if we need some kind of local state on the user's machine that specifies the identities that the user trusts. I'm not sure what/who these identities should be, and how they should be represented. Maybe we can prompt the user for trusting new identities as we encounter them (e.g. exit non-zero, with an error message like "You don't trust the signer; if you want to start trusting them, run command X."). If we're going for mass adoption, I think we should avoid anything too esoteric or difficult to use.
There's very little work for project maintainers to do: sign your artifact (e.g. shell script), such that the signature is sent to Rekor. No other HTTP resources besides the artifact (e.g. shell script) need to exist on the remote server.
And there's even less for the user to do: just run `sget`. After the MVP, we can add more configurability to how verification works.
@luhring this sounds great, thanks for writing that up! I think there are still some questions to sort out, but I think that's roughly along the lines of something we'd need for this to work for arbitrary URLs, especially when the maintainers might not have done anything to enable better assurances for `sget` users specifically.
> I'm wondering if we need some kind of local state on the user's machine that specifies the identities that the user trusts.
This seems unavoidable, basically. Do we need some separation of globally trusted identities and per-site trusted identities? I might trust anything Luke has signed, but might only trust something Dan signs in the context of Dan's site. Or is trust a globally binary state? I'm not sure.
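To sketch what that separation could look like in code (the shape and names here are entirely hypothetical, not an existing sget or cosign structure), a local trust config might distinguish globally trusted identities from per-site ones:

```go
package main

import "fmt"

// TrustConfig is a hypothetical local trust store, mirroring the discussion:
// some identities are trusted everywhere, others only for a particular host.
type TrustConfig struct {
	Global  []string            // identities trusted for any URL
	PerSite map[string][]string // host -> identities trusted only for that host
}

// Trusted reports whether a signing identity is trusted for a given host.
func (c TrustConfig) Trusted(identity, host string) bool {
	for _, id := range c.Global {
		if id == identity {
			return true
		}
	}
	for _, id := range c.PerSite[host] {
		if id == identity {
			return true
		}
	}
	return false
}

func main() {
	cfg := TrustConfig{
		Global:  []string{"luke@example.com"},
		PerSite: map[string][]string{"dans.site": {"dan.luhring@anchore.com"}},
	}
	fmt.Println(cfg.Trusted("luke@example.com", "anywhere.example"))     // true
	fmt.Println(cfg.Trusted("dan.luhring@anchore.com", "dans.site"))     // true
	fmt.Println(cfg.Trusted("dan.luhring@anchore.com", "other.example")) // false
}
```

In this model trust is not globally binary: the same identity can be trusted in one context and not another, which matches the "I trust Dan, but only on Dan's site" case above.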
In the absence of trusted identities, can we show a warning like "NNN users have marked this as trustworthy, here are a handful of them: `a@foo.com`, `b@bar.com`, etc."? This would tell us that the file has already been widely fetched and marked as okay, by users willing to attach their username to it. It's still spammable by bad actors, but sampling those can help prevent that, if all the sampled identities seem sketchy or share a spammy domain.
Both of these require some flow for prompting users to note their trust of the artifact, which might also be challenging. Once you `sget | sh`, how can we prompt the user to tell us that they should put their username into Rekor to vouch for the artifact? `sget` could block, show the fetched contents, and prompt for an approval before piping to stdout, but that's a bit of a speedbump if you just want to get to the `| sh` part. And it won't work at all in a headless CI mode.
> There's very little work for project maintainers to do: sign your artifact (e.g. shell script), such that the signature is sent to Rekor. No other HTTP resources besides the artifact (e.g. shell script) need to exist on the remote server.
I think nailing this will be crucial to adoption, especially early on, while `sget` has not yet established its value to consumers or maintainers. Once folks are getting value from `sget` with no effort on the part of the maintainers, it's easier to convince them that it's worth adding their signatures and policies to make it even more valuable.
I've been thinking a lot about Jason's comment above. This trust thing is a fascinating problem!
> Do we need some separation of globally trusted identities and per-site trusted identities? I might trust anything Luke has signed, but might only trust something Dan signs in the context of Dan's site.
I definitely see value in this distinction (it reminds me vaguely of SSH config: "here are my trust settings for any host, here are my settings for host X, etc."). I think something like this would be good to include. I'm wondering if it should be in the MVP or not.
> In the absence of trusted identities, can we show a warning like "NNN users have marked this as trustworthy, here are a handful of them: a@foo.com, b@bar.com, etc." -- this would tell us that the file has already been widely fetched and marked as okay, by users willing to attach their username to it.
I love this! ❤️ It adds a social component to the world of OSS software installation. It reminds me a bit of GitHub stars. "Oh, I see Dan Lorenc trusts this thing? Okay yeah, I'm willing to trust it, too."
Tangent: It might be neat to have a web UI to showcase this, maybe sourcing data from Rekor, to make things more grokkable for the average user. I'm envisioning a way to browse signed things (e.g. scripts) (indexed by digest and/or content URL). And for each thing, you can see how many identities have signed it, and for each identity, some context about who they are (thanks to OIDC). Maybe you can also see each person's list of things they trust.
If I'm getting too web-of-trust-y, call me out! 🙂
I think it would be helpful if, when `sget`-retrievable things are signed, the URL gets included in the signature payload (maybe as an annotation?). This could help the user verify that the resource author intended for a given resource to be what's provided by a specific URL.
For example, without this in place, the following would be possible:

- `sget https://my.site/v1/install.sh` -> bytes for v1's script -> signed by the correct identity -> valid (this is expected ✅)
- `sget https://my.site/v2/install.sh` -> bytes for v1's script -> signed by the correct identity -> valid (oh no! 😱)

(In other words, this could help prevent the "rollback" and "indefinite freeze" attacks described on the TUF site.)
If the signature payload included the intended URL, `sget` could further verify that the provided URL matches the URL `sget` was given by the user.
Also, users may want `sget` to have the option to explicitly skip this additional validation. Perhaps the user knows that they're fetching content from a mirror/proxy/etc., and they wouldn't expect the URL to be known at the time of signing, so they're willing to accept losing this validation.
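A sketch of that check in Go (the function and its inputs are hypothetical; it assumes the signed URL would arrive as an annotation in the signature payload, with a skip switch for the mirror/proxy case):

```go
package main

import (
	"errors"
	"fmt"
)

// errURLMismatch would signal a possible rollback / freeze attack: the
// signed payload claims to belong at a different URL than the one fetched.
var errURLMismatch = errors.New("signed URL does not match requested URL")

// checkSignedURL is hypothetical: signedURL would come from an annotation
// in the signature payload; skip models the proposed opt-out for mirrors.
func checkSignedURL(signedURL, requestedURL string, skip bool) error {
	if skip {
		return nil
	}
	if signedURL != requestedURL {
		return errURLMismatch
	}
	return nil
}

func main() {
	// v2's URL serving v1's (validly signed) bytes is caught:
	err := checkSignedURL("https://my.site/v1/install.sh",
		"https://my.site/v2/install.sh", false)
	fmt.Println(err != nil) // true

	// A mirror user can explicitly opt out of the URL check:
	fmt.Println(checkSignedURL("https://my.site/v1/install.sh",
		"https://mirror.example/install.sh", true)) // <nil>
}
```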
Another thought, for later on:
I've been thinking about the "requirements" concept in Apple's code signing system. This might relate to the claims/annotations feature in Cosign (although I'm still learning about this feature). This is a way for the software publisher to specify additional constraints for the verifying client to check, in addition to the raw signature being verified successfully. These "requirements" can be things like "the root cert should be Apple's root CA", or "the signing cert should be X", etc.
Down the road, I could see these making sense as optional flags to `sget`, especially since `sget` commands are probably just copy-and-pasted 99% of the time. For example, a project's README.md might show this command:

```
$ sget https://my.site/install.sh --required-signer="jason@hall.com,dan@lorenc.com,luke@hinds.com" | sh
```
This would serve to narrow the scope of what signatures are considered valid. If `sget` determines the signature is otherwise valid, but none of the signers listed in the command match the identity attached to the signature, validation would fail.
This wouldn't protect against attacks on the README content. But it might be a nice, simple mechanism for "trusting the right thing for this particular installation". The other advantage is that it could provide a secure approach to a stateless user environment, provided the `sget` client does trust whatever root CA was used for the signatures (e.g. Fulcio).
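A sketch of how that flag's check might work (the flag name `--required-signer` comes from the example above; the function and its comma-separated parsing are hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// signerAllowed sketches the proposed --required-signer semantics: an
// otherwise-valid signature is accepted only if its identity appears in
// the user-supplied comma-separated allowlist.
func signerAllowed(requiredSigners, signer string) bool {
	for _, s := range strings.Split(requiredSigners, ",") {
		if strings.TrimSpace(s) == signer {
			return true
		}
	}
	return false
}

func main() {
	allow := "jason@hall.com,dan@lorenc.com,luke@hinds.com"
	fmt.Println(signerAllowed(allow, "dan@lorenc.com"))       // true
	fmt.Println(signerAllowed(allow, "mallory@evil.example")) // false
}
```

Note this narrows, rather than replaces, the underlying signature verification: the signature must still chain to a trusted root (e.g. Fulcio) before the allowlist is consulted.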
> First, come up with a plan that answers the questions:
>
> - what language should `sget-new` be written in? (Rust vs Go, or something else)
>   - why is that language the right choice, practically speaking?
> - what does `sget-new` need to do to be considered ready for release into the wild?
>   - not "fully functional", just MVP
>   - it needs to do something more than just fetch, we have `curl` for that today
> - who wants to work on this, and where? sigstore/sget exists, modulo language question above.
These sound like a good set of questions to answer tomorrow in the call. I'll suggest we do it in a bit different of an order: first, who is going to use `sget-new`, and what are they going to use it for? Then the rest of the questions above.
I really recommend that, before delving into questions such as how the UX will play out, what language, flags, etc., we first define and outline the current and projected scope of sget-go. Otherwise you risk misunderstandings coming up later on.
> I really recommend that, before delving into questions such as how the UX will play out, what language, flags, etc., we first define and outline the current and projected scope of sget-go. Otherwise you risk misunderstandings coming up later on.
I think that's a bit circular - sget-go is on pause until we resolve what's happening with the other sget(s). I can explain what I originally wanted to do here with it, but we're trying to make sure there's only one sget and all of these plans can change to make sure we get one thing everyone is happy with.
So we met, and I think we all in general agreed to the following course of action:

- rename `github.com/sigstore/sget` to `github.com/sigstore/sget-archived` (name TBD)
- create a new `github.com/sigstore/sget` repo and copy cosign's `cmd/sget` code into the new repo
- for now, keep cosign's `cmd/sget` as a wrapper around the imported `sigstore/sget` codebase, and eventually remove it after any package managers who use that code have pointed to the "real" sget repo
- build `sget` in its own repo, in Go, focusing on URL endpoints
- the `sget` tool will also have functionality to fetch and verify blobs in OCI registries, inherited from `sget-go`
(I'm probably missing things, please add anything else!)
Folks in the call seemed pretty aligned that a more secure curl alternative was useful, and that we have some ideas we want to explore and build out, and that this is an agreeable path toward that future.
@lukehinds we wanted to get your feedback on the plan before we started clicking buttons. WDYT?
LGTM!
> - for now, keep cosign's `cmd/sget` as a wrapper around the imported `sigstore/sget` codebase, and eventually remove it after any package managers who use that code have pointed to the "real" sget repo.

This one actually might be too hard to do without circular imports, happy to punt on that one if it complicates stuff too much.
> - create a new `github.com/sigstore/sget` repo and copy cosign's `cmd/sget` code into the new repo

I think there's a `git filter-tree` thing that works for this so we don't lose history/authorship too.
> This one actually might be too hard to do without circular imports, happy to punt on that one if it complicates stuff too much.

That sgtm too. 🙂 If we don't keep it as a wrapper, we just have to coordinate with packagers who expect to find it there.
> I think there's a `git filter-tree` thing that works for this so we don't lose history/authorship too.

Sounds like you're volunteering 🙂
> Sounds like you're volunteering 🙂

Why not! Let's give it a try.
From @lukehinds in Slack:

> @Dan Luhring sounds ok to me. I don't quite get how URL endpoints will be used, but figure you plan to work that out later?
Sounds like we can start clicking buttons 🙂
Thanks everyone!
Closing as outdated, as sget has been archived.
Cc @imjasonh