seedwing-io / seedwing-proxy

Policy-enforcing Artifact Proxy
Apache License 2.0

Create minimal integration test and/or example working with crates.io #2

Open lkatalin opened 1 year ago

lkatalin commented 1 year ago

This issue will track what needs to be added for a minimal example of the proxy working with crates.io to check a crate and allow or deny it based on a sigstore parameter (e.g. a license type or a valid signature) specified by a (dummy) policy. I would like to solicit input on the plans for, or implementations of, these parts (maybe I missed something that is already working, or maybe my conception of some of the goals is completely off!).

As a subset of the larger proxy architecture, it would cover these component interactions:

[Image: IMG_20221128_172343042~2, diagram of the component interactions]

Implementation is needed (or possibly already present) for:

Thoughts on the above are greatly appreciated. @bobmcwhirter

danbev commented 1 year ago
  • I am trying to figure out what this would look like. Are the license files (here, here) from opa-client the best example? How are these created?

The wasm module was created by doctrine, and there is a test target if the policies need to be modified. We can copy/move the Makefile into this repository if needed.

bobmcwhirter commented 1 year ago

So yah, the existing code can be ignored as appropriate. The current sigstore integration was me just verifying that I was able to sign a crate outside of this workflow, and then SHA it and find the sig(s) upon fetch.

I do think the sigstore sig-fetching should probably be on the policy-engine side of things.

With regards to integration with cargo, I think there's two routes:

1) As a transparent-ish HTTP proxy

[http]
debug = false               # HTTP debugging
proxy = "host:port"         # HTTP proxy in libcurl format

2) By setting the default (or other) repos to the proxy URL

[registry]
default = "…"        # name of the default registry
token = "…"          # authentication token for crates.io

both via https://doc.rust-lang.org/cargo/reference/config.html and probably set via a .cargo/config.toml either under the CI's $HOME or within the project dir itself.

The method I was aiming for would mean not wrapping or PRing changes to upstream cargo.

Rather, it's more "content shaping" and providing a network-based filter in front of cargo's normal HTTP operations.
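To make the "network-based filter" idea a bit more concrete, here is a minimal sketch of the first thing such a filter would likely need: working out which crate and version a request refers to. It assumes the proxy sees crates.io-style download paths of the form `/api/v1/crates/{name}/{version}/download`. The function name `parse_download_path` is hypothetical and is not taken from the seedwing-proxy code.

```rust
/// Hypothetical helper: extract (crate name, version) from a crates.io-style
/// download path such as `/api/v1/crates/serde/1.0.0/download`.
/// Returns `None` for any path that does not match that shape exactly.
pub fn parse_download_path(path: &str) -> Option<(String, String)> {
    let rest = path.strip_prefix("/api/v1/crates/")?;
    let mut parts = rest.split('/');
    let name = parts.next()?;
    let version = parts.next()?;
    let tail = parts.next()?;
    // Reject extra segments, a wrong suffix, or empty name/version segments.
    if tail != "download" || parts.next().is_some() || name.is_empty() || version.is_empty() {
        return None;
    }
    Some((name.to_string(), version.to_string()))
}
```

Requests that do not parse as a download could simply be forwarded unchanged, so the proxy would only get involved when an actual artifact is being fetched.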

How we deal with git crates? No idea!

lkatalin commented 1 year ago

Thanks for the replies, @bobmcwhirter @danbev. As is usually the case, your answers have germinated more questions.

The current sigstore integration was me just verifying that I was able to sign a crate outside of this workflow, and then SHA it and find the sig(s) upon fetch.

Sounds like a lot of the functionality we need is already there, then, just needing to be moved to the policy engine and then the results used to allow or deny, perhaps under evaluate()?

With regards to integration with cargo, I think there's two routes ...

Thanks for outlining the two possible routes for cargo integration. I need to study the cargo config a bit to familiarize, and then form more questions. Is the in-toto work following one of the two models described? Or does it modify cargo (I see some cargo r commands)? Or is this work completely separate from the seedwing proxy and so it doesn't matter?

How we deal with git crates? No idea!

I see git dependencies mentioned here, is this speaking to that problem or am I conflating two things?

I also have some higher-level questions about the roles of the different repos in play and OPA, and around the policy format itself.

bobmcwhirter commented 1 year ago

wrt git dependencies, I'm wondering how we can intercept and apply policy. I'm unfamiliar with how cargo actually fetches dependencies from a git URL. Is that something we can proxy? Dunno!

seedwing-policy is my nascent attempt to do something a bit better than OPA. I find Rego and OPA a bit... tedious and overgeneralized. It may go nowhere, but if it works, I'd like -proxy to be able to at least alternatively use -policy.

The doctrine repo was intended, kinda sorta, to hold authored policies. The idea being that folks could meld together some centrally-authored policies, along with their local organization exceptions or adaptations.

No reason more than a single person needs to write a policy defining "OSI-approved licenses" etc. A policy library.
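As a rough illustration of the "policy library plus local adaptations" idea, the sketch below models a centrally authored license allow-list that an organization can adjust locally. Everything here is an assumption for illustration: `LicensePolicy`, `with_local_overrides`, and `permits` are hypothetical names, and real policies would be written in Rego or the seedwing-policy language rather than Rust.

```rust
use std::collections::HashSet;

/// Hypothetical shared policy: a centrally maintained set of acceptable
/// license identifiers (for example SPDX IDs of OSI-approved licenses).
pub struct LicensePolicy {
    allowed: HashSet<String>,
}

impl LicensePolicy {
    /// Build a policy from a list of allowed license identifiers.
    pub fn new(ids: &[&str]) -> Self {
        Self {
            allowed: ids.iter().map(|s| s.to_string()).collect(),
        }
    }

    /// Return a new policy with an organization's local exceptions applied:
    /// `also_allow` entries are added, then `also_deny` entries are removed.
    pub fn with_local_overrides(&self, also_allow: &[&str], also_deny: &[&str]) -> Self {
        let mut allowed = self.allowed.clone();
        for id in also_allow {
            allowed.insert(id.to_string());
        }
        for id in also_deny {
            allowed.remove(*id);
        }
        Self { allowed }
    }

    /// True if the given license identifier is acceptable under this policy.
    pub fn permits(&self, license: &str) -> bool {
        self.allowed.contains(license)
    }
}
```

The central list stays untouched; each organization derives its own view from it, which is roughly the "meld together" behavior described above.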

bobmcwhirter commented 1 year ago

I also wonder if the [patch] section of cargo config can be used, at least for git dependencies.

But are we creating too much busywork for users to integrate? If we can aim for minimally-invasive to a build, that'd be best, I suspect.

lkatalin commented 1 year ago

Okay, thanks - so seedwing-policy is a potential replacement for opa-client. doctrine is a policy library. With policies are we trying to focus on licenses only atm, or is it equally viable to have a policy stating something like "crate signature must be present in rekor"? Is there a minimal policy in doctrine saying something like this already? Admittedly I have trouble understanding rego atm as I have not seen it before, but it seems mostly license-focused from what I have looked through so far.

lkatalin commented 1 year ago

If we can aim for minimally-invasive to a build, that'd be best, I suspect.

:+1:

bobmcwhirter commented 1 year ago

Yah -policy would be a replacement for OPA. And then we'd need a seedwing-policy-client.

And yes, the policy library could contain a policy that says "signatures from foo@bar.com must exist in rekor"

The input to the policy engine would be (incomplete)

Policy engine then scrambles that up, queries sigstore, queries $wherever-other-data-is-kept, and can decide if the requested artifact is allowed or rejected.
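A minimal sketch of that allow/reject flow, under stated assumptions: the `PolicyInput` fields below are placeholders (the actual input is not spelled out in this thread), `SignatureSource` stands in for a sigstore/rekor query, and `InMemorySignatures` is a stub that makes no network calls. None of these names come from seedwing-proxy or seedwing-policy.

```rust
use std::collections::HashMap;

/// Placeholder input for one requested artifact (hypothetical fields).
pub struct PolicyInput {
    pub url: String,
    /// Hex-encoded SHA-256 digest of the artifact bytes.
    pub sha256: String,
}

/// Outcome of a policy evaluation.
#[derive(Debug, PartialEq)]
pub enum Decision {
    Allow,
    Deny(String),
}

/// Stand-in for a signature lookup service such as rekor.
pub trait SignatureSource {
    /// Identities known to have signed the artifact with this digest.
    fn signers_for(&self, sha256: &str) -> Vec<String>;
}

/// In-memory stub mapping digest -> signer identities.
pub struct InMemorySignatures(pub HashMap<String, Vec<String>>);

impl SignatureSource for InMemorySignatures {
    fn signers_for(&self, sha256: &str) -> Vec<String> {
        self.0.get(sha256).cloned().unwrap_or_default()
    }
}

/// Allow the artifact only if `required_signer` appears among its signers.
pub fn evaluate(
    input: &PolicyInput,
    required_signer: &str,
    source: &dyn SignatureSource,
) -> Decision {
    let signers = source.signers_for(&input.sha256);
    if signers.iter().any(|s| s == required_signer) {
        Decision::Allow
    } else {
        Decision::Deny(format!(
            "no signature from {} found for {}",
            required_signer, input.url
        ))
    }
}
```

With a policy like "signatures from foo@bar.com must exist in rekor", the proxy would call something like `evaluate(&input, "foo@bar.com", &source)` and forward or refuse the artifact based on the result.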

bobmcwhirter commented 1 year ago

And really doctrine was me just learning how to write policies with Rego/OPA, and deciding that license compatibility might be the easiest thing to reach to start. I have no idea how to teach OPA to query rekor. Maybe we can. Maybe -policy just knows how.

danbev commented 1 year ago

Is the in-toto work following one of the two models described? Or does it modify cargo (I see some cargo r commands)? Or is this work completely separate from the seedwing proxy and so it doesn't matter?

This is completely separate and was pursued to figure out how a source distributed Rust project could potentially be signed with in-toto and sigstore, and then how it might be verified using a command line tool or cargo extension. This was done because I was talking/thinking about how things might work but it was more like guessing. Having something that actually works helped iron out things and allowed us to fix some issues around this.

How does cargo-verify.rs relate to a policy evaluation? Are these doing any overlapping or complementary things?

They are currently unrelated, though perhaps if cargo-verify (or some other name) does become something it could utilize the policy work to verify more than just signatures/layouts.

Why are the policies wasm modules?

Sorry about that, I did have a motivation note in a different repo but never copied it over. I've added a note about this now. Another motivation was at the time I was not able to find a Rust implementation of OPA and using kubewarden/policy-evaluator made sense to save time until we know if/what will be used eventually.

danbev commented 1 year ago

Would it make sense to stick this section "Motivation for using tools/projects" in one of the repositories?

lkatalin commented 1 year ago

Would it make sense to stick this section "Motivation for using tools/projects" in one of the repositories?

Yes, I think this would be great. We should have such a write-up for all of the repos.

lkatalin commented 1 year ago

So the two routes to an MVE (minimum viable example) seem to be:

License types <-- probably easier?

Rekor signatures <-- probably harder but there is code we can use in download()

Does this sound right-ish?

Update: we are planning on using the new seedwing-policy to create these examples around rekor signatures, and using [source] in the cargo config to send traffic to the proxy.

lkatalin commented 1 year ago

Blocking issues: