Roadmap update for TUF support

pypi / warehouse

The Python Package Index

https://pypi.org

Apache License 2.0

Roadmap update for TUF support #5247

Closed LucidOne closed 2 years ago

LucidOne commented 5 years ago

Is it possible to get an update on the development roadmap about when TUF or other encryption support might be deployed? Thanks!

Also, it appears that tomorrow will be 6 years since January 5, 2013.

brainwane commented 4 years ago

@jku has some related work people might want to give feedback on, in pip: https://github.com/pypa/pip/issues/8585 and https://github.com/pypa/pip/pull/9041 .

brainwane commented 3 years ago

I see #7488 (comment) mentions a few blockers that people are currently working on ("some bugs in the TUF reference implementation, namely missing roledb state when reloading the repository").

theupdateframework/tuf#574

theupdateframework/tuf#1045

theupdateframework/tuf#1048

A few people are working on those, including @sechkova and @lukpueh and @trishankatdatadog. I'm sure they would welcome help.

I believe all of this is still true except that theupdateframework/tuf#1045 is closed.

joshuagl commented 3 years ago

theupdateframework/tuf#1048

We addressed all of the blockers for Warehouse integration of TUF mentioned above. The remaining, recently filed, issue in TUF is the addition of an abstract signing interface to support the use of signing keys stored in Hashicorp Vault.

That work is being discussed in https://github.com/theupdateframework/tuf/issues/1263

brainwane commented 3 years ago

I'm having trouble following some of the twists and turns in the linked issues and pull requests, so please forgive my ignorance -- what is left in order to finalize TUF support on PyPI? Just theupdateframework/tuf#574 ?

(And then attention ought to move to pypa/pip#8585 to finish up the pip side, I believe.)

woodruffw commented 3 years ago

On the TUF side, abstract signer support is still needed. https://github.com/secure-systems-lab/securesystemslib/pull/319 added it to SSLib, but I don't believe that work's been integrated into TUF itself yet. Once it is, I'll be able to continue work on the various Vault interfaces that Warehouse will use to sign metadata.

joshuagl commented 3 years ago

On the TUF side, abstract signer support is still needed. secure-systems-lab/securesystemslib#319 added it to SSLib, but I don't believe that work's been integrated into TUF itself yet.

Correct, though there's a PR which I'm planning to review next week https://github.com/theupdateframework/tuf/pull/1272

westurner commented 3 years ago

It may not be necessary, but is there a milestone or a project board to collect the issues for this epic?

"Package signing & detection/verification" says "78%" complete, but milestones can't include issues from other repos? https://github.com/pypa/warehouse/milestone/16

Project boards can reference issues from multiple repos. https://github.com/pypa/warehouse/projects

It's not clear who would create and update a GH project board, if one is even necessary for these issues.

woodruffw commented 3 years ago

I think a project board would certainly help! I only have triage permissions on this repo, so @brainwane or someone else with more permissions might need to either grant me access or do it.

brainwane commented 3 years ago

Sorry, I don't have time to look into this. @ewdurbin could you see about giving Will project board permissions for this repo? Thanks.

trishankatdatadog commented 3 years ago

Speaking of which, do we have updates about the integration? Have not heard updates in a while...

Cc @joshuagl @mnm678

abitrolly commented 3 years ago

Sorry for joining the party late. I tried to understand how TUF compares to blockchain protection mechanisms against takeover and tampering, and to me TUF's claims seem misleading.

First, the TUF main page at https://theupdateframework.io/ claims it provides protection from repo takeover.

The Update Framework (TUF) helps developers maintain the security of software update systems, providing protection even against attackers that compromise the repository or signing keys.

And then https://theupdateframework.io/overview/#how-does-tuf-secure-updates says this:

TUF identifies the updates, downloads them, and checks them against the metadata that it also downloads from the repository.

In the blockchain world, that means that an attacker can rehash the content of the repo and trick clients into believing that the signed content is legit, because clients don't even have a copy of the Merkle tree hash to validate the repo at any point in history. How does TUF protect against that? Could someone explain it like I am five?

If the security (by the spec) is provided by offline keys and out-of-band key distribution, then I don't see how that security can be implemented, or whether it is worth the complication. For example, https://fwupd.org/ distributes firmware updates for hardware on Linux, and is simple and secure without TUF. If the analogy with the blockchain is hard, this can be used as an alternative baseline.

JustinCappos commented 3 years ago

How is this TUF protection supposed to work? Could someone explain it like I am five?

Marina Moore has put together a helpful set of blog posts ( https://ssl.engineering.nyu.edu/blog/ ) that use Santa Claus and Calvinball (from Calvin and Hobbes) as examples. Let us know if this helps. :)

abitrolly commented 3 years ago

@JustinCappos I am afraid that is a distraction for 5-year-olds, not really an explanation. :D

mnm678 commented 3 years ago

In the blockchain world, that means that an attacker can rehash the content of the repo and trick clients into believing that the signed content is legit, because clients don't even have a copy of the Merkle tree hash to validate the repo at any point in history. How does TUF protect against that? Could someone explain it like I am five?

TUF and blockchains are based on different threat models. A blockchain uses decentralized nodes so that an attacker would have to compromise a lot of these nodes to gain control. However, it's not always practical to have a network of trusted nodes for software distribution, and it takes a lot of computation to do the proof-of-work necessary to add new items to a blockchain. TUF instead takes the existing package manager approach, and uses offline keys, revocation, and pinned keys to ensure that a compromise of the repository can be detected and recovered from. Using the blockchain analogy, TUF uses pinned root keys instead of a Merkle Tree hash to validate the state of the repository. This has the advantage that the pinned root keys remain valid through updates to the repository.

If the security (by the spec) is provided by offline keys and out-of-band key distribution, then I don't see how that security can be implemented, or whether it is worth the complication. For example, https://fwupd.org/ distributes firmware updates for hardware on Linux, and is simple and secure without TUF. If the analogy with the blockchain is hard, this can be used as an alternative baseline.

This model has already been mostly adopted by PyPI (https://pyfound.blogspot.com/2020/10/key-generation-and-signing-ceremony-for.html), as well as many others (https://theupdateframework.io/adoptions/), so it is certainly possible to implement. @trishankatdatadog can provide more insight into deploying TUF in production.

I don't know the details about https://fwupd.org/ specifically, but numerous supply chain security compromises (https://github.com/cncf/tag-security/tree/main/supply-chain-security/compromises) of production systems occur because of a repository or key compromise. TUF mitigates these risks through the use of offline keys, threshold delegations, and namespacing.
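As a rough illustration of the "threshold delegations and namespacing" idea (this is not TUF's actual API or metadata format), the sketch below decides whether a target file should be trusted. The delegation table, the key IDs, and the `is_trusted` helper are all hypothetical, and signature verification itself is assumed to have happened elsewhere; the function only receives the IDs of keys whose signatures checked out.

```python
from fnmatch import fnmatchcase

# Hypothetical delegation table: path pattern -> (authorized key IDs, threshold).
DELEGATIONS = {
    "projects/alpha/*": ({"key-a1", "key-a2", "key-a3"}, 2),
    "projects/beta/*": ({"key-b1"}, 1),
}


def is_trusted(target_path, valid_signers):
    """Return True if enough authorized keys vouched for target_path.

    valid_signers is the set of key IDs whose signatures over the target's
    metadata were already verified (verification is out of scope here).
    """
    for pattern, (authorized, threshold) in DELEGATIONS.items():
        if fnmatchcase(target_path, pattern):
            # Only keys delegated for this namespace count toward the threshold.
            return len(authorized & set(valid_signers)) >= threshold
    # No delegation covers this path, so nothing can vouch for it.
    return False
```

In this toy model a single stolen key for `projects/alpha/` is not enough to publish there, and a key delegated for `projects/beta/` carries no weight outside its own namespace.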

JustinCappos commented 3 years ago

Also a blockchain doesn't deal with the problems of how do you figure out what to put on there and who can change it if it is wrong / stale / keys are compromised. TUF handles those cases.

trishankatdatadog commented 3 years ago

Also a blockchain doesn't deal with the problems of how do you figure out what to put on there and who can change it if it is wrong / stale / keys are compromised. TUF handles those cases.

I agree. I have no idea why TUF is being compared to a blockchain without the reader doing their due research.

If you must, see this article we wrote comparing and contrasting a centralized "blockchain" (transparent/tamper-evident logs) to TUF, and why you probably want to use both.

abitrolly commented 3 years ago

@mnm678 first, thanks for the explanation. Some nerdy guys like me can be completely insensitive to people when it comes to "defending the truth". :) I try not to criticize, but when I fail, please forgive me.

However, it's not always practical to have a network of trusted nodes for software distribution,

That's mistake no. 1 (hope you don't mind the terminology, but I don't know another word). Nodes in a blockchain are not trusted. They follow the consensus rules. Good nodes do not listen to those who do not follow the consensus. Validating the consensus is what every node does when receiving a block. This way you have near real-time sync of package info and threat detection.

and it takes a lot of computation to do the proof-of-work necessary to add new items to a blockchain.

Mistake no. 2 (again, I don't blame anyone; it took me several years to separate blockchain technology from blockchain hype). Proof-of-work is a consensus algorithm for ledgers (accounting books), designed to solve the double-spending problem. PyPI is not a ledger, so it is totally irrelevant here. A blockchain is a signed chain of signed blocks. In the case of PyPI, one block can be just one package's data. The agreement on who can add blocks is the consensus. "Every user with an account in PyPI can add a block" may be a valid rule. "Only blocks that are signed with offline keys" may be a valid rule (although this can be extended to "keys that are signed by offline keys"). "Only users who have the most balance" is not a valid rule for PyPI, but it is another consensus for public ledgers, called proof-of-stake.
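To make the "chain of blocks, one block per package" idea concrete, here is a minimal sketch of a hash-linked chain. It is only an illustration under stated assumptions: signatures and consensus rules are left out entirely, and the field names (`package`, `digest`, `prev`) and helper functions are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(block):
    """Hash a block deterministically (sorted keys give a stable encoding)."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, package, digest):
    """Append a block that commits to the hash of the previous block."""
    prev = block_hash(chain[-1]) if chain else GENESIS
    chain.append({"package": package, "digest": digest, "prev": prev})


def chain_is_valid(chain):
    """Check that every block points at the hash of its predecessor."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev:
            return False
        prev = block_hash(block)
    return True
```

Changing any block other than the last one breaks the link stored in its successor. Detecting a change to the last block requires that clients also remember the latest block hash from some trusted source, which this sketch does not cover.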

TUF instead takes the existing package manager approach, and uses offline keys, revocation, and pinned keys to ensure that a compromise of the repository can be detected and recovered from. Using the blockchain analogy, TUF uses pinned root keys instead of a Merkle Tree hash to validate the state of the repository. This has the advantage that the pinned root keys remain valid through updates to the repository.

How do packagers sign their packages if TUF private keys are offline?

What are pinned keys? My 5-year-old has just read on Wikipedia (https://en.wikipedia.org/wiki/HTTP_Public_Key_Pinning) that public key pinning for HTTP is considered deprecated, and my search query for pinned keys shows a lot of articles that do not recommend this technique.

Unfortunately, without understanding whether TUF pinned keys are similar to HTTP key pinning, I cannot comment on whether they can really replace a Merkle tree in validating the current state of the repository. The state in a Merkle tree not only covers a specific package, it covers the state of all packages at the moment it is generated. If the pinned key signs the state of the repository, then that raises another question.

Who owns the pinned keys? (package maintainer, PyPI admin, TUF admin)
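For readers unfamiliar with the Merkle tree property mentioned above (one root hash that covers the state of every package), here is a minimal sketch. The leaf encoding, the domain-separation prefixes, and the choice to duplicate the last node on odd levels are assumptions made for illustration; real systems differ in these details.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Return the hex Merkle root over a list of byte strings.

    Each leaf stands for one package record; the root changes if any
    single record changes.
    """
    if not leaves:
        return _h(b"").hex()
    # Prefix leaves and inner nodes differently so they cannot be confused.
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            _h(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()
```

A client that holds a trusted copy of the root can detect a change to any package record, which is the sense in which the root "covers" the whole repository at one point in time.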

This model has already been mostly adopted by PyPI, as well as many others, so it is certainly possible to implement.

This doesn't answer the original question: how specifically to sign PyPI packages with offline keys, and which out-of-band channels PyPI users should use for key distribution. A simple example for a monkey who wants to upload a package to PyPI in the most secure manner would do.

I don't know the details about https://fwupd.org/ specifically, but numerous supply chain security compromises of production systems occur because of a repository or key compromise. TUF mitigates these risks through the use of offline keys, threshold delegations, and namespacing.

I haven't found https://fwupd.org/ in the list, so it is hard for me to accept this argument against it. If TUF's security is provided by offline keys, so is that of https://fwupd.org/, but without the complications imposed by TUF. https://fwupd.org/ distributes packages signed/encrypted by a vendor key (TUF offline keys), and BIOS and other hardware (according to the UEFI spec) will not update itself if the signature doesn't match the hardcoded public key (TUF out-of-band channel). So the security of TUF and https://fwupd.org/ is equivalent.

JustinCappos commented 3 years ago

I haven't found https://fwupd.org/ in the list, so it is hard for me to accept this argument against it. If TUF's security is provided by offline keys, so is that of https://fwupd.org/, but without the complications imposed by TUF. https://fwupd.org/ distributes packages signed/encrypted by a vendor key (TUF offline keys), and BIOS and other hardware (according to the UEFI spec) will not update itself if the signature doesn't match the hardcoded public key (TUF out-of-band channel). So the security of TUF and https://fwupd.org/ is equivalent.

From the https://fwupd.org/lvfs/docs/developers site, under "Is updating firmware secure?":

In both the LVFS and fwupd, GPG crypto is being performed using GnuPG and PKCS#7 crypto is using GnuTLS. The fwupd daemon has no network access and only acts as the mechanism for clients using D-Bus and PolicyKit. Some devices also have additional hardware signature verification schemes implemented by the device manufacturer.

The LVFS and fwupd codebases have had several independent security audits. The LVFS has a huge number of tests run for each commit https://travis-ci.org/hughsie/lvfs-website, and fwupd has a comprehensive test suite https://travis-ci.org/hughsie/fwupd, and is regularly scanned using both clang and Coverity https://scan.coverity.com/projects/10744.

The threat model implied here is that they sign something and are careful with the key. There is no talk about how they handle key revocation, etc.

TUF focuses on dealing with compromises: not just of keys, but of servers and other parts of the infrastructure. Of course, we've had audits too (which you can find linked on the project site), but the system is designed to resist and securely recover from a compromise of keys, servers, etc. So the threat model and goals are very different. (You can find a lot more about TUF's goals by reading this page, especially the Mitigating Key Risk portion https://theupdateframework.io/security/ )

The website also has a lot of technical papers that describe the security differences in much greater detail over solutions that use a single key for signing, such as the project you mentioned.

How do packagers sign their packages if TUF private keys are offline?

There are different keys in TUF. Some keys (like the root keys and some targets keys) are offline. Others are held by the developers.

To try to give the five-year-old version: if you have a small project on your own, you have your key for your project. If you have a group project, you can choose whether one person has the key, whether multiple people have to use keys, etc.

Just so you're not confused about pinning, this isn't HTTP pinning. The reasons why HTTP pinning is deprecated don't make sense in this context because you don't have hundreds of potentially valid roots of trust (trusted CAs) and have to deal with the problems with having something incorrectly pinned to the incorrect version. PyPI's targets role handles this namespacing unambiguously.

abitrolly commented 3 years ago

Also a blockchain doesn't deal with the problems of how do you figure out what to put on there and who can change it if it is wrong / stale / keys are compromised. TUF handles those cases.

@JustinCappos that's a valid point, and a tough problem. If anyone can explain in those childish terms how the handling is done, that would clear up my doubts. Right now I understand this as: if offline keys are lost, everything is lost, the same way as if a private key on the blockchain is lost. On the blockchain the problem is solved with multisignature keys, so that you need 3 out of 5 signatures to make a multisig valid. There are also extensions that allow migrating a valid multisig to another multisig with no missing keys.

If you must, see this article we wrote comparing and contrasting a centralized "blockchain" (transparent/tamper-evident logs) to TUF, and why you probably want to use both.

@trishankatdatadog yes, I've read it (after asking), and I like TL very much. I think it is the way to go. So far it seems that TUF adds a very high level of complexity, even if TL + TUF is the most secure. It is hard to explain, unlike blockchain concepts, which many people have already learned. It is also hard for me to wrap my head around how TUF's absence of immutable history and third-party auditing can provide security in a world where the state of the dependency tree is often more important than the state of the packages you own.

While I don't think that TUF is the way to go, I think it may have some good ideas on how to manage signing keys, so instead of following TUF or using combined TL+TUF, there may be a slimmed-down version of a security framework that reuses components from both and also leverages some best practices from blockchain technology (like real-time sync, notifications and caching).

ewdurbin commented 3 years ago

Hello @abitrolly! I appreciate your engagement and concern with the security of PyPI shown in this discussion.

However, there are multiple points in the recent conversation where you have chosen to belittle or dismiss things as "childish" and comparing efforts of those involved to 5 year olds. This isn't very respectful of the effort and time that people have put into this work.

I'd like to ask you to review the PSF Code of Conduct, to which this repository and discussion adhere, before further disrespectful behavior becomes an issue.

trishankatdatadog commented 3 years ago

While I don't think that TUF is the way to go, I think it may have some good ideas on how to manage signing keys, so instead of following TUF or using combined TL+TUF, there may be a slimmed-down version of a security framework that reuses components from both and also leverages some best practices from blockchain technology (like real-time sync, notifications and caching).

Consider that if something looks complicated, there might be good reasons for it, especially if it was designed with a threat model with nation-state attackers in mind. Feel free to use TLs all you like, but the PyPA consensus has been for TUF, with the community free to record TUF metadata on TLs if they wish, thus getting the best of both worlds.

abitrolly commented 3 years ago

@ewdurbin all I wanted was to receive a layperson-friendly explanation, as happens in https://www.reddit.com/r/explainlikeimfive/, which I've subscribed to. I apologize that I didn't reference it in the first place. Does that clarify that the phrase "explain it like I am five" is not meant to belittle or dismiss things as "childish", or to offend those who put a lot of effort into developing and promoting TUF?

I'd like to ask you to review the PSF Code of Conduct, to which this repository and discussion adhere, before further disrespectful behavior becomes an issue.

I acknowledge the time and effort that people put into developing TUF and trying to include it in the Python distribution index. If criticism of the TUF framework itself is seen as disrespectful behavior, then it will be better for me to leave people to their business.

westurner commented 3 years ago

Thanks for your feedback.

IMHO, Sigstore should be (1) at least rooted in a trustless blockchain; and (2) using ld-proofs and W3C CCG Cryptographic Signature Suite URIs for future-proofing. That aside, how can Sigstore and TUF work together?

Is there a good ELI5 graphic of the PyPI TUF package build and release workflow, and maybe also a complete sequence diagram? https://en.wikipedia.org/wiki/Sequence_diagram

[Figure: Sigstore architecture summary, https://www.sigstore.dev/img/system_architecture_summary-01.svg]

JustinCappos commented 3 years ago

In a rush, but quickly wanted to point out that Sigstore uses TUF...

https://dlorenc.medium.com/using-the-update-framework-in-sigstore-dc393cfe6b52

trishankatdatadog commented 3 years ago

IMHO, Sigstore should be (1) at least rooted in a trustless blockchain; and (2) using ld-proofs and W3C Signature Suite URIs. That aside, how can Sigstore and TUF work together?

I'm not: (1) sure that there is such a thing as a "trustless" blockchain, and (2) familiar with these technologies. However, Marina and I wrote a blog post about how you can combine TUF and transparent/tamper-evident logs such as sigstore. TLDR: you can publish TUF timestamp metadata on sigstore. My upcoming talk at SupplyChainSecurityCon will discuss how the Datadog Agent integrations is the first transparent, compromise-resilient software publication pipeline in the world.

Is there a good ELI5 graphic of the PyPI TUF package build and release workflow, and maybe also a complete sequence diagram?

I don't think there is one right now, but perhaps you could help us make one using the descriptions from PEPs 458 and 480?

trishankatdatadog commented 3 years ago

In a rush, but quickly wanted to point out that Sigstore uses TUF...

Yes. This will be a separate TUF repository for open source projects, distinct from the one for PyPI right now. However, PyPI can publish its own TUF timestamps to sigstore.

westurner commented 3 years ago

I'm not: (1) sure that there is such a thing as a "trustless" blockchain,

Some criteria for "trustless blockchains" (DLT, Distributed Ledger Technology) in the Software Supply Chain [Security] domain:

- Nobody has root.
- Everyone has backup responsibilities, if they want.
- Everyone can check that the online data matches the online backups.
- Only users with package release permissions can create a new SoftwareRelease record for that project.
  - Does this imply that a permissioned chain is necessary, wherein certain parties gatekeep which keys can sign for which package? Is a permissioned chain necessary for Role Delegation / Authorization?
  - Otherwise, if a package author lost the initial package_name 'token' (?), PyPI, for example, would have no way to recover that initial key, and the only option would be to change the package name: a DID-like package name could be "burnt" and thus unrecoverable. Does TUF have these risks as well?
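
A rough Python sketch of two of these criteria (anyone can check that online data matches a backup, and only permitted users can add a release record) might look like the following. It is a toy hash chain, not a blockchain and not TUF; `append_release`, `verify_chain`, `same_history`, and the `permissions` table are all hypothetical names made up for this example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def record_hash(record: dict) -> str:
    # Hash a record deterministically (sorted keys, compact separators).
    data = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def append_release(chain: list, project: str, version: str, signer: str,
                   permissions: dict) -> None:
    # Only signers listed for the project may add a release record.
    if signer not in permissions.get(project, set()):
        raise PermissionError(f"{signer} may not release {project}")
    prev = record_hash(chain[-1]) if chain else GENESIS
    chain.append({"project": project, "version": version,
                  "signer": signer, "prev": prev})


def verify_chain(chain: list) -> bool:
    # Anyone holding a copy can recompute every link.
    prev = GENESIS
    for record in chain:
        if record["prev"] != prev:
            return False
        prev = record_hash(record)
    return True


def same_history(online: list, backup: list) -> bool:
    # Two valid chains match if they have equal length and the same head hash.
    if not (verify_chain(online) and verify_chain(backup)):
        return False
    if len(online) != len(backup):
        return False
    return not online or record_hash(online[-1]) == record_hash(backup[-1])
```

Note that the `permissions` table is still maintained by someone, which is exactly the gatekeeping question raised in the last bullet; the sketch does not answer it.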

and (2) familiar with these technologies.

Do they implement the same primitives with different or "yet another" data representation standards?

https://w3c-ccg.github.io/ld-proofs/
Cryptographic proofs enable functionality that is extremely useful to implementors of distributed systems. For example, proofs can be used for purposes such as:
- Make statements that can be shared without loss of trust, because their authorship can be verified by a third party, for example as part of Verifiable Credentials [VC-DATA-MODEL] or social media posts.
- Authenticate as an entity identified by a particular identifier, for example, as the subject identified by a Decentralized Identifier (DID) [DID-CORE].
- Delegate authorization for actions in a remote execution environment, via mechanisms such as Authorization Capabilities [ZCAP].
- Agree to contracts where the agreement can be verified by another party.
Additionally, many proofs that are based on cryptographic signatures provide the benefit of integrity protection, making documents and data tamper-evident.

The term Linked Data is used to describe a recommended best practice for exposing, sharing, and connecting information on the Web using standards, such as URLs, to identify things and their properties. When information is presented as Linked Data, other related information can be easily discovered and new information can be easily linked to it. Linked Data is extensible in a decentralized way, greatly reducing barriers to large scale integration.
With the increase in usage of Linked Data for a variety of applications, there is a need to be able to verify the authenticity and integrity of Linked Data documents. This specification adds authentication and integrity protection to linked data documents through the use of mathematical proofs without sacrificing Linked Data features such as extensibility and composability.

https://w3c-ccg.github.io/ld-proofs/#linked-data-signatures :

{
"@context": [
   {"title": "https://schema.org#title"},
   "https://w3id.org/security/suites/ed25519-2020/v1"
 ],
 "title": "Hello world!",
 "proof": {
   "type": "Ed25519Signature2020",
   "created": "2020-11-05T19:23:24Z",
   "verificationMethod": "https://ldi.example/issuer#z6MkjLrk3gKS2nnkeWcmcxiZPGskmesDpuwRBorgHxUXfxnG",
   "proofPurpose": "assertionMethod",
   "proofValue": "z4oey5q2M3XKaxup3tmzN4DRFTLVqpLMweBrSxMY2xHX5XTYVQeVbY8nQAVHMrXFkXJpmEcqdoDwLWxaqA3Q1geV6"
  }
}

From https://w3c-ccg.github.io/ld-proofs/#dfn-proof-algorithm :

Creating New Proof Types

A Linked Data Proof is designed to be easy to use by developers and therefore strives to minimize the amount of information one has to remember to generate a proof. Often, just the cryptographic suite name (e.g. Ed25519Signature2018) is required from developers to initiate the creation of a proof. These cryptographic suites are often created or reviewed by people that have the requisite cryptographic training to ensure that safe combinations of cryptographic primitives are used. [...] A complete example of a proof type is shown in the next example:

EXAMPLE 7
{
 "id": "https://w3id.org/security#Ed25519Signature2020",
 "type": "Ed25519VerificationKey2020",
 "canonicalizationAlgorithm": "https://w3id.org/security#URDNA2015",
 "digestAlgorithm": "https://www.ietf.org/assignments/jwa-parameters#SHA256",
 "signatureAlgorithm": "https://w3id.org/security#ed25519"
}
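
The three algorithm fields in that example suggest a canonicalize, digest, sign pipeline. The Python sketch below shows only that shape, using standard-library stand-ins: deterministic JSON in place of URDNA2015 and HMAC-SHA256 with a shared secret in place of Ed25519. Neither stand-in is what the spec prescribes, the proof type `ToyHmacProof` is made up, and the helper names are hypothetical.

```python
import hashlib
import hmac
import json


def canonicalize(document: dict) -> bytes:
    # STAND-IN for URDNA2015: deterministic JSON, not RDF canonicalization.
    return json.dumps(document, sort_keys=True, separators=(",", ":")).encode("utf-8")


def digest(data: bytes) -> bytes:
    # SHA-256, matching the digestAlgorithm named in the example.
    return hashlib.sha256(data).digest()


def add_proof(document: dict, key: bytes, created: str) -> dict:
    # STAND-IN for Ed25519: HMAC-SHA256 with a shared secret key.
    unsigned = {k: v for k, v in document.items() if k != "proof"}
    proof = {"type": "ToyHmacProof", "created": created,
             "proofPurpose": "assertionMethod"}
    # The proof options are covered too, so "created" cannot be altered.
    to_sign = digest(canonicalize(proof)) + digest(canonicalize(unsigned))
    proof["proofValue"] = hmac.new(key, to_sign, hashlib.sha256).hexdigest()
    return {**unsigned, "proof": proof}


def verify_proof(signed: dict, key: bytes) -> bool:
    # Rebuild the signed bytes from the document and the proof options.
    proof = dict(signed["proof"])
    claimed = proof.pop("proofValue")
    unsigned = {k: v for k, v in signed.items() if k != "proof"}
    to_sign = digest(canonicalize(proof)) + digest(canonicalize(unsigned))
    expected = hmac.new(key, to_sign, hashlib.sha256).hexdigest()
    return hmac.compare_digest(claimed, expected)


key = b"demo-key"
signed = add_proof({"title": "Hello world!"}, key, "2020-11-05T19:23:24Z")
```

Unlike a real Ed25519 suite, the HMAC stand-in needs the verifier to hold the same secret as the signer, so it illustrates the data flow only, not third-party verifiability.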

https://www.w3.org/TR/did-core/ :

Decentralized identifiers (DIDs) are a new type of identifier that enables verifiable, decentralized digital identity. A DID refers to any subject (e.g., a person, organization, thing, data model, abstract entity, etc.) as determined by the controller of the DID. In contrast to typical, federated identifiers, DIDs have been designed so that they may be decoupled from centralized registries, identity providers, and certificate authorities. Specifically, while other parties might be used to help enable the discovery of information related to a DID, the design enables the controller of a DID to prove control over it without requiring permission from any other party. DIDs are URIs that associate a DID subject with a DID document allowing trustable interactions associated with that subject.

Each DID document can express cryptographic material, verification methods, or services, which provide a set of mechanisms enabling a DID controller to prove control of the DID. Services enable trusted interactions associated with the DID subject. A DID might provide the means to return the DID subject itself, if the DID subject is an information resource such as a data model.

The Blockcerts service implements W3C ld-proofs and DIDs. From https://www.blockcerts.org/guide/ :

Blockcerts is an open standard for building apps that issue and verify blockchain-based official records. These may include certificates for civic records, academic credentials, professional licenses, workforce development, and more.

Blockcerts consists of open-source libraries, tools, and mobile apps enabling a decentralized, standards-based, recipient-centric ecosystem, enabling trustless verification through blockchain technologies.

Blockcerts uses and encourages consolidation on open standards. Blockcerts is committed to self-sovereign identity of all participants, and enabling recipient control of their claims through easy-to-use tools such as the certificate wallet (mobile app). Blockcerts is also committed to availability of credentials, without single points of failure.

These open-source repos may be utilized by other research projects and commercial developers. They contain components for creating, issuing, viewing, and verifying certificates across any blockchain. These components form all the parts needed for a complete ecosystem.

https://github.com/blockchain-certificates

Nobody has root.

"Key generation and signing ceremony for PyPI [TUF]" https://pyfound.blogspot.com/2020/10/key-generation-and-signing-ceremony-for.html
- https://github.com/psf/psf-tuf-runbook

di commented 2 years ago

I've published a roadmap for the PEP 458 rollout in https://github.com/pypa/warehouse/issues/10672. Please note that this issue is for PEP 458-related development work only, and is not a place to discuss TUF alternatives, issues with TUF as a framework, or ask for explanations on how TUF works.