MatrixAI / Polykey

Polykey Core Library
https://polykey.com
GNU General Public License v3.0

Blockchain as an Identity Provider - Public Key Single Sign-On #352

Open CMCDragonkai opened 2 years ago

CMCDragonkai commented 2 years ago

Research Hypothesis/Question

How can Polykey integrate asymmetric key cryptography, as used in web3 wallets and Passkey, to streamline identity verification and access management across both decentralized and traditional web environments?

Review of Existing Ideas, Literature, and Prior Work

  1. Asymmetric Keys in Web3 Wallets: Investigate the use of asymmetric keys by various web3 wallets for decentralized authentication, evaluating their implementation, user experience, and security features.
  2. Passkey as Traditional Web Authentication: Examine Passkey technology, which applies asymmetric keys in traditional web environments, to identify potential for integration with blockchain technologies.
  3. Unified Approach to Identity Verification: Analyze how both systems could be merged to create a cohesive identity verification mechanism within Polykey, focusing on interoperability and security.
  4. Comparative Analysis:
    • Investigate how asymmetric keys are utilized across different technologies and platforms to ensure a comprehensive understanding of potential integration points.
    • Review the hmac-secret extension in the CTAP protocol as discussed in Encrypting Data in the Browser Using WebAuthn, which is an analogue to web-based authentication mechanisms outside the blockchain sphere.

Research Conclusion

The goal is to assess whether a unified model using asymmetric keys can serve as a robust, secure foundation for identity verification across Polykey's diverse application areas, bridging the gap between decentralized and conventional web technologies.

Sub-Issues & Sub-PRs Created

Additional Notes

CMCDragonkai commented 2 years ago

The pdf https://assets.ctfassets.net/2ntc334xpx65/42fINJjatOKiG6qsQQAyc0/8b63e552f4cfef313f579b8e9c9154b5/intro-to-ethereum.pdf

Has a tutorial going through how to implement a web3 login flow.

The overall ethereum architecture is interesting as well with geth being used as both the client and server side of an Ethereum node, and then web3 as a separate client library and ethereum wallets being also completely separate applications. Whereas in PK we would want to keep it all in one application bundle.

CMCDragonkai commented 2 years ago

I've been researching this system and the wider implications of metamask, ethereum, blockchain, web3... etc.

Firstly, all wallet software is just key management software. Wallets manage the root keypair to the blockchain, and perform RPC requests to a blockchain node such as an Ethereum node (geth) to perform transactions on the network. This is similar to PK client to PK agent, where the node is the agent and wallets are clients. Transactions include smart contract deployments, executions of smart contract functions, and transfers involving non-fungible tokens and other blockchain assets. The blockchain nodes can also function as wallets, and in that sense they are "full-node wallets", which avoids the centralised bottleneck of contacting a single node. The nodes are often provided by the wallet developer, or they can be hired from node-as-a-service businesses like Infura or Alchemy.

Because wallets are key management software that only manages a single key, that being the root key to the blockchain, they are also holders of a "digital identity" that is registered on their respective blockchain.

This enables these wallets to provide a Single Sign-On experience for "web3" apps.

Web3 applications are just web applications that expect users to sign in with their blockchain identity, and provide services for their blockchain assets. They may combine this with web2 centralised databases or other services. This is a big deal as it opens a disruptive industry of decentralised services that compete with web2 companies like Facebook, Google... etc. Digital sovereignty is ensured by having user-generated content and user data be owned by the users in the blockchain, rather than being held in centralised privately owned databases. Private data can go into the blockchain as well by encrypting it.

However blockchains suffer from scalability problems, which is really a technical concern. There are developments right now that aim to improve the scalability of blockchains in a myriad of ways. This includes integrating off-chain systems like IPFS (which hosts a lot of NFT data), staking protocols, and extra layers built on top of foundational blockchains like layer2, layer3... etc. One particular innovation I'm interested in is sharding, which comes from the database world and is something that should be applicable to Polykey. These technical concerns will eventually be solved; right now the political and financial incentives to make use of the technology are very strong.

Back to SSO (single sign on). Unlike OAuth2, this kind of SSO is decentralised and is not bound to any single private system. Instead anyone can create an app making use of the blockchain data. You are not beholden to facebook, linkedin, twitter API policy changes. And blockchains inherently have incentives to develop cross-chain protocols, which startups are working on right now to enable cross-chain assets.

This is why there are a number of apps popping up now where you sign in with metamask. It's an ethereum wallet implemented as a chrome extension, and enables you to sign in with just the blockchain identity. This is a lot more successful than previous attempts at this, such as persona from mozilla, simply because the financial incentives are aligned between users, developers and companies.

SSO has been used to reduce the amount of "password fatigue" that occurs inside enterprises, as their employees have to access many systems. SSO is a factor in our Polykey design, and Polykey's philosophy is that SSO is a good thing, but passwords and many other kinds of secrets will still exist in complex digital environments, since SSO is just one part of the UI/UX simplification. But traditional SSO relies on corporate integrations like kerberos, gluu and third party vendors all working together. This kind of web3 SSO brings SSO to the public, the broader society.

Before, you had 2 choices: create your own password-auth system (this took work), or rely on third party sign in like "Sign in with Facebook" (this had third-party risks). Now you have a 3rd choice, which is likely a lot more sustainable: sign in with your blockchain identity. And if your app's core features rely on blockchain assets, then it's a no-brainer to do this. But even if your app doesn't rely on blockchain assets, it can still be useful, and there are many developments to move all sorts of digital activities into blockchain-tracked activities.

The implication for Polykey is that Polykey can also be a wallet. After all, it's already a key management system (and more generic than that atm). We can integrate web3 sign-in protocols in 2 ways:

  1. Gestalt System - currently we already plan to enable gestalt identities cryptolinking centralised identities such as Twitter, Facebook... etc with the PK nodes, which connects up the reputation systems in the world. Why not also cryptolink to the user's Ethereum account? It can work in a similar way, using a smart contract to deploy a cryptolink NFT, or just performing a transaction with specialised data. Note that any work done on the blockchain involves a fee that is paid. This is necessary to prevent spam on the blockchain.
  2. Once we have a browser extension, we can make use of the web3 sign-in protocol, and users can use their PK app instead of Metamask to sign-in to web3 apps as well.

In case 1, this can be one of our vectors towards decentralised trust, especially the usage of smart contract crypto-link NFT tokens, and its relationship with our sigchain. Similar ideas are echoed here https://github.com/freenet/locutus.

Allows users to build up reputation over time based on feedback from those they interact with. Think of the feedback system in services like Uber, but with Locutus it will be entirely decentralized and cryptographically secure. It can be used for things like spam prevention (with IM and email), or fraud prevention (with an online store).

Arbiters are trusted services that can perform tasks and authenticate the results, such as verifying that a contract had a particular state at a given time, or that external blockchains (Bitcoin, Ethereum, Solana etc) contain specific transactions. Trust is achieved through the reputation system.

In case 2, storing blockchain keys in PK makes PK a target, as it is basically a hot wallet (an internet-connected wallet). It's important to realise that multiple PK nodes participating in a gestalt can link up multiple different ethereum accounts. Each seed phrase (along with an optional passphrase) can generate multiple ethereum accounts, as each account is just an address, and each address's root key is generated deterministically by iterating a nonce from the binary seed. See this diagram:
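As a rough illustration of the deterministic derivation described above, here is a minimal sketch using only the Python standard library. It follows the BIP-39 seed stretching and BIP-32 hardened child derivation rules, but it is simplified: real Ethereum wallets usually walk a longer path (commonly `m/44'/60'/0'/0/i`) whose last steps are non-hardened and need secp256k1 point arithmetic, and turning a private key into an address needs secp256k1 plus Keccak-256, neither of which is in the standard library. The function names and the example phrase are hypothetical, and the phrase is not checksum-validated.

```python
import hashlib
import hmac
import unicodedata

# Order of the secp256k1 group that Ethereum private keys live in.
SECP256K1_N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HARDENED = 0x80000000  # indexes at or above this value are "hardened" in BIP-32


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """BIP-39: stretch the seed phrase (plus optional passphrase) into a 64-byte binary seed."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, 64)


def master_key(seed: bytes):
    """BIP-32: split HMAC-SHA512 of the seed into a master private key and a chain code."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    return int.from_bytes(digest[:32], "big"), digest[32:]


def hardened_child(parent_key: int, chain_code: bytes, index: int):
    """BIP-32 hardened private derivation; the rare invalid-key edge cases are ignored here."""
    data = b"\x00" + parent_key.to_bytes(32, "big") + (index | HARDENED).to_bytes(4, "big")
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    child_key = (int.from_bytes(digest[:32], "big") + parent_key) % SECP256K1_N
    return child_key, digest[32:]


def account_keys(mnemonic: str, passphrase: str = "", count: int = 3):
    """Simplification: derive `count` hardened children directly under the master key."""
    key, chain_code = master_key(mnemonic_to_seed(mnemonic, passphrase))
    return [hardened_child(key, chain_code, index)[0] for index in range(count)]
```

The same phrase always yields the same list of keys, each index yields a different key, and changing the optional passphrase yields an unrelated set of keys.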

image

However just because the gestalt has linked up the ethereum accounts, it doesn't mean that the PK node you're running has the actual key. Our OAuth2 systems exchange for an access token, not the actual password to the third party system. A PK node does however have to keep the key around to work with a given account address. There may be ways of "dropping privileges" or "privilege bracketing" by holding on to a single key for a single address, but not the original seed phrase/binary seed. Users can then manipulate how much privilege they hand over to PK and which PK nodes they are using. There are implications for our vault ops design, as these are "abstract operations" performed on top of a vault.

However there are of course mitigation procedures here, and users can create multiple PK identities all linked to each other, but they don't necessarily need to use the same

In both cases, PK would need to be able to call a blockchain node to do the work (possibly using etherscan's API to query things). We wouldn't expect users to run their own blockchain node. But many wallets like MyCrypto actually allow users to select their own blockchain nodes or custom nodes. We can register PK-default nodes as a service with Infura and Alchemy. This avoids having to run our own blockchain nodes, but we could also run our own blockchain nodes.
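To make "calling a blockchain node" concrete, here is a small sketch of an Ethereum JSON-RPC balance query against a user-selected node. `eth_getBalance` is a standard Ethereum JSON-RPC method; the function names are hypothetical, and the node URL is whatever endpoint the user or a provider like Infura or Alchemy supplies. This is not a statement of how PK would implement it.

```python
import json
import urllib.request


def build_balance_request(address: str, request_id: int = 1) -> dict:
    """JSON-RPC 2.0 payload asking the node for an account balance at the latest block."""
    return {
        "jsonrpc": "2.0",
        "method": "eth_getBalance",
        "params": [address, "latest"],
        "id": request_id,
    }


def parse_balance_response(response: dict) -> int:
    """The node answers with a hex-encoded quantity in wei."""
    if "error" in response:
        raise RuntimeError(f"node returned an error: {response['error']}")
    return int(response["result"], 16)


def get_balance(node_url: str, address: str) -> int:
    """POST the request to the chosen node (default or custom) and decode the answer."""
    body = json.dumps(build_balance_request(address)).encode("utf-8")
    request = urllib.request.Request(
        node_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as reply:
        return parse_balance_response(json.load(reply))
```

Because the node URL is just a parameter, the same code path works for a PK-default node and for a custom node chosen by the user.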

Currently ethereum is the main one in this space, so we'd only focus on this.

Proof of stake, higher-layers, cross chain assets, and token wrapping should be investigated further to understand how our sigchain may interact with blockchain systems.

CMCDragonkai commented 2 years ago

Another example is https://news.ycombinator.com/item?id=31643917 directly from Apple.

CMCDragonkai commented 2 years ago

More generally on the topic of authentication, I recently looked into PAKE and Zero Knowledge proofs.

I think these ideas are generally related to "Zero Trust" https://en.wikipedia.org/wiki/Zero_trust_security_model. Moving towards a world where we don't rely on trust that can easily be compromised.

One particular area that was interesting is the usage of passwords. There have always been developments that prematurely declared that the password is dead: https://en.wikipedia.org/wiki/Password#%22The_password_is_dead%22, however passwords have an intersection of properties that often makes them the best solution for a given situation.

One problem with passwords that I think should actually be solved but isn't, is the fact that passwords are passed as plaintext during initial sign-up and log-ins for many systems.

Most systems rely on transport-level encryption like TLS so that passwords are not exposed. However as systems are getting more complex, this is actually not enough:

  1. Passwords may be exposed by logging systems that log request metadata - apparently happened to Facebook
  2. Passwords may be re-used, and systems that get compromised end up leaking passwords resulting in second order effects
  3. TLS often gets terminated which leads to plaintext exposure of passwords between microservices: image
  4. Passwords can be exposed to employees, and moral hazard problems come into play

It seems like passwords are ripe for https://en.wikipedia.org/wiki/Tokenization_(data_security).

And in fact that's kind of what we already do, by exchanging passwords for session tokens.
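The exchange of a password for a session token can be sketched as follows. This is a minimal in-memory illustration with hypothetical names (`SessionStore`, `register`, `login`, `whoami`), not any particular system's design. Note that in this flow the plaintext password still reaches the server on register and login, which is exactly the exposure discussed above.

```python
import hashlib
import hmac
import secrets


class SessionStore:
    """Toy server: stores salted password hashes and hands out opaque session tokens."""

    def __init__(self):
        self._users = {}     # username -> (salt, password_hash)
        self._sessions = {}  # session token -> username

    @staticmethod
    def _hash(password: str, salt: bytes) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

    def register(self, username: str, password: str) -> None:
        salt = secrets.token_bytes(16)
        self._users[username] = (salt, self._hash(password, salt))

    def login(self, username: str, password: str):
        """Return a fresh random token on success, or None on failure."""
        record = self._users.get(username)
        if record is None:
            return None
        salt, expected = record
        if not hmac.compare_digest(self._hash(password, salt), expected):
            return None
        token = secrets.token_urlsafe(32)
        self._sessions[token] = username
        return token

    def whoami(self, token: str):
        """Later requests present only the token, never the password."""
        return self._sessions.get(token)
```

The token is random and unrelated to the password, so leaking it through a log or an internal service exposes one session rather than the reusable secret.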

But let's see if we can eliminate plaintext passwords entirely at the beginning of the entire interaction, at the sign-up/register and login.

This is where PAKE derivative protocols can be used.

This led me to the realisation that almost all secure systems follow a pattern of starting with asymmetry and ending in symmetry. For example when files get encrypted, there may be an asymmetric key used to first encrypt a symmetric key, and then the symmetric key is used to encrypt and decrypt the rest of the file contents. Asymmetric protocols tend to be chatty back-and-forth protocols like challenge and response, while symmetric protocols tend to be less chatty and more efficient. So to combine the best of both worlds, you do asymmetry at the beginning and amortise the cost of asymmetric usage over the lifetime of the symmetric usage.
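A toy sketch of that pattern: one asymmetric step (a finite-field Diffie-Hellman exchange) bootstraps a shared session key, and every later message only needs a cheap symmetric operation. The parameters here are assumptions chosen for illustration: `2**255 - 19` is prime, but this is not a vetted Diffie-Hellman group and must not be used for real traffic. The symmetric step uses HMAC tags rather than encryption only because the Python standard library has no cipher.

```python
import hashlib
import hmac
import secrets

# Illustrative parameters only, not a standardised or vetted group.
P = 2**255 - 19
G = 2


def generate_keypair():
    """Asymmetric part: a private exponent and the public value derived from it."""
    private = secrets.randbelow(P - 3) + 2  # uniformly in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def derive_session_key(my_private: int, their_public: int) -> bytes:
    """Both sides reach the same shared value, then hash it into a symmetric key."""
    shared = pow(their_public, my_private, P)
    return hashlib.sha256(shared.to_bytes(32, "big")).digest()


def tag(session_key: bytes, message: bytes) -> bytes:
    """Symmetric part: authenticate a message with the session key."""
    return hmac.new(session_key, message, hashlib.sha256).digest()


def verify(session_key: bytes, message: bytes, mac: bytes) -> bool:
    return hmac.compare_digest(tag(session_key, message), mac)


# One asymmetric exchange at the start of the session...
alice_private, alice_public = generate_keypair()
bob_private, bob_public = generate_keypair()
alice_key = derive_session_key(alice_private, bob_public)
bob_key = derive_session_key(bob_private, alice_public)

# ...then any number of cheap symmetric operations afterwards.
mac = tag(alice_key, b"hello")
```

The cost of the two modular exponentiations is paid once per session, while each message afterwards costs a single hash-based operation.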

And also the realisation that webauthn, web3 SSO and these PAKE-derived ideas are all efforts to somewhat replace the password with something that is more asymmetric.

Because the fundamental problem with symmetric systems is replay attacks. Once the symmetric secret is exposed, the resource is compromised. Asymmetric systems are fundamentally necessary to bootstrap a trusted communication channel over insecure channels. And all systems always start as an insecure channel. In fact, it's recursive: even if the underlying system is secure relative to the provider, it is not secure relative to the end parties communicating, because the provider can see the secrets.

So that's how this relates to the metamask, blockchain identity web3 sso stuff.

As a side-note, I asked this question https://crypto.stackexchange.com/q/100645/102416. But the answer I think lies here: https://crypto.stackexchange.com/questions/68750/timestamps-sequence-numbers-and-nonces-for-replay-attack?rq=1
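The nonce approach from that second link can be sketched as a challenge-response exchange in which each challenge is accepted at most once, so a captured response is useless when replayed. The names `Verifier` and `respond` are hypothetical. In a web3 or webauthn setting the response would be a signature made with the private key; HMAC with a shared key stands in here only because the Python standard library has no signature primitive. The nonce bookkeeping is the point of the sketch.

```python
import hashlib
import hmac
import secrets


def respond(key: bytes, nonce: bytes) -> bytes:
    """Prover side: bind the response to the specific challenge."""
    return hmac.new(key, nonce, hashlib.sha256).digest()


class Verifier:
    """Verifier side: issues fresh nonces and accepts each one at most once."""

    def __init__(self, key: bytes):
        self._key = key
        self._pending = set()

    def issue_challenge(self) -> bytes:
        nonce = secrets.token_bytes(16)
        self._pending.add(nonce)
        return nonce

    def verify(self, nonce: bytes, response: bytes) -> bool:
        if nonce not in self._pending:
            return False  # unknown or already used: treat as a replay
        self._pending.discard(nonce)  # consume the nonce whatever the outcome
        expected = hmac.new(self._key, nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)
```

A first valid response passes, and sending the identical nonce and response a second time fails because the nonce has been consumed.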

CMCDragonkai commented 2 years ago

I just had a look into how Lido and Rocketpool do their authentication process, and I can see an evolution in this technology.

Firstly wallets like ledger are both hardware and software wallets. So dapps like lido and rocketpool first coded up a direct authentication system to the hardware wallet. This requires working with the web HID (USB) and seems kind of clunky.

Then they supported software connections by directly using applinks. Again it requires custom coding.

Now they are supporting common protocols like walletconnect. This seems similar to "openid connect" in that it enables multiple wallets to join in for authentication. It means you don't need to write custom code for each wallet.

It's pretty interesting. Because now PK could act like that wallet, and designate vaults for the wallet connect protocol.

The walletconnect app link looks like this:

ledger://wc:c5882c94-d72c-4a26-a11c-5faa4adfa145@1?bridge=https%3A%2F%2Fk.bridge.walletconnect.org&key=f033dd371dcc64593e6a869123da90e4c19cddb07e983dd154314140121b194c
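A small sketch of pulling that link apart, assuming it follows the WalletConnect v1 layout of `wc:<topic>@<version>?bridge=<url>&key=<hex key>` with a wallet-specific deep-link prefix (`ledger://`) in front. The function name and the returned field names are hypothetical.

```python
from urllib.parse import parse_qs

LINK = (
    "ledger://wc:c5882c94-d72c-4a26-a11c-5faa4adfa145@1"
    "?bridge=https%3A%2F%2Fk.bridge.walletconnect.org"
    "&key=f033dd371dcc64593e6a869123da90e4c19cddb07e983dd154314140121b194c"
)


def parse_walletconnect_link(link: str) -> dict:
    """Split an app link into the wallet prefix, session topic, version, bridge and key."""
    start = link.find("wc:")
    if start == -1:
        raise ValueError("not a walletconnect link")
    prefix = link[:start]
    if prefix.endswith("://"):
        prefix = prefix[: -len("://")]
    head, _, query = link[start + len("wc:"):].partition("?")
    topic, _, version = head.partition("@")
    params = parse_qs(query)  # also percent-decodes the bridge URL
    return {
        "app": prefix or None,
        "topic": topic,
        "version": version,
        "bridge": params["bridge"][0],
        "key": bytes.fromhex(params["key"][0]),
    }


session = parse_walletconnect_link(LINK)
```

For the link above this yields the app prefix `ledger`, the bridge `https://k.bridge.walletconnect.org`, and a 32-byte key.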

CMCDragonkai commented 1 year ago

Some notes on web3.

CMCDragonkai commented 1 year ago

Was talking with Cummins about supply chain trust for physical products like engines. It breaks down into 2 problems:

This results in Hybrid Supply Chains: a physical supply chain and a virtual chain. Kind of a digital twin for physical supply chains, which means diagnostics, reports, maintenance, authorised sign-offs.

Zero trust basically!

CMCDragonkai commented 5 months ago

@CryptoTotalWar created https://github.com/MatrixAI/Polykey-Enterprise/issues/23 to focus on passkey support.

I believe that passkeys and web3 wallet authentication are really the same kind of thing, but coming from different ecosystems. These 2 things should be investigated together in order to ensure alignment in any design for Polykey. That PKE issue is also misplaced; the information should be merged here.

CMCDragonkai commented 5 months ago

Corentin talked about this https://blog.millerti.me/2023/01/22/encrypting-data-in-the-browser-using-webauthn/

Outside of the world of webauthn (which is browser based), that's called the hmac-secret extension in the CTAP protocol.
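The core of hmac-secret is that the authenticator keeps a per-credential secret (CredRandom) and returns HMAC-SHA-256 of a caller-supplied 32-byte salt under it, so the same salt always yields the same output without the secret ever leaving the device. Below is a minimal simulation of that idea with a hypothetical `FakeAuthenticator` class. It leaves out the encrypted transport between platform and authenticator, and the HKDF step for turning the output into an encryption key is an assumption about how one might use it, not part of CTAP.

```python
import hashlib
import hmac
import secrets


class FakeAuthenticator:
    """Stand-in for a hardware key: CredRandom never leaves this object."""

    def __init__(self):
        self._cred_random = secrets.token_bytes(32)

    def hmac_secret(self, salt: bytes) -> bytes:
        if len(salt) != 32:
            raise ValueError("hmac-secret salts are 32 bytes")
        return hmac.new(self._cred_random, salt, hashlib.sha256).digest()


def hkdf_sha256(ikm: bytes, info: bytes, salt: bytes = b"\x00" * 32) -> bytes:
    """Single-block HKDF (RFC 5869) producing 32 bytes of key material."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()               # extract
    return hmac.new(prk, info + b"\x01", hashlib.sha256).digest()    # expand, first block


def derive_data_key(authenticator: FakeAuthenticator, salt: bytes) -> bytes:
    """Turn the authenticator output into a key for encrypting data at rest."""
    return hkdf_sha256(authenticator.hmac_secret(salt), b"example data key")
```

An application would store the salt next to the ciphertext and ask the authenticator for the same output again whenever it needs to decrypt.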