Joystream / orion

Atlas backend
GNU General Public License v3.0

User Accounts #60

Closed bedeho closed 7 months ago

bedeho commented 1 year ago

Background

Currently there is no representation of users of any kind in the backend. This limits us from effectively doing a number of things:

  1. Rate limiting & denial of service protection
  2. Rich off-chain resource consumption accounting
  3. Enhance richness and veracity of off-chain public data:
     3.1 consumption: who watched what part of what content when
     3.2 followers: who followed whom when
     3.3 profiling/who: https://en.wikipedia.org/wiki/Age/sex/location
  4. Gateway integration on consumption: https://github.com/Joystream/joystream/issues/2089
  5. Server-side services & personalisation

The moment we have accounts, it instantly raises the question of how people will authenticate.

Currently Orion requires the application on the user device to manage custody and signing with private keys. This means that the application developer will have to do considerable heavy lifting if they want to deliver the familiar and low friction Web2.0 onboarding experience for new users while retaining the principle of secure self-custody of all assets ($JOY, $CRT, memberships, NFTs, etc).

The latter clause is really pivotal, because a custodial solution where the operator is in effect holding control over these assets in a hot wallet has a number of serious downsides: a) operator takes on major cybersecurity and operational risk of losing funds, either through theft or loss of access. b) operator takes on possible current and future regulatory burdens of holding assets, which may or may not trigger money, payment or sanction related regulations across different users. c) raises the bar for any user to be willing to trust an app.

The goal above becomes really difficult when it comes to applications that run in the browser, as the only way to avoid the high friction model based on installing and signing with an external signer would be to have the user re-enter their key on every session, which is very cumbersome.

Proposal

We introduce three distinct types of user accounts as described.

Ephemeral

Meant for blockchain read-only user experience, typically one time, without much server-side support for personalisation. Think of typical mode of going to Twitter, SoundCloud or Youtube while not logging in. No human involved in whatever registration or authentication exists, but these steps are still going to be required to make data more useful on server-side.

Off-chain

Meant for blockchain read-only user experience, typically lasting indefinitely, but with considerable server-side (read&write) support for personalisation. Think of the typical model of logging in to the services above, but fully restricted in terms of publishing any content that is visible to other users. This means a much richer set of personalisation on server-side will exist:

Authentication works using normal email+password combination.

Membership

Meant for full blockchain read-write user experience, typically lasting indefinitely, with full server-side support similar to off-chain accounts. Involves having an on-chain membership which has a corresponding off-chain membership server-side state, including email+password authentication information. This means additional set of personalisation features will exist:

Since a membership account also has a corresponding off-chain account, whatever API needs authentication as a member, e.g. managing favourites, can use the same credentials as in the off-chain case.

Key Management for Web 2.0 Signing

While we should certainly retain the option for membership-based accounts to do their own client-side key management, new users can choose to create their account using a sort of hybrid security model like Hedgehog, which does trade off security of the wallet itself, but radically enhances convenience while retaining self-custody. This sort of scheme requires backend support which either is part of or works with Orion. The simplest approach would be to use the existing email+password credentials for the off-chain account. There may be room to augment this scheme or make it more secure by combining it with Shamir secret sharing, where one piece is sent to the user's email, another is stored on the server and a third has to be remembered by the user, like a PIN or something, but anyway, that may be overkill to begin with.

Users can then later export or graduate out of the less secure sort of membership to a full external wallet when the value at risk becomes sufficiently large, and the application can help them become aware of this risk over time.

Lezek123 commented 1 year ago

Orion v2 user accounts & authentication API draft

The draft mostly focuses on the implementation of the authentication API and a related database model, it is not a complete description of user accounts functionality in general.

Especially it doesn't fully account for:

Terms

The database model

User {
  # Unique identifier (32-byte string, securely random)
  id: ID!

  # Whether the user has root (gateway operator) privileges
  isRoot: Boolean!

  # The account associated with the user (if any)
  account: Account
}

Session {
  # Unique identifier (32-byte string, securely random)
  id: ID!

  # Browser (as determined based on user-agent header)
  browser: String!

  # Operating system (as determined based on user-agent header)
  os: String!

  # Device (as determined based on user-agent header)
  device: String!

  # Device type (as determined based on user-agent header)
  deviceType: String

  # User associated with the session
  user: User!

  # IP address associated with the session
  ip: String! @index

  # Time when the session started
  startedAt: DateTime!

  # Time when the session expires or did expire 
  expiry: DateTime!
}

Account {
  # Unique identifier (can be sequential)
  id: ID!

  # The user associated with the account (the account owner)
  user: User!

  # Account's e-mail address
  email: String! @unique

  # Indicates whether the account's e-mail has been confirmed or not.
  # The account is not accessible by the account owner until the e-mail is confirmed.
  isEmailConfirmed: Boolean!

  # Indicates whether the access to the user account is blocked
  isBlocked: Boolean!

  # User's password bcrypt hash
  passwordHash: String!

  # Time when the account was registered
  registeredAt: DateTime!

  # Membership associated with the account (if any)
  membership: Membership
}

enum TokenType {
  EMAIL_CONFIRMATION = 0,
  PASSWORD_RESET = 1
}

Token {
  # The token itself (32-byte string, securely random)
  id: ID!

  # Type of the token (its intended purpose)
  type: TokenType!

  # When was the token issued
  issuedAt: DateTime!

  # When does the token expire or when has it expired
  expiry: DateTime!

  # The account the token was issued for
  issuedFor: Account!
}

User entity

User entity is the most basic representation of a Client App / Orion v2 user, it can be either an anonymous user (have no related Account) or an account owner.

Each User has a securely random id (32-byte string) assigned on creation, which can be stored on user's device (for example, in Browser's local storage) or shared across multiple devices in order to authenticate the user using anonymous authentication and preserve some information about their activity on the platform.
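A minimal sketch of how such an id could be generated. The draft only says "32-byte string, securely random"; the hex encoding and the function name `new_user_id` are assumptions made for illustration.

```python
import secrets


def new_user_id() -> str:
    """Return 32 bytes from the OS CSPRNG, hex-encoded (64 characters).

    Hex encoding is an assumption; the draft does not specify one.
    """
    return secrets.token_hex(32)
```

Because the id doubles as the anonymous authentication secret, it should come from a cryptographically secure source such as `secrets`, not from `random`.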

A User can be associated with activities such as:

Example functionality that can be enabled for an anonymous User:

We may choose not to provide all of those features to anonymous Users, but it should be possible to at least collect activity data related to those features, which would later be preserved once the user creates an account (and becomes account owner), because of the User <=> Account association.

Important: id of a User that has been associated with an Account can no longer be used to authenticate as anonymous user (ie. cannot be used for anonymous authentication).

Session entity

Session represents a period of activity of a User that interacts with the Client App or Orion v2 API directly, during which the user can perform authenticated requests (either as anonymous user or account owner) and access the GraphQL API.

For more information about sessions see Sessions and authenticated requests.

Account entity

An Account represents a user account which can be accessed by the user by providing a valid combination of an e-mail and a password.

When the user account is created, it is initially inactive until the e-mail address associated with that account is confirmed by the user.

The Account can also be associated with a Membership (see: Connecting account to a membership)

Token entity

Token represents a unique, securely random string generated by the Auth API for a given Account to allow:

Each token has an expiry date which depends on the Orion v2 configuration (see: Configuration variables).
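The token lifecycle could be sketched roughly as follows. This is an illustration under assumptions, not the draft's implementation: `TOKEN_LIFETIME`, the in-memory `TOKENS` dict (standing in for the database table) and the single-use behaviour in `redeem_token` are all hypothetical.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Dict, Optional


class TokenType(Enum):
    EMAIL_CONFIRMATION = 0
    PASSWORD_RESET = 1


@dataclass
class Token:
    id: str            # the token itself
    type: TokenType
    issued_at: datetime
    expiry: datetime
    issued_for: str    # Account id


TOKEN_LIFETIME = timedelta(hours=24)  # hypothetical configuration variable
TOKENS: Dict[str, Token] = {}         # stand-in for the Token table


def issue_token(account_id: str, token_type: TokenType,
                now: Optional[datetime] = None) -> Token:
    now = now or datetime.now(timezone.utc)
    token = Token(secrets.token_hex(32), token_type, now,
                  now + TOKEN_LIFETIME, account_id)
    TOKENS[token.id] = token
    return token


def redeem_token(token_id: str, expected_type: TokenType,
                 now: Optional[datetime] = None) -> Optional[str]:
    """Return the account id if the token is valid, otherwise None."""
    now = now or datetime.now(timezone.utc)
    token = TOKENS.get(token_id)
    if token is None or token.type is not expected_type or now >= token.expiry:
        return None
    del TOKENS[token_id]  # assumption: a token can be used only once
    return token.issued_for
```

Checking the token type on redemption keeps an e-mail confirmation token from being accepted as a password reset token.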

Configuration variables

Auth API

The authentication api is a REST api, separate from the GraphQL API (the main Orion v2 api), which is being secured by it.

This approach can also be called out-of-band authentication, to distinguish it from in-band authentication, which would be an authentication implemented as part of the same GraphQL api that is being secured by it.

The OpenAPI schema draft of the Auth API can be found here

The autogenerated Markdown documentation can be found here

Anonymous authentication

Anonymous authentication is the type of authentication used for any user who either doesn't have an account or is not currently logged-in to their account.

In order to authenticate user as an anonymous user, the Client App should send a POST request to /anonymous-auth, optionally providing { "userId": "{locally_stored_user_id}" } in the request body, if it already has access to a locally stored userId.
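The server-side handling of this request might look like the sketch below. The in-memory sets and dict stand in for the database, and the response shape (`userId`, `sessionId`) as well as the rejection of unknown ids are assumptions; the only rule taken from the draft is that an id already associated with an Account cannot be used here.

```python
import secrets
from typing import Dict, Optional, Set

USERS: Set[str] = set()           # all known User ids
ACCOUNT_OWNERS: Set[str] = set()  # User ids that have an associated Account
SESSIONS: Dict[str, str] = {}     # sessionId -> userId


class AuthError(Exception):
    pass


def anonymous_auth(user_id: Optional[str] = None) -> dict:
    """Handle POST /anonymous-auth with an optional locally stored userId."""
    if user_id is None:
        # First visit: create a fresh anonymous User.
        user_id = secrets.token_hex(32)
        USERS.add(user_id)
    elif user_id not in USERS or user_id in ACCOUNT_OWNERS:
        # Unknown id (assumed to be rejected) or id bound to an Account.
        raise AuthError("userId cannot be used for anonymous authentication")
    session_id = secrets.token_hex(32)
    SESSIONS[session_id] = user_id
    return {"userId": user_id, "sessionId": session_id}
```

A returning visitor keeps the same `userId` and only receives a new `sessionId`, which is what lets activity data accumulate across sessions.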

Important: The userId provided for this kind of authentication cannot be associated with an existing account. This means that any userId stored in the local storage should be instantly removed after either:

In response to a successful anonymous authentication, the auth server sends back a JSON object containing:

Account owner authentication

For users who do own an account (account owners), a standard email-password authentication can be used. To perform it, a POST /login request should be made, including a JSON object in a body, which contains:

In response, a sessionId is returned which can be used to perform authenticated requests.

User account creation

New accounts can be created using the /register endpoint. A POST request should be sent with a JSON object in the body, containing:

E-mail confirmation

Before a user will be able to access their newly registered account, they will first need to confirm their e-mail address. In order to do that they will have to provide a token that was sent to their e-mail address by Auth API.

The confirmation process on the Client app side can look like this:

Additionally:

Password reset

Account password reset is a two step process.

First the account owner has to request a password reset token, which will be sent to their e-mail provided during registration.

This can be done by sending POST /request-password-reset-token request with { "email": "{account's-email-address}" } in the body.

If the request was successful, an e-mail will be sent to the provided e-mail address, containing a token which will allow the account owner to change their password.

In order to do this, a POST /reset-password request has to be made, containing the following properties in the JSON object provided in the request body:

Additionally:

The Client app can facilitate this process in the following ways:

Connecting account to a membership

An account can be connected to an on-chain membership, this is a 2-step process, which requires:

  1. Sending a MemberRemark transaction with a new metaprotocol message like:

    message RequestGatewayAccountBinding {
      optional string gateway_id = 1;
      optional string account_id = 2;
    }
    
    message MemberRemarked {
      oneof member_remarked {
        // ...
        RequestGatewayAccountBinding request_gateway_account_binding = 42; // Actual index not yet determined
      }
    }

    The gateway_id should be a standardized, unique identifier of a gateway (like Gleev, L1Media etc.)

  2. Sending a user-account authenticated POST /prove-membership request containing id of the memberRemark transaction issued in step 1.

Once this is done, Account.membership will be set, possibly unlocking features such as on-chain activity notifications etc.
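The check performed in step 2 could be sketched as below. It assumes the remark referenced by the transaction id has already been found and decoded by the processor (that lookup is omitted); `MemberRemark`, `GATEWAY_ID` and `resolve_membership` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

GATEWAY_ID = "example-gateway"  # hypothetical identifier of this gateway


@dataclass
class MemberRemark:
    """Decoded RequestGatewayAccountBinding plus the member who sent it."""
    member_id: str
    gateway_id: Optional[str]
    account_id: Optional[str]


def resolve_membership(remark: Optional[MemberRemark],
                       authenticated_account_id: str) -> Optional[str]:
    """Return the member id to store in Account.membership, or None."""
    if remark is None:
        return None  # no such remark was processed
    if remark.gateway_id != GATEWAY_ID:
        return None  # binding was requested for a different gateway
    if remark.account_id != authenticated_account_id:
        return None  # binding was requested for a different account
    return remark.member_id
```

Requiring both identifiers to match prevents a remark addressed to another gateway or another account from being replayed against this one.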

Sessions and authenticated requests

Bearer authentication is used as an authentication scheme.

The access token in this case is a sessionId, which must be provided in the Authorization header, ie.: Authorization: Bearer <sessionId>.

Upon receiving an authenticated request, the server reads session information associated with the provided sessionId, either directly from the database (which is shared between GraphQL API server and the Auth API server) or from a memory cache.

Each session, besides being associated with a specific user (either an anonymous user or account owner), includes the following information:

This information is then validated on the server side. It is required that:

This basically means that ip, browser, os and device should not change during the course of a given session. In case any of those change, a re-authentication is required.
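A sketch of the per-request check described above, assuming the user-agent header has already been parsed into `browser`, `os` and `device` (the parsing itself is out of scope here). The `RequestContext` type and the `authenticate` function are illustrative names, and the `sessions` dict stands in for the shared database or memory cache.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class Session:
    id: str
    user_id: str
    ip: str
    browser: str
    os: str
    device: str
    expiry: datetime


@dataclass
class RequestContext:
    authorization: str  # raw Authorization header value
    ip: str
    browser: str
    os: str
    device: str


def authenticate(req: RequestContext, sessions: Dict[str, Session],
                 now: Optional[datetime] = None) -> Optional[str]:
    """Return the user id for a valid request, or None if re-auth is needed."""
    scheme, _, session_id = req.authorization.partition(" ")
    if scheme != "Bearer" or not session_id:
        return None
    session = sessions.get(session_id)
    if session is None:
        return None
    now = now or datetime.now(timezone.utc)
    if now >= session.expiry:
        return None
    same_client = (req.ip, req.browser, req.os, req.device) == (
        session.ip, session.browser, session.os, session.device)
    return session.user_id if same_client else None
```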

This solution makes it possible to track the activity of a given User more accurately and adds an additional layer of security, as even a stolen sessionId would be useless unless the attacker can make requests from the user's ip.

The downside of this approach is the negative impact on UX for mobile users, as their ip address may change frequently if they're traveling, forcing them to keep re-authenticating. This means alternative approaches should be considered as well.

Session expiry

A session can expire:

Accessing the GraphQL API

All requests to the GraphQL api should be authenticated requests, regardless of whether they are queries or mutations.

Of course different requests may still require different privileges, ie. some mutations like setSupportedCategories will be only accessible by root user etc.

bedeho commented 1 year ago

Fantastic initial work, this is really comprehensive. I won't dive into all the details, as I have limited time, however

  1. Connecting account to a membership: why would we put a transaction on chain for this? Accounts are meant to be local concepts, they are not really portable across apps. I could see a future where information about users is shared with other actors in the ecosystem, e.g. to compile consumption data for use with payouts, however, this sort of public binding seems to not be useful. If it's just about binding, just a local signing interaction would do.
  2. Users will for the most part be ephemeral, i.e. someone who just hit the site and doesn't sign up. But then, doesn't there need to be some sort of cross session authentication info, perhaps identifying the host also, so that one ephemeral user cannot masquerade as another?
  3. Would the scheme proposed here allow a user to prove to Argus that they indeed are a user of some app? Would verification of this require interaction between Argus and Orion either once per user or once per session?
Lezek123 commented 1 year ago

Ad. 1. I initially thought about just using a message signed by the member's controller account, but I thought it's a little bit less secure, since member controller account may change and Orion may be temporarily out of sync with the current chain state.

Ad. 2. The authentication key working across sessions, as described in the draft above, is the user id, which is a random 32-byte string stored in the browser's local storage. It should be enough to prevent users from stealing each other's identity.

Ad. 3. Since this isn't something I was considering, it would currently require interaction between Argus and Orion once per session.

Lezek123 commented 1 year ago

Re-posting a comment from Discord w.r.t. Hedgehog user authentication approach:

So in my opinion the encryption used for the artifacts being stored on the server looks solid and even seems like a bit of an overkill, although I'm not a cryptography expert. I think the weakest points of that scheme are not really related to encryption and are difficult, if not impossible to avoid while trying to provide a Web2-like UX, ie.:

  • storing the entropy in the browser's local storage,
  • ensuring user chooses a strong-enough password when creating an account.

In that regard, this approach is not really safer than using https://polkadot.js.org/apps without an extension for example, because PolkadotJS Apps at least require the locally stored seed to be encrypted with a password, which has to be provided before signing each transaction, while Hedgehog's approach is to just store an unencrypted seed. Of course we can choose to protect the locally stored seed with an additional password as well, but it's a tradeoff between convenience and security and it (again) all comes down to how secure of a password the user chooses to use in the end.

So the only real benefit of this approach as far as I understand is the fact that because the encryption artifacts are stored server-side, the user doesn't need to copy-paste, drag-and-drop or otherwise provide anything other than their username and password. As soon as we violate this rule, for example, by forcing the user to provide some additional random token that we send to their e-mail, the server-side complexity required to implement this solution no longer pays off in my opinion.

bedeho commented 1 year ago

Ad. 1. I initially thought about just using a message signed by the member's controller account, but I thought it's a little bit less secure, since member controller account may change and Orion may be temporarily out of sync with the current chain state.

I'm not able to really come up with a concrete problem scenario of what could happen, what effects would be, and why this fix solves it.

Ad. 2. The authentication key working across sessions, as described in the draft above, is the user id, which is a random 32-byte string stored in the browser's local storage. It should be enough to prevent users from stealing each other's identity.

Got it, makes sense.

Ad. 3. Since this isn't something I was considering, it would currently require interaction between Argus and Orion once per session.

Yeah this would be very useful to have.

bedeho commented 1 year ago

So the only real benefit of this approach as far as I understand is the fact that because the encryption artifacts are stored server-side, the user doesn't need to copy-paste, drag-and-drop or otherwise provide anything other than their username and password.

Agreed.

As soon as we violate this rule, for example, by forcing the user to provide some additional random token that we send to their e-mail, the server-side complexity required to implement this solution no longer pays off in my opinion.

This may be right, will have to look again.

Lezek123 commented 1 year ago

I'm not able to really come up with a concrete problem scenario of what could happen, what effects would be, and why this fix solves it.

One scenario could be:

  1. The event processor dies due to a bug
  2. The state in Orion's db becomes stale at block x
  3. Meanwhile a member (let's say Alice) finds out their controller key may have been compromised and changes the key.
  4. Orion is not aware of the change and accepts a signature from an attacker.
  5. Attacker's account becomes associated with Alice membership

Now if the only purpose of associating an account with a membership is to send some notifications, then it's not really an issue, but if there are other member-specific features in Orion this could be a problem.

The way on-chain transaction solves the issue is that in this case it's impossible to use a stale member controller key, as such transaction would be rejected by the runtime.

Lezek123 commented 1 year ago

Hedgehog-like key management overview

Terms explanation

Hedgehog encryption functions

Hedgehog encryption variables / constants

| Name | Variability | Generation method | Generated by | Stored in | Part of any request? | Risks |
|---|---|---|---|---|---|---|
| username | Different for each user | Provided by the user (e-mail in Orion v2) | User | Server's (Orion) database | Yes | Makes it easier to brute-force lookupKey and find cipherText associated with given user in case of database leak |
| password | Unknown, depends on the users | Provided by the user | User | Not stored | No | Gives full access to user's account and funds |
| entropy | Different for each user | Securely random bytes | Client app (Atlas) | Browser's local storage | No | Gives full access to user's account and funds |
| cipherIv | Different for each user | Securely random bytes | Client app (Atlas) | Server's (Orion) database | Yes | If exposed together w/ cipherText, makes it possible to try to brute-force it to retrieve entropy |
| cipherKey | Different for each user | scrypt(password, cipherIv) | Client app (Atlas) | Not stored | No | If exposed together with cipherIv and cipherText, allows retrieving the entropy and gaining full access to user's account and funds |
| cipherText | Different for each user | AES-256-CBC('hedgehog-entropy:::' + entropy, cipherKey, cipherIv) | Client app (Atlas) | Server's (Orion) database | Yes | If exposed together w/ cipherIv, can be brute-forced (but the cost of each attempt is very high) |
| lookupKeySalt | Constant | Constant set to 0x4f7242b39969c3ac4c6712524d633ce9 in Hedgehog | Hedgehog author (?) | Hedgehog library code | No | Makes it possible to try to brute-force lookupKey and use it to request encryption artifacts from the server. Mitigated by rate limiting. |
| lookupKey | Different for each user | scrypt(username + ':::' + password, lookupKeySalt) | Client app (Atlas) | Server's (Orion) database | Yes | Makes it possible to try to brute-force user's password, especially if username is known |
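The two scrypt derivations listed above can be sketched with the Python standard library. The scrypt cost parameters, the byte encodings of the inputs and the 16-byte entropy size are assumptions chosen for illustration; the thread does not state what Hedgehog actually uses. The AES-256-CBC step that produces cipherText is omitted because it needs a third-party library.

```python
import hashlib
import secrets

# Constant quoted in the table above.
LOOKUP_KEY_SALT = bytes.fromhex("4f7242b39969c3ac4c6712524d633ce9")

# Illustrative cost parameters (assumption, not taken from Hedgehog).
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, dklen=32)


def lookup_key(username: str, password: str) -> str:
    """scrypt(username + ':::' + password, lookupKeySalt), hex-encoded."""
    data = f"{username}:::{password}".encode()
    return hashlib.scrypt(data, salt=LOOKUP_KEY_SALT, **SCRYPT_PARAMS).hex()


def cipher_key(password: str, cipher_iv: bytes) -> bytes:
    """scrypt(password, cipherIv): a 32-byte key suitable for AES-256."""
    return hashlib.scrypt(password.encode(), salt=cipher_iv, **SCRYPT_PARAMS)


def new_registration_secrets() -> dict:
    """Random values generated client-side at registration time."""
    return {
        "entropy": secrets.token_bytes(16),   # size is an assumption
        "cipherIv": secrets.token_bytes(16),  # AES-CBC uses a 16-byte IV
    }
```

Note that lookupKey is deterministic for a given username and password, while cipherKey additionally depends on the per-user random cipherIv.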

Other questions

  1. Do we actually need cipherText? Why can't we just use something like cipherKey as a wallet seed?
  2. Can we avoid storing entropy in the local storage, as this is probably the weakest point of the scheme.

Database schema

In Hedgehog docs it is recommended for a database to follow a schema like:

type Authentication {
    lookupKey: String! @primaryKey
    cipherIv: String!
    cipherText: String!
}

type User {
    username: String! @primaryKey
    walletAddress: String!
    # ... other user data
}

There is some emphasis being put on the separation between the two tables:

Username should be stored separately from auth artifacts in different tables. The table containing the authentication values should be independent with no relation to the table storing username

Why no relation between Authentications and Users tables?

The rationale given for storing the user and authentication data separately, and with no relation to each other, given in the official documentation is the following:

It's important that the username is not stored in the Authentications table because the lookupKey is a scrypt hash of a predefined iv with an username and password combination. If the data in these tables were ever exposed, susceptibility of a rainbow table attack could increase because the password is the only unknown property.

This part, however, is not very clear to me. Assuming the data in both tables was exposed (ie. the attacker gained access to the database), I don't think it would make much of a difference whether lookupKey, cipherIv and cipherText are stored together w/ user data or not, because:

  1. In that case (db leak) the attacker already knows all possible usernames and the public addresses of the wallets associated w/ those usernames.
  2. Since lookupKey is scrypt(username + ':::' + password, lookupKeySalt), where lookupKeySalt is a hardcoded Hedgehog constant, easily accessible to the attacker, the only unknown input that remains is still the user's password. The attacker can compute scrypt(username + ':::' + password, lookupKeySalt) by guessing different passwords for each chosen user (perhaps starting with those having the highest amount of funds in the wallet) and if they guess a correct one, there will be a matching lookupKey in the Authentications table (there doesn't need to be any direct relationship between the tables).
  3. Once they get a matching lookupKey, they can easily retrieve the related cipherIv and cipherText and decrypt the cipherText using a combination of cipherIv and scrypt(password, cipherIv) (since they already know all the inputs at this point).

This attack vector is available regardless of whether the data in Users and Authentications table is stored together or separately.
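The argument can be illustrated with a small simulation of a leaked database. The usernames, passwords and candidate list are made up, and the scrypt cost is deliberately lowered (an assumption) so the example finishes quickly; the point is only that matching works without any relation between the two tables.

```python
import hashlib
from typing import Dict, Iterable, Set

LOOKUP_KEY_SALT = bytes.fromhex("4f7242b39969c3ac4c6712524d633ce9")


def lookup_key(username: str, password: str) -> str:
    # Cost parameters reduced for the demo; real parameters would be higher.
    data = f"{username}:::{password}".encode()
    return hashlib.scrypt(data, salt=LOOKUP_KEY_SALT,
                          n=2**10, r=8, p=1, dklen=32).hex()


def crack(leaked_usernames: Iterable[str], leaked_lookup_keys: Set[str],
          candidate_passwords: Iterable[str]) -> Dict[str, str]:
    """Guess passwords per username and test membership in the leaked keys."""
    candidates = list(candidate_passwords)
    hits: Dict[str, str] = {}
    for username in leaked_usernames:
        for password in candidates:
            if lookup_key(username, password) in leaked_lookup_keys:
                hits[username] = password
                break
    return hits
```

For example, with a Users table containing `alice@example.com` and an unrelated Authentications table containing only her lookupKey, a dictionary that includes her password is enough to link the two.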

Another attack vector is to try to guess scrypt(password, cipherIv) that would decipher the cipherText. Since all cipherIv's are known to the attacker in case of a database leak, they can try different passwords for each record, and if the retrieved entropy can generate a wallet that matches one of the wallets in the Users table, it means they got a hit.

There are in general a few benefits and drawbacks in this approach that I can think of, but without knowing the details of what motivated Audius team to make this decision it's hard for me to make a judgement.

Benefits

Drawbacks

API

Registration

Audius/Hedgehog approach

Hedgehog uses 2 separate requests to create an account for the user:

This is another example of the emphasis being put on the separation of Users and Authentications data, even within the HTTP requests.

Orion v2 approach

We may choose to follow the approach above or alternatively make it a single request if we decide not to maintain the Users and Authentications data separation.

Authentication

Audius/Hedgehog approach

The way Authentication is implemented in Audius (which relies on Hedgehog) is that the account ownership is proved by the user by signing a message using the private key associated with the public key stored in the User table.

The flow is therefore:

There is no concept like Session in Audius, and there is no time limit for which a given signed message remains a valid proof of account ownership.

Orion v2 approach

We should probably no longer have the user send POST request which includes a plaintext password in order to authenticate, as this would lead to loss of funds if the server was ever to be breached by the attacker.

Therefore I see 2 potential approaches:

Client-side hashing

Generally comes down to this:

Now if we didn't have Users-Authentications data separation, we could just authenticate a user based on the lookupKey, since lookupKey is effectively scrypt(username + ':::' + password, salt), so no one who doesn't know the user's username and password should be able to provide one.

Benefits:

Drawbacks:

Message signing

We could also use an approach similar to the one Audius uses, however I think that the message the user is required to sign should have a specific structure and that a given signature should only last for a given period of time to increase security.

I imagine the following interaction between the client app and Orion:

  1. Client app sends an initial request, providing either the public key or the username of the user it wishes to authenticate.
  2. Orion responds with a challenge, like a random string that needs to be signed
  3. Client app signs the string and sends it back to Orion, Orion can now initiate a new Session for the user
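The three steps above could be sketched as follows. Signature verification is passed in as a callback because the standard library has no support for the chain's signature scheme; the HMAC-based `demo_sign` / `demo_verify` pair is only a stand-in so the sketch runs, not a real public-key signature. `CHALLENGE_TTL_SECONDS` and all names are hypothetical.

```python
import hashlib
import hmac
import secrets
import time
from typing import Callable, Dict, Optional, Tuple

CHALLENGE_TTL_SECONDS = 60.0  # hypothetical configuration value

# (public_key, message, signature) -> bool
Verifier = Callable[[str, bytes, bytes], bool]


class ChallengeAuth:
    def __init__(self, verify: Verifier) -> None:
        self._verify = verify
        self._pending: Dict[str, Tuple[bytes, float]] = {}

    def start(self, public_key: str, now: Optional[float] = None) -> bytes:
        """Steps 1-2: store and return a fresh random challenge."""
        now = time.time() if now is None else now
        challenge = secrets.token_bytes(32)
        self._pending[public_key] = (challenge, now + CHALLENGE_TTL_SECONDS)
        return challenge

    def finish(self, public_key: str, signature: bytes,
               now: Optional[float] = None) -> Optional[str]:
        """Step 3: return a new sessionId if the signature checks out."""
        now = time.time() if now is None else now
        entry = self._pending.pop(public_key, None)  # usable only once
        if entry is None:
            return None
        challenge, deadline = entry
        if now > deadline or not self._verify(public_key, challenge, signature):
            return None
        return secrets.token_hex(32)


# Stand-in "signature" for demonstration only (NOT public-key cryptography).
DEMO_KEYS = {"demo-public-key": b"demo-secret"}


def demo_sign(public_key: str, message: bytes) -> bytes:
    return hmac.new(DEMO_KEYS[public_key], message, hashlib.sha256).digest()


def demo_verify(public_key: str, message: bytes, signature: bytes) -> bool:
    if public_key not in DEMO_KEYS:
        return False
    return hmac.compare_digest(demo_sign(public_key, message), signature)
```

Removing the challenge on the first attempt, successful or not, is a design choice that prevents a captured signature from being replayed.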

Alternatively:

  1. Client app sends a signed message to Orion, containing, for example, a hash of a block no older than X seconds
  2. Orion verifies that the hash of the block provided is indeed no older than X seconds (it's possible to do this quickly, since Orion v2 includes a local processor which could store this information in a shared, local Redis database) and initiates a new Session
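The freshness check in the alternative flow could be sketched like this. The `RecentBlocks` class is a hypothetical in-memory stand-in for the shared Redis database written by the processor, `MAX_BLOCK_AGE_SECONDS` plays the role of X, and verification of the signature itself is omitted.

```python
import time
from typing import Dict, Optional

MAX_BLOCK_AGE_SECONDS = 30.0  # "X" in the text; hypothetical value


class RecentBlocks:
    """Maps block hash -> time at which the processor observed the block."""

    def __init__(self) -> None:
        self._seen_at: Dict[str, float] = {}

    def record(self, block_hash: str, timestamp: float) -> None:
        self._seen_at[block_hash] = timestamp

    def is_fresh(self, block_hash: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        seen_at = self._seen_at.get(block_hash)
        # Unknown hashes are rejected, as are blocks older than the limit.
        return seen_at is not None and 0 <= now - seen_at <= MAX_BLOCK_AGE_SECONDS
```

Compared with the challenge-response flow, this needs one request instead of two, at the cost of a signed message remaining replayable until the referenced block ages out.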

Benefits:

Drawbacks:

Forgot password functionality

Hedgehog library

Hedgehog library documentation states that resetting a password without losing access to funds is not possible, ie.:

If a user loses their password, the account is no longer recoverable. There's no way to reset a password because the entropy is encrypted client side before it's sent to the database. And since the old password is required to decrypt the entropy and re-encrypt with a new password, if the password used to encrypt the entropy has been lost or forgotten, the account is not recoverable.

Audius

Regardless of what the Hedgehog docs state, Audius actually allows resetting a password (https://github.com/AudiusProject/audius-protocol/pull/129) via an e-mail received during registration, which contains base-64 encoded entropy.

Note that this contradicts other security practices implemented in Hedgehog. No other request made to the server, except the request which generates this password recovery link, contains entropy or password in any form that can be easily decoded. This is a huge shift in security level, as it means an attacker who gains access to the server would be able to access funds in any account created after the breach without doing any expensive computation, which wouldn't have been possible otherwise. Moreover, because this data is sent to Sendgrid (e-mail delivery api) and then user's e-mail, this opens up a few new other attack vectors.

Orion v2

I think in Orion v2 / Atlas we don't want to follow this approach, but instead either:

Connecting external wallets

Audius

In Audius a user can sign a message containing their Audius user id, from another wallet, in order to connect it to their account. However, the wallet used for the authentication message-signing will continue to be the in-browser Hedgehog wallet.

Orion v2

I think in Orion v2 it would make sense to connect/disconnect specific addresses to an account the way it's done in Audius instead of trying to tie Orion v2 account to a specific membership, as I suggested before.

This approach seems more generic, as memberships as well as other roles held by the user can then be derived from connected addresses instead. It would also allow us to more naturally handle controller account changes etc.

If we choose Message signing as an authentication method, I think it would also be wise to let the user choose which of the connected accounts they want to use to authenticate in Orion, instead of forcing the use of the not-so-secure in-browser wallet for this purpose.

Final conclusions

bedeho commented 1 year ago

Let me start off with a basic question which possibly may sound stupid: but why is there even a backend in this design? If one leaves out the clearly absurd password reset deviation, what is it actually doing? Couldn't one just have entropy derived from user inputs and call it a day? what benefit would be lost.

I have a hypothesis, but I think asking a basic question like this may help either invalidate that, or


I frankly found it hard to develop an integrated mental model of this wallet architecture, so tried to summarise it all visually based on your excellent detailed examination of the implementation, which helped me a bit. Including here, feel free to correct me if something is wrong here.

Screenshot 2023-03-28 at 00 23 10

I agree that the password reset thing is not advisable for us.


Do we actually need cipherText? Why can't we just use something like cipherKey as a wallet seed?

This would mean that if someone captures the database, with the clear text cipherIv, the only security left is in the password, which most of the

Can we avoid storing entropy in the local storage, as this is probably the weakest point of the scheme.

My understanding is that local storage is just to persist sessions across window sessions. If the entropy was just stored in browser memory, the user would have to re-enter it every time they open the window again. I think that benefit is important, but I would still say it's less than half of the benefit, so to speak.

How much security would that buy us in the end? Would XSS still not be an equally large risk in this case?


There seem to be a few contradictions in the Hedgehog security model, where some practices seem overly cautious, like storing Users and Authentications data in separate tables even though there is no obvious attack vector there.

This puts a lot of responsibility on the Atlas side to make sure there's no possibility of an XSS attack, which would require a lot of care with the management of the app codebase and its dependencies.

Yes, this is an issue, I've asked about this here, hard for me to judge this issue: https://github.com/Joystream/atlas/issues/3986

In general I think we should treat a Hedgehog-like wallet as a very insecure, temporary onboarding-only wallet, not suitable for storing any amount of JOY and/or NFTs / CRT that the user is going to care about losing.

I agree with that, very much. The treatment in Atlas in terms of password strength, warnings and even locking down features for signer only mode does give us some room to work with. Of course, you cannot prevent people from changing this, but then again, you can't prevent someone from just building this themselves either, or some 100% custody based solution.

bedeho commented 1 year ago

I guess people lose access to their funds if backend is lost, or alternatively then, backend can do extortion attack?

Lezek123 commented 1 year ago

Let me start off with a basic question which possibly may sound stupid: but why is there even a backend in this design? If one leaves out the clearly absurd password reset deviation, what is it actually doing? Couldn't one just have entropy derived from user inputs and call it a day? What benefit would be lost?

I think the main reason is that with no backend we can only work with username and password, while having a backend allows us to use some additional, randomly generated values (like cipherIv) and store them on the backend.

The problem of deriving just from username and password client-side is that we cannot limit an attacker (with server-side rate limiting for example) w.r.t. how many attempts per second they can do when trying to guess some valid credentials (they don't even have to target a specific user in this case) and re-creating the seed from them. The attacker can make guesses completely independently from any flow in the app, just by running a script on their machine. We could use a high-cost hash like scrypt to derive the seed to slow down this process, but the cost cannot be too high, as it has to be adjusted so that users with lower-spec hardware don't have to wait too long to sign in. On the other hand, the attacker can have a very specific ASIC-miner like hardware to do the brute-force and possibly calculate millions or billions such hashes per second.

However, with a backend that stores an additional random value for each user (like cipherIv), which is required to generate the seed, it would be completely infeasible to brute-force the seed without access to this value, as one would have to guess both the random value (there are 2¹²⁸ ≈ 3.402823669×10³⁸ possibilities for a 16-byte value) and the password. The server requires the actor requesting cipherIv to know valid user credentials and, most importantly, can enforce rate limiting on the endpoint that provides it.
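To make the orders of magnitude concrete, here is a rough back-of-the-envelope sketch. All of the rates and the size of the password space are assumed, illustrative numbers, not measurements; only the 2¹²⁸ figure for a 16-byte value is exact.

```typescript
// Rough arithmetic only: every rate below is an assumed, illustrative number.
const SECONDS_PER_YEAR = 365 * 24 * 60 * 60;

// Years needed to try every candidate at a given guess rate.
function yearsToExhaust(candidates: number, guessesPerSecond: number): number {
  return candidates / guessesPerSecond / SECONDS_PER_YEAR;
}

// Assumption: the user's password comes from a space of about 10^10 candidates.
const passwordSpace = 1e10;
// A 16-byte random value has 2^128 possible values.
const ivSpace = Math.pow(2, 128);

// Offline attacker who only needs the password (no server-side value),
// assumed to compute 10^6 high-cost hashes per second.
const offlineYears = yearsToExhaust(passwordSpace, 1e6);

// Same attacker, but the 16-byte random value has to be guessed as well.
const offlineWithIvYears = yearsToExhaust(passwordSpace * ivSpace, 1e6);

// Online attacker, limited by the server to an assumed 1 attempt per second.
const rateLimitedYears = yearsToExhaust(passwordSpace, 1);

console.log({ offlineYears, offlineWithIvYears, rateLimitedYears });
```

Under these assumptions the purely client-side derivation falls in a matter of hours, while both the rate-limited and the "guess the random value too" cases take centuries or far longer.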

I frankly found it hard to develop an integrated mental model of this wallet architecture, so tried to summarise it all visually based on your excellent detailed examination of the implementation, which helped me a bit. Including here, feel free to correct me if something is wrong here.

Looks correct to me

This would mean that if someone captures the database, with the clear text cipherIv, the only security left is in the password, which most of the

I think the sentence got cut off here.

My understanding is that local storage is just to persist sessions across window sessions. If the entropy was just stored in browser memory, the user would have to re-enter it every time they open the window again. I think that benefit is important, but I would still say it's less than half of the benefit, so to speak.

How much security would that buy us in the end? Would XSS still not be an equally large risk in this case?

I found this interesting article about how vulnerable different approaches of storing data in the browser would be to XSS: https://auth0.com/blog/secure-browser-storage-the-facts

Based on that I think using an isolated Web Worker thread to just store the seed (in memory) and sign transactions seems like a much safer alternative to local storage. However, the drawback of course is that the user would have to provide credentials every time they refresh the page, open a new window/tab etc. (which I believe would happen quite often) in order to access their wallet.

This is exactly the problem I was trying to address with the solution I'm going to describe in the next comment.

I guess people lose access to their funds if backend is lost, or alternatively then, backend can do extortion attack?

This is probably the only good thing about storing the seed in the local storage, since in this case it also serves as an "enforced" backup (at least until the user clears the storage), although I think it would be better to give the user an option to backup the seed themselves and store it somewhere safer, ideally in password-encrypted form, as a last resort in such case.

Lezek123 commented 1 year ago

Hedgehog-alternative solution

I was recently thinking about making a safer, alternative in-browser wallet implementation which would rely on HttpOnly cookie authentication and require communication with the server in order to decrypt the seed. In this solution the seed would be stored in local storage only in an encrypted form and be decrypted only inside an isolated Web Worker thread, for the purpose of signing transactions.

Considerations

There are a few things to note about this solution:

How it would work:

Here I'll be describing the approach where user data is stored together with encryption artifacts, not separately like in Hedgehog. I chose this approach because it's easier to reason about; however, the solution I'm describing can be adjusted to follow the User/Authentication data-separation rule as well, as it doesn't inherently rely on the assumption that the data is stored together.

Registration

  1. When user registers a new account and provides username and password, Atlas uses an isolated Web Worker thread to generate:

    • seed: a wallet seed (can be some safely random bytes)
    • cipherIv1, cipherIv2: securely random initialization vectors to use for aes-256-cbc or similar encryption function
    • cipherKey1: A key generated randomly. It will be stored server-side only.
    • cipherKey2: A key created deterministically from password or a combination of username and password, similar to cipherKey in Hedgehog, for example: scrypt(username + password, cipherIv2)
    • key1EncryptedSeed - a wallet seed encrypted using aes-256-cbc(cipherKey1, cipherIv1)
    • key2EncryptedSeed - key1EncryptedSeed additionally encrypted using aes-256-cbc(cipherKey2, cipherIv2)
    • passwordSalt - can be another securely random string
    • passwordHash - scrypt(password, passwordSalt)

    And send all the values except seed, key1EncryptedSeed, cipherKey2 and password to the server (those four values should never reach the server!)

  2. Upon receiving the registration request, the server saves all this data in the database (for example, in an account table).
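A minimal sketch of the registration step, assuming Node's built-in `crypto` module stands in for whatever primitives the Web Worker would actually use. The `RegistrationPayload` shape, the hex encoding, the key and hash lengths, and the function names are assumptions made for illustration. `cipherKey2` is derived with `cipherIv2` as the scrypt salt, which is what the Authentication step needs in order to decrypt using the password and `cipherIv2`.

```typescript
import { createCipheriv, randomBytes, scryptSync } from "node:crypto";

// AES-256-CBC helper: expects a 32-byte key and a 16-byte IV.
function encrypt(data: Buffer, key: Buffer, iv: Buffer): Buffer {
  const cipher = createCipheriv("aes-256-cbc", key, iv);
  return Buffer.concat([cipher.update(data), cipher.final()]);
}

// Hypothetical shape of the registration request body (hex-encoded values).
interface RegistrationPayload {
  username: string;
  cipherIv1: string;
  cipherIv2: string;
  cipherKey1: string;
  key2EncryptedSeed: string;
  passwordSalt: string;
  passwordHash: string;
}

function buildRegistration(username: string, password: string): RegistrationPayload {
  const seed = randomBytes(32); // wallet seed, stays on the client
  const cipherIv1 = randomBytes(16);
  const cipherIv2 = randomBytes(16);
  const cipherKey1 = randomBytes(32); // random key, stored server-side only
  // Password-derived key, stays on the client.
  const cipherKey2 = scryptSync(username + password, cipherIv2, 32);
  // Inner layer: seed encrypted with the server-held key (stays on the client).
  const key1EncryptedSeed = encrypt(seed, cipherKey1, cipherIv1);
  // Outer layer: inner ciphertext encrypted with the password-derived key.
  const key2EncryptedSeed = encrypt(key1EncryptedSeed, cipherKey2, cipherIv2);
  const passwordSalt = randomBytes(16);
  const passwordHash = scryptSync(password, passwordSalt, 64);

  // Only these values are sent; seed, key1EncryptedSeed, cipherKey2 and the
  // plain password are deliberately absent.
  return {
    username,
    cipherIv1: cipherIv1.toString("hex"),
    cipherIv2: cipherIv2.toString("hex"),
    cipherKey1: cipherKey1.toString("hex"),
    key2EncryptedSeed: key2EncryptedSeed.toString("hex"),
    passwordSalt: passwordSalt.toString("hex"),
    passwordHash: passwordHash.toString("hex"),
  };
}

const registration = buildRegistration("alice", "correct horse battery staple");
console.log(Object.keys(registration));
```

With everything in this payload the server can reach key1EncryptedSeed only by also learning the password, which it never receives in plain form.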

image

Authentication

  1. At any time user can authenticate using Atlas by providing username and password. Atlas then calculates passwordHash client-side, before sending it to the server to authenticate the user. Plain password should never be sent to the server as it would allow the server to decrypt the wallet seed, which we want to avoid. The password in plain form, however, should be temporarily stored in memory of an isolated Web Worker thread, which will have to use it for the first round of decryption as described below.
  2. The auth server then sets an HttpOnly session cookie upon successful auth. The cookie is then used to identify the user in the subsequent requests coming from the app. The benefit of HttpOnly cookies over local storage is that they are more XSS-resistant, i.e. the browser doesn't allow JavaScript code running on the page to access those cookies; they are fully managed by the browser itself. However, they are sent to the server (provided it lives under the same domain) with each request.
  3. The app, more specifically the Web Worker thread having access to the user's plain password, then requests cipherIv2 and key2EncryptedSeed belonging to the authenticated user from the server. Having both the password and cipherIv2 (potentially also username), this thread can then decrypt key2EncryptedSeed into key1EncryptedSeed. key1EncryptedSeed can actually be stored in local storage quite safely, as it is still encrypted with cipherKey1 and cipherIv1, which won't be stored client-side.
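Step 3 can be sketched as follows, again assuming Node's `crypto` module in place of the Web Worker's primitives. The function name `unwrapKey2Layer` and the fixture values are hypothetical; the random 48-byte buffer merely stands in for a real key1EncryptedSeed.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

function encrypt(data: Buffer, key: Buffer, iv: Buffer): Buffer {
  const cipher = createCipheriv("aes-256-cbc", key, iv);
  return Buffer.concat([cipher.update(data), cipher.final()]);
}

function decrypt(data: Buffer, key: Buffer, iv: Buffer): Buffer {
  const decipher = createDecipheriv("aes-256-cbc", key, iv);
  return Buffer.concat([decipher.update(data), decipher.final()]);
}

// Runs inside the worker after login: removes the password-derived layer and
// returns key1EncryptedSeed, which is still protected by the server-held key.
function unwrapKey2Layer(
  username: string,
  password: string,
  cipherIv2: Buffer,
  key2EncryptedSeed: Buffer
): Buffer {
  const cipherKey2 = scryptSync(username + password, cipherIv2, 32);
  return decrypt(key2EncryptedSeed, cipherKey2, cipherIv2);
}

// Fixture: what the server would return for this (made-up) user.
const iv2 = randomBytes(16);
const innerCiphertext = randomBytes(48); // placeholder for key1EncryptedSeed
const serverCopy = encrypt(innerCiphertext, scryptSync("alice" + "hunter2", iv2, 32), iv2);

const forLocalStorage = unwrapKey2Layer("alice", "hunter2", iv2, serverCopy);
console.log(forLocalStorage.equals(innerCiphertext)); // true
```

The returned buffer is what would be written to local storage; the plain password and cipherKey2 can be dropped from worker memory once this step is done.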

image

Tx signing

  1. Atlas asks a Web Worker thread to sign a transaction
  2. An isolated Web Worker thread requests cipherKey1 and cipherIv1 belonging to the authenticated user from the server and uses them to decrypt key1EncryptedSeed stored in the local storage. The decrypted seed shouldn't be stored anywhere except temporarily in Web Worker's memory (to sign the transaction), which should be pretty safe from XSS attacks (unless there's some serious vulnerability in the Web Worker itself, but it would be way easier to audit). Anytime a transaction needs to be signed, a new request for cipherKey1 and cipherIv1 can be made to the server.
  3. The Web Worker can now provide the signed transaction back to Atlas
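A sketch of the worker-side signing step under stated assumptions: Node's `crypto` module replaces the browser primitives, the server response is passed in as plain buffers, and the actual signature scheme is injected as a `Signer` callback. The HMAC-based `standInSigner` is only a placeholder so the example runs; it is not the signature scheme the chain uses.

```typescript
import { createCipheriv, createDecipheriv, createHmac, randomBytes } from "node:crypto";

// The real signing function is out of scope here, so it is injected.
type Signer = (seed: Buffer, payload: Buffer) => Buffer;

// Decrypts the locally stored key1EncryptedSeed with the values fetched from
// the server, signs, and then overwrites the decrypted copy of the seed.
function signInWorker(
  key1EncryptedSeed: Buffer,
  cipherKey1: Buffer,
  cipherIv1: Buffer,
  payload: Buffer,
  sign: Signer
): Buffer {
  const decipher = createDecipheriv("aes-256-cbc", cipherKey1, cipherIv1);
  const seed = Buffer.concat([decipher.update(key1EncryptedSeed), decipher.final()]);
  try {
    return sign(seed, payload);
  } finally {
    seed.fill(0); // best effort: zeroes this copy only
  }
}

// Placeholder signer (HMAC-SHA256), used purely to make the sketch runnable.
const standInSigner: Signer = (seed, payload) =>
  createHmac("sha256", seed).update(payload).digest();

// Fixture values standing in for local storage and the server response.
const demoSeed = randomBytes(32);
const demoKey1 = randomBytes(32);
const demoIv1 = randomBytes(16);
const demoCipher = createCipheriv("aes-256-cbc", demoKey1, demoIv1);
const storedCiphertext = Buffer.concat([demoCipher.update(demoSeed), demoCipher.final()]);

const demoPayload = Buffer.from("encoded_transaction");
const signature = signInWorker(storedCiphertext, demoKey1, demoIv1, demoPayload, standInSigner);
console.log(signature.toString("hex"));
```

Because cipherKey1 and cipherIv1 are requested per signature and never persisted, the main thread only ever sees the encoded transaction going in and the signature coming out.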

image

Summary

So to sum it up, in this approach:

bedeho commented 1 year ago

Let me just respond to this first in a separate comment

I think the main reason is that with no backend we can only work with username and password, while having a backend allows us to use some additional, randomly generated values (like cipherIv) and store them on the backend.

Yes, I think this story is correct. This does mean that there is a risk of loss of funds if data is lost, so I think you are right that users should really be encouraged to write this down at some point, and it has to be clearly explained to them what the various risks are at the appropriate time, so they don't believe that the server will always be able to help them.

bedeho commented 1 year ago

Purpose

The point of this reply is to share my perspective on the proposal, to allow future third parties to benefit from this perspective or description going forward, and lastly to make sure we are on the same page.

Preamble

Really fantastic to see your initiative here, there are many moving parts in the original standard, and it does make sense to explore if a trade off more suitable for our use case is available, while borrowing the core idea in the original proposal.

It took me a really long time to parse this idea, but this is very clever. I am really impressed by how creative you have been here, assembling just the right mix (seemingly) of tricks to not do any worse than Hedgehog, and possibly make an XSS attack substantially harder. I am nervous about the very well known asymmetry between how easy it is to suggest new security schemes and how hard it is to find their flaws. The ideal is of course to define a precise security model and then positively demonstrate that certain specific security properties hold. That is too much work for us now, but it's probably wise to have an auditor or outside consultant do that exercise for us. This may help us get more specific about exactly what we want, and what we are willing to give up.

Architecture

Again, it's hard both to see the big picture of how all these values are related and to make sure there are no misunderstandings, so while trying to wrap my head around this I had to make this diagram. Please let me know if it is incorrect.

user_accouns_alternative(1)

Core Idea

My understanding of the objective here is to solve the following problem:

In the Hedgehog standard, the way a user is able to continuously initiate transactions is by storing the clear text entropy in local storage. This means that both between sessions, and within a session, the user can sign transactions at will without having to re-enter any credentials. One implication of this is that an XSS attack will be able to read this data on the client side, thus gaining control over funds.

In this new standard, there is no such storage, instead you

a) use local storage to persist a high security cipher text of the seed. It must be stored client side during authentication, because it depends on the clear text password to compute.
b) use an HttpOnly cookie to persist the credential needed to obtain the key for the cipher text in a).
c) decrypt the cipher text in a) using confidential data obtained using the credential in b).

I am reasonably certain this is at least as secure as Hedgehog in every respect, and if there is a mistake it should be fixable. Server still

Questions

The Role of the Web Worker

I was not very familiar with webworkers or their security properties, so I read this article you kindly provided: https://auth0.com/blog/secure-browser-storage-the-facts/#Web-Workers-Help-Maintain-Secure-Browser-Storage

My understanding is that the security benefit of such a worker is that if the XSS exploit occurs after the worker has been started with the appropriate secret data, then compromised code cannot read this data inside the worker from that point on. There may also be a UX benefit of having compute intensive crypto operations not block the UI powered by the main thread.

XSS attack model

In a certain sense, an XSS attack executed through a compromised dependency seems to be a totally unconstrained attacker. For example, the simplest attack would be to just do some weird thing in the UI where you fool the user into thinking they were logged out or timed out or something, and prompt them to re-enter the username+password, at which point everything is over. No one would even notice this, they would just keep on using the app, and this could be done at scale, harvesting credentials of a large number of users over a long period of time, until one day it is exploited in a short period of time. Or, if the attacker can execute code before login occurs, you would be in the same situation. I'm not exactly sure what would determine when an attacker at the earliest can run their code if they actually have compromised a dependency; would that depend on nuances about how everything is packaged up as an app and delivered to the user perhaps?

Conclusion

All in all, I think this is quite a bit more complex than Hedgehog in terms of what must be built, but conceptually not so much more, and apart from a greater risk of making practical mistakes (famous last words), I did not see any actual security tradeoff compared to Hedgehog, but I also am not really an applied crypto person by any stretch.

I think we would be well served by double-checking the whole rainbow table attack consideration from Hedgehog, just to make sure we understand that they are either mistaken, or that we also need to accommodate this consideration.

Lezek123 commented 1 year ago

Do you actually need to encrypt the seed twice? The only thing securing it at the server level is the password, so why not just store a cipher based on password encryption, then on the client side the encryption with a separate key is done before dumping in local storage during authentication. Seems simpler, at the very least to understand.

To make sure I understand this correctly, let's define k1 as the randomly generated, server-stored key and k2 as the password-derived key. I'll skip the ivs for simplicity.

The way I understood your suggestion is that on the server-side we can store k1 and the k2-encrypted seed. Then during the authentication:

  1. Client receives k1 and k2-encrypted seed.
  2. Since k2 is derived from the password, which the user will provide during authentication, the client can decrypt the k2-encrypted seed
  3. The client can then re-encrypt the seed using k1, which it got from the server in response to a successful auth.
  4. The client can then store k1-encrypted seed like in my original proposal

I think that makes sense and perhaps it's easier to reason about indeed.
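The four steps of this simplified variant could look roughly like the sketch below, assuming Node's `crypto` module. The IVs and the scrypt salt that were skipped above for simplicity are passed in explicitly here; the name `rewrapSeed` and all fixture values are hypothetical. The seed is re-encrypted with the server-provided k1, so that what ends up in local storage is the k1-encrypted seed of step 4.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

function encrypt(data: Buffer, key: Buffer, iv: Buffer): Buffer {
  const cipher = createCipheriv("aes-256-cbc", key, iv);
  return Buffer.concat([cipher.update(data), cipher.final()]);
}

function decrypt(data: Buffer, key: Buffer, iv: Buffer): Buffer {
  const decipher = createDecipheriv("aes-256-cbc", key, iv);
  return Buffer.concat([decipher.update(data), decipher.final()]);
}

// Steps 2 to 4: decrypt with the password-derived k2, then re-encrypt with
// the server-provided k1 and return the value meant for local storage.
function rewrapSeed(
  password: string,
  k2Salt: Buffer,
  iv2: Buffer,
  k2EncryptedSeed: Buffer,
  k1: Buffer,
  iv1: Buffer
): Buffer {
  const k2 = scryptSync(password, k2Salt, 32);
  const seed = decrypt(k2EncryptedSeed, k2, iv2);
  try {
    return encrypt(seed, k1, iv1);
  } finally {
    seed.fill(0); // best effort: zeroes this copy only
  }
}

// Fixture: values the server would hold for a made-up user.
const originalSeed = randomBytes(32);
const salt = randomBytes(16);
const ivForK2 = randomBytes(16);
const ivForK1 = randomBytes(16);
const serverK1 = randomBytes(32);
const serverCiphertext = encrypt(originalSeed, scryptSync("hunter2", salt, 32), ivForK2);

// Step 1 is receiving serverK1 and serverCiphertext after a successful auth.
const k1EncryptedSeed = rewrapSeed("hunter2", salt, ivForK2, serverCiphertext, serverK1, ivForK1);
console.log(decrypt(k1EncryptedSeed, serverK1, ivForK1).equals(originalSeed)); // true
```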

Are we sure it is OK to reuse the initialization vector when building the second cipher text? I don't know.

In my original proposal I use 2 separate vectors (iv1 and iv2). Are you asking if it's ok to use one instead? To be honest I'm not very familiar with the details of the risks associated with reusing the same iv with AES-256-CBC provided that the keys are different, but from what I know this is not recommended and even if the impact on security is relatively low, I wouldn't risk it, as the cost of using separate ivs is so negligible.

In a certain sense, an XSS attack executed through a compromised dependency seems to be a totally unconstrained attacker. For example, the simplest attack would be to just do some weird thing in the UI where you fool the user into thinking they were logged out or timed out or something, and prompt them to re-enter the username+password, at which point everything is over. No one would even notice this (...)

Actually I think it's quite hard to pull off something like that in a convincing way and without anyone noticing. It all depends on how the authentication flow is normally handled in the app, but unless the attacker is able to mimic it 1:1 (or close enough to that), there's a high chance that an experienced user will notice that something seems off. Also realistically, if a popular dependency is compromised, adding code which does something like this would probably be spotted quite quickly by someone not even necessarily associated with Gleev in any way (ie. any other consumer of this dependency).

Another thing about compromised dependencies, which I think is essential, is that as long as it's a well-established/recognized dependency (and we should strive to only use dependencies like that), we would be just one project among thousands that the attacker could choose to target once they compromise the dependency. I assume in that case the attacker would choose an exploit that would work well across many different high-value targets and has a minimal chance of being discovered / spotted. Something like stealing data from local storage, all non-HttpOnly cookies and globally defined variables could probably work quite well across different frontend apps, for example. Although we can't rule out the possibility that someone would choose to target Gleev specifically in that case.

My understanding is that the security benefit of such a worker is that if the XSS exploit occurs after the worker has been started with the appropriate secret data, then compromised code cannot read this data inside the worker from that point on.

The way I understand it, if somewhere in the app there is an XSS vulnerability, the attacker can then use all sorts of "tricks" to gain more control over the app. Besides just executing some code right away, they can, for example, override some global / higher-scope functions like window.fetch or document.getElementById, which would allow them to execute malicious code whenever those functions are called. In that sense, even if there's a vulnerability in a very specific place in the app, like on a video view page, it can easily spread to other parts of the app as the user navigates through it. One benefit of calculating and storing seed only in Web Worker's memory instead of using the main thread, is that since it's a completely separate & isolated environment, it won't be affected by any such overrides, so the data we keep there is generally safer.

However, as you mentioned, the attacker can capture sensitive data, like the user's password for example, before it's even passed to a Web Worker, which is a valid concern. Another concern I have is that in the end the attacker can just redo all the steps normally executed from within the Web Worker, like making a request to the server to get k1 and then using it to decrypt the value in local storage, and it's difficult to prevent this.

So while thinking about this, another idea I came up with was to create a separate, minimalistic page, like secure.gleev.xyz.

This page would be the only page where the user would ever enter the password. We can also make sure this is communicated well to reduce the possibility of users being fooled if an attacker exploits an XSS on gleev.xyz to show a fake login modal, for example. The key here is that we can make secure.gleev.xyz a small-unit, isolated page w/ highest security standards (like a very restrictive Content-Security-Policy etc.), which will be much easier to maintain and audit and only have the following responsibilities:

  1. Handle login / registration.
  2. Perform client-side artifacts encryption/decryption.
  3. Sign transactions upon user confirmation.

The flow I imagine in that case would be the following:

  1. User logs in at secure.gleev.xyz. At this point the HTTP-only, SameSite: Strict session cookie is set on gleev.xyz domain and the user gets redirected to gleev.xyz.
  2. The session cookie allows gleev.xyz to perform authenticated requests to api.gleev.xyz, but neither the wallet seed nor the user's password ever enters the gleev.xyz app in any way. Those secret values are only ever processed by secure.gleev.xyz.
  3. In case gleev.xyz needs to send a transaction on behalf of the user, it must request a signature by opening a popup like: secure.gleev.xyz?request_signature={encoded_transaction}. In this popup window, the user needs to confirm their signature, just like they would normally do via a browser extension. The point of this is that if there's ever an XSS vulnerability on gleev.xyz which doesn't affect secure.gleev.xyz, the attacker wouldn't be able to issue transaction on behalf of the user without the user confirming it, making such attack impractical.
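Two small pieces of this flow can be sketched directly: the `Set-Cookie` value issued after login and the popup URL used to request a signature. The cookie name `session_id`, the `Max-Age` value, the `Path` and the `Secure` attribute are assumptions added for illustration; `HttpOnly`, `SameSite=Strict`, the gleev.xyz domain and the `request_signature` parameter come from the description above.

```typescript
// Hypothetical options for the session cookie.
interface SessionCookieOptions {
  domain: string;
  maxAgeSeconds: number;
}

// Builds the Set-Cookie header value the auth server would send after login.
function buildSessionCookie(sessionId: string, opts: SessionCookieOptions): string {
  return [
    `session_id=${encodeURIComponent(sessionId)}`,
    `Domain=${opts.domain}`,
    "Path=/",
    `Max-Age=${opts.maxAgeSeconds}`,
    "HttpOnly", // not readable from page JavaScript
    "Secure", // only sent over HTTPS
    "SameSite=Strict", // not sent on cross-site requests
  ].join("; ");
}

// Builds the URL gleev.xyz would open in a popup to ask for a signature.
function buildSignatureRequestUrl(encodedTransaction: string): string {
  return `https://secure.gleev.xyz/?request_signature=${encodeURIComponent(encodedTransaction)}`;
}

const cookieHeader = buildSessionCookie("abc123", { domain: "gleev.xyz", maxAgeSeconds: 3600 });
console.log(cookieHeader);
console.log(buildSignatureRequestUrl("0xabc"));
```

Since gleev.xyz, secure.gleev.xyz and api.gleev.xyz share the same registrable domain, a `SameSite=Strict` cookie scoped to gleev.xyz is still sent on requests between them.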

This is a basic high-level overview of how I imagine we could maintain a good separation between gleev.xyz, being a huge react app with very advanced functionality and much higher risk of XSS vulnerability, and the part of the app responsible for the most sensitive interactions, like authentication and transaction signing.

The main drawback of this approach I can think of is the implementation cost of it, which seems to be much higher than with the other approaches described. However, I think it provides some solid security improvements which in my opinion make it worth considering.

bedeho commented 1 year ago

I think that makes sense and perhaps it's easier to reason about indeed.

Excellent.

In my original proposal I use 2 separate vectors (iv1 and iv2).

I am referring to the fact that IV_2 is used to generate sk_2 and ct_2 , in diagram above. As you see, IV_1 is not used in this dual way, and I'm just wondering if there is any risk in that.

If there is no practical cost of introducing a new IV, perhaps just doing that would be safest approach?

Also realistically, if a popular dependency is compromised, adding code which does something like this would probably be spotted quite quickly by someone not even necessarily associated with Gleev in any way (ie. any other consumer of this dependency).

The compromise may just involve allowing execution of some other code which can be fetched on demand or something; it does not need to involve inlining all the details of an Atlas-specific attack.

I take your point though, a sufficiently alert person would start to ask questions.

So while thinking about this, another idea I came up with was to create a separate, minimalistic page, like

Just to make sure I understand the baseline idea here

  1. when secure.gleev.xyz?request_signature={encoded_transaction} is invoked from compromised main window, it is not possible for that code to send fake user input signals to this secure window, for example while hiding it in the background or something?
  2. as we talked about on the call today, if we wanted to allow certain transactions to be signed without a popup, the way that would work would be that the main window would know what set of transactions qualified for such no-popup treatment, and for those it would make a pure API call to secure.gleev.xyz?request_signature={encoded_transaction}, where on the server side of this call, the same integrity check is enforced by inspecting encoded_transaction, and a signature is returned if the check passes. In principle it could be configured by the operator what sort of policy the app+orion should be enforcing.
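The operator-configurable policy from point 2 could be as simple as an allowlist that both the main window and the server evaluate against the decoded transaction. This is only a sketch: the `DecodedTransaction` shape, the decision names and the example call names are hypothetical, and decoding `encoded_transaction` itself is left out.

```typescript
// Hypothetical decoded form of encoded_transaction.
interface DecodedTransaction {
  section: string;
  method: string;
}

type SigningDecision = "sign_silently" | "require_popup";

// Operator-configured list of calls that may be signed without a popup.
// The entries below are illustrative only.
const noPopupAllowlist: ReadonlyArray<string> = ["content.createVideo", "content.updateVideo"];

// The same check would run in the main window (to decide whether to open the
// popup) and again on the server side before any signature is returned.
function decideSigning(tx: DecodedTransaction, allowlist: ReadonlyArray<string>): SigningDecision {
  const call = `${tx.section}.${tx.method}`;
  return allowlist.indexOf(call) !== -1 ? "sign_silently" : "require_popup";
}

console.log(decideSigning({ section: "content", method: "createVideo" }, noPopupAllowlist));
console.log(decideSigning({ section: "balances", method: "transfer" }, noPopupAllowlist));
```

Defaulting to `require_popup` for anything not explicitly listed keeps a compromised main window from silently signing calls the operator never approved.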

My view on this is

  1. I do indeed believe this would work.
  2. It is substantially more complex to implement.
  3. Going from the proposal you made just before this new one to this new one would not seem to generate a lot of dead weight work compared to just going for this right away, so in essence an incremental approach seems cheap.
  4. If someone starts to have value at risk, they will be encouraged to install a signer, and in principle one could have a hybrid approach with signer+non-signer keys to achieve the same result as this secure popup scheme. I don't know if the UX would be worse than this; it may even be better if it perhaps allows sidestepping various popup blocking software that may exist in browsers, antivirus software etc.
  5. It's hard to anticipate how the product will be to use when it starts to become more complex to understand when something will pop up and not, explaining that, and having the user remember and deal with it, may have varying degrees of success. It may just be easier to deal with making a clean transition from never pop-up to always pop-up. It's hard to say, perhaps more prototypes of UX could shed some light, but that would require quite a bit more work to achieve.
  6. It's hard to rank, in the overall set of risks in this scheme, how high exactly to rank the difference in XSS deterrence between these two proposals. I could see a totally different set of problems being far more important than this in practice for example, like operator rugging.

For these reasons, I think it's probably best to just stick with your original proposal above. Yes, we don't know exactly how to quantify how capable an XSS attacker will be, but you have already done a good job with what you proposed as a starting point, and I think we are on net best served just moving forward with this now.