tkhq / qos

QuorumOS is a computation layer for running applications inside Trusted Execution Environments (TEEs)
GNU Affero General Public License v3.0
5 stars 2 forks source link

ENG-1627: Upgrading `borsh` Dependency to `V1.0` #458

Closed besler613 closed 5 months ago

besler613 commented 5 months ago

Summary & Motivation (Problem vs. Solution)

This PR upgrades the borsh dependency from V0.1 to V1.0 in response to the security vulnerability identified by the Github Renovate bot in PR 2450 of the mono repo. V1.0 involved several breaking changes, primarily related to the vector serialization of the BorshSerializer and BorshDeserializer, so this PR also includes code changes which accommodate those breaking changes.

Reference to Linear Issue (ENG-XXX)

ENG-1627

How I Tested These Changes

Pre merge check list

N/A - CHANGELOG.md does not require update at this time because we are not consuming qos in a version-controlled way.

linear[bot] commented 5 months ago
ENG-1678 Update borsh to v1

This is a security related upgrade because there's ambiguity when dealing with zero-sized types: more context on the PR: [https://github.com/tkhq/mono/pull/2450](https://github.com/tkhq/mono/pull/2450) This has been blocked for some time since we were on rust 1.63. But recently we've bumped Rust to 1.76 so we can finally do this! Work to do here: bump borsh (should be done already, renovate takes care of this β€” if not you can click "rebase" to retry). Then make CI happy by adding commits to the branch if necessary (maybe the linter or formatter will complain, maybe the generated code will change as a result of the bump, etc). Once that's done, get a review and merge πŸš€

cr-tk commented 5 months ago

Some initial observations:

As expected, this version bump causes a lot of other version bumps in transitive dependencies, including minor-version and sometimes major-version increases in the lockfile. I don't think this can be avoided and is part of working on the tech debt of not updating dependencies in a long time due to our MSRV lock, but since this is QOS code that will run in all the enclaves, we should at least make some effort to figure out what we're pulling here.

@besler613 for more context - reviewing the effective code-level diffs of dependency upgrades of the code we effectively trust with our (enclave) secrets is one of those costly things that the software industry does far too little, and we are also not really familiar with yet, but should do more if we can.

Looking at the Cargo.toml changes, I'm also wondering about the potential use of default-features = false for borsh. However, the current default feature seems to be only std, so no potential improvements there (unless we want to be extra explicit everywhere and avoid potential future additional default features).

besler613 commented 5 months ago

Some initial observations:

As expected, this version bump causes a lot of other version bumps in transitive dependencies, including minor-version and sometimes major-version increases in the lockfile. I don't think this can be avoided and is part of working on the tech debt of not updating dependencies in a long time due to our MSRV lock, but since this is QOS code that will run in all the enclaves, we should at least make some effort to figure out what we're pulling here.

@besler613 for more context - reviewing the effective code-level diffs of dependency upgrades of the code we effectively trust with our (enclave) secrets is one of those costly things that the software industry does far too little, and we are also not really familiar with yet, but should do more if we can.

Looking at the Cargo.toml changes, I'm also wondering about the potential use of default-features = false for borsh. However, the current default feature seems to be only std, so no potential improvements there (unless we want to be extra explicit everywhere and avoid potential future additional default features).

@cr-tk Thank you for these observations! AS for the question of default-features = false vs features = ["derive"], I must admit that I'm completely new to rust and did not give any thought to the functional differences between these dependency specifications. However, as a general rule, it seems like the most restrictive dependency specification (which still provides the functionality we require) is the one we should go for, and it sounds like the default-features = false option might fit that bill. If you'd like, I could experiment with using default-features = false instead of features = ["derive"]?

cr-tk commented 5 months ago

@besler613 as far as I can see, the decision would be between

borsh = { version = "1.0", features = ["derive"] }

Which pulls in the default feature (std, as defined upstream) since we haven't told it not to, plus the explicitly picked additional feature derive.

And

borsh = { version = "1.0", features = ["std", "derive"] , default-features = false}

Where we tell it to explicitly pick the necessary + desired ones and don't do anything else that's normally the default.

I'm also no Rust wizard, but expect the latter version to fail more explicitly (or continue working without pulling in additional functionality) if upstream changes the default features on us as part of some version upgrade. This comes with additional maintenance and spelled out complexity though, so I think we're usually not doing it everywhere, but only in places where the default features already have something we want to avoid or that otherwise doesn't fit us. Since borsh currently has usable defaults, that would mean going with the former version without overrides.

Perhaps @r-n-o or @emostov can confirm this as the more experienced Rust developers :slightly_smiling_face:

linear[bot] commented 5 months ago
ENG-1627 Invalid stamp results in `public key could not be found in organization` error

Valid request: ``` curl -X POST -d'{"organizationId": "ebe86d74-c89d-4c34-943e-f31e5a257654"}' -H'X-Stamp: eyJwdWJsaWNLZXkiOiIwMjhjNjA4MzM1ODJjOGNkNTAwNjRkODI4ZDY1YzJhNzgzYjNhN2M5NGM2ZGZjYmFkN2U2OGUwNzhlYzY5YTY1ODQiLCJzaWduYXR1cmUiOiIzMDQ1MDIyMDBkODFhOWVhZmIyMDY1NTQzOTZlODllOGVlYWQyNmU1M2EyNTIyZjk1ZDNlNzhjYmViN2EwMjUwMTVmOWFlYmIwMjIxMDBiYjdhZDAxY2QzZjcwNGMzZDI1NWEzNDM2OTAxMTM4YjIwZjQwZDRkNTY2ZGNmOTc4YTliYzUxMzAwNTZlMTAxIiwic2NoZW1lIjoiU0lHTkFUVVJFX1NDSEVNRV9US19BUElfUDI1NiJ9' -v 'https://api.turnkey.com/public/v1/query/whoami' ``` Now, modify the payload to invalidate the stamp: ``` curl -X POST -d'{"organizationId": "ebe86d74-c89d-4c34-943e-f31e5a257654", "foo": "bar"}' -H'X-Stamp: eyJwdWJsaWNLZXkiOiIwMjhjNjA4MzM1ODJjOGNkNTAwNjRkODI4ZDY1YzJhNzgzYjNhN2M5NGM2ZGZjYmFkN2U2OGUwNzhlYzY5YTY1ODQiLCJzaWduYXR1cmUiOiIzMDQ1MDIyMDBkODFhOWVhZmIyMDY1NTQzOTZlODllOGVlYWQyNmU1M2EyNTIyZjk1ZDNlNzhjYmViN2EwMjUwMTVmOWFlYmIwMjIxMDBiYjdhZDAxY2QzZjcwNGMzZDI1NWEzNDM2OTAxMTM4YjIwZjQwZDRkNTY2ZGNmOTc4YTliYzUxMzAwNTZlMTAxIiwic2NoZW1lIjoiU0lHTkFUVVJFX1NDSEVNRV9US19BUElfUDI1NiJ9' -v 'https://api.turnkey.com/public/v1/query/whoami' ``` …and you get the **very** misleading error: ``` public key could not be found in organization or its parent organization organizationId=ebe86d74-c89d-4c34-943e-f31e5a257654 publicKey=028c60833582c8cd50064d828d65c2a783b3a7c94c6dfcbad7e68e078ec69a6584 ``` moe, andrew and I banged our heads against this for hours. This should error out with something closer to "invalid signature"! Putting this in "ongoing customer request" because we are customers! (and more seriously, this is very likely going to cause customer confusion)

besler613 commented 5 months ago

@besler613 as far as I can see, the decision would be between

borsh = { version = "1.0", features = ["derive"] }

Which pulls in the default feature (std, as defined upstream) since we haven't told it not to, plus the explicitly picked additional feature derive.

And

borsh = { version = "1.0", features = ["std", "derive"] , default-features = false}

Where we tell it to explicitly pick the necessary + desired ones and don't do anything else that's normally the default.

I'm also no Rust wizard, but expect the latter version to fail more explicitly (or continue working without pulling in additional functionality) if upstream changes the default features on us as part of some version upgrade. This comes with additional maintenance and spelled out complexity though, so I think we're usually not doing it everywhere, but only in places where the default features already have something we want to avoid or that otherwise doesn't fit us. Since borsh currently has usable defaults, that would mean going with the former version without overrides.

Perhaps @r-n-o or @emostov can confirm this as the more experienced Rust developers πŸ™‚

As mentioned here, I've replaced all borsh dependency specifications with the more restrictive form suggested by @cr-tk.