stacksgov / sips

Community-submitted Stacks Improvement Proposals (SIPs)
132 stars 80 forks source link

SIP-023: Emergency Fix to Trait Invocation Behavior #131

Closed kantai closed 1 year ago

kantai commented 1 year ago

This SIP proposes a consensus-change to fix the trait invocation bug in Epoch 2.2

wileyj commented 1 year ago

I think it would help to add some of the workarounds attempted (briefly), and how this may or may not affect the second hard fork accepted in SIP-022

tycho1212 commented 1 year ago

Everyone who wants Stacks to work & exist should support this SIP

muneeb-ali commented 1 year ago

Thanks for putting this together quickly! In my mind this is more of a bug fix to SIP-022 than a new SIP but I understand that how the current SIP process is written this needs to be a separate SIP.

Given that the SIP-022 bug is impacting applications like ALEX, Gamma, Arkadiko and others, reducing the overall time it takes for activation is very helpful.

Separately, once the upcoming SIPs and upgrades are done in coming weeks I might propose that we start looking for a different process for quick bug fixes vs full SIPs (that are changing/adding functionality). We don't need to discuss that here, just giving a heads up 🙏

Jamil commented 1 year ago

+1 on this — it seemed like wrapping the contract calls did not work as a workaround; even though the transactions were accepted by the mempool, there was a runtime error with (presumably) the same root cause: https://explorer.hiro.so/txid/0x92c0ee32f5719c157ccb60d28e1b5426877a07bbd305910eca3c6d75c946c0f2?chain=mainnet

thanks @obycode for the rapid response to this, many applications (including ours, Gamma) are not functional until activation & implementation of this SIP. This is one of those existential moments for a blockchain!

Full support behind it.

Hero-Gamer commented 1 year ago

For information completeness anyone reading this SIP for the first time, I'd like to recommend inclusion of reference to the draft Catastrophic SIP guideline, similar to what SIP-022 doc referenced. And it gives everybody chance to read the Cata SIP to understand why it is used again this time.

ChienteH commented 1 year ago

+1. This issue is time-sensitive. A prolonged halt could severely impact ALEX. We urgently need a fix for the most critical contracts, such as swaps. For ALEX, it's crucial to deploy a hotfix asap to avoid significant economic damage.

Full support from ALEX team.

Hero-Gamer commented 1 year ago

Just a reminder: if you are a CAB member reading (Technical CAB & Governance CAB), as per current SIP-000 guideline this needs CABs approval as it's consensus breaking, please go vote in your respective CAB discord DM group chat. Thank you!

wileyj commented 1 year ago

For information completeness anyone reading this SIP for the first time, I'd like to recommend inclusion of reference to the draft Catastrophic SIP guideline, similar to what SIP-022 doc referenced. And it gives everybody chance to read the Cata SIP to understand why it is used again this time.

https://github.com/stacksgov/sips/pull/131/files/c54d76709c4bbe9329d69109af10cc8386f584cb#r1181869902

aulneau commented 1 year ago

+1, shippit

cargocultscience commented 1 year ago

Thank you all for your work on this. Yes we should likely move forward with this fix. I would also humbly request that core-dev do a full feature freeze and spend however long (probably months) getting the testing coverage where it needs to be. These situations are bad, but they do serve as a guide post for where we are currently vs where we need to be from a testing perspective.

Hero-Gamer commented 1 year ago

https://github.com/stacksgov/sips/blob/main/considerations/economics.md This SIP might require Economics CAB review as well? I double checked their scope, seems very fitting.

What do you think @kantai and everyone else?

image
Jamil commented 1 year ago

@Hero-Gamer imo from looking at that point, that seems to be more relevant to discussions where there are changes in the way the protocol works — there is no change to the fundamental economics of the protocol here, just a fix to go to expected, previously-accepted behavior

Hero-Gamer commented 1 year ago

@Hero-Gamer imo from looking at that point, that seems to be more relevant to discussions where there are changes in the way the protocol works — there is no change to the fundamental economics of the protocol here, just a fix to go to expected, previously-accepted behavior

Good point.

wileyj commented 1 year ago

Thank you all for your work on this. Yes we should likely move forward with this fix. I would also humbly request that core-dev do a full feature freeze and spend however long (probably months) getting the testing coverage where it needs to be. These situations are bad, but they do serve as a guide post for where we are currently vs where we need to be from a testing perspective.

100% - we'll be addressing this soon afterwards and will communicate with the community about this

xyzerobtc commented 1 year ago

We fully support the fix at ZeroAuthority DAO and it's crucial to deploy a fix asap to avoid significant damage.

wileyj commented 1 year ago

For information completeness anyone reading this SIP for the first time, I'd like to recommend inclusion of reference to the draft Catastrophic SIP guideline, similar to what SIP-022 doc referenced. And it gives everybody chance to read the Cata SIP to understand why it is used again this time.

I think it's sufficient to link back to the SIP where this bug was introduced, which used that draft SIP as justification

Hero-Gamer commented 1 year ago

@Hero-Gamer imo from looking at that point, that seems to be more relevant to discussions where there are changes in the way the protocol works — there is no change to the fundamental economics of the protocol here, just a fix to go to expected, previously-accepted behavior

Just had a quick long conversation in Econ CAB's chat regarding the relevance of Econ CAB in this matter.

Some questions from chairperson @mattyTokenomics , then replies from Tech CAB to answer his questions. He might be jumping onto a long flight so he may not be able to reply.

For urgency I just wanna note that so far, he thinks "seems like not relevant to economics CAB" based on the back & forth.

I will let him reply officially once he has access to Github.

MarvinJanssen commented 1 year ago

For a workaround, did anyone try to deploy a contract with hardcoded principals (static dispatch) instead of trait references?

wileyj commented 1 year ago

For a workaround, did anyone try to deploy a contract with hardcoded principals (static dispatch) instead of trait references?

I think some workarounds were attempted first; i've asked the SIP authors to add to the related work section as defined in sip-000

jcnelson commented 1 year ago

Github continues to be a dumpster fire when it comes to doing anything non-trivial with git, including allowing me to open a PR against hirosystems/sips because jcnelson/sips already exists. So in lieu of doing that, I'm just going to dump my patch to this PR here which includes my edits. Please apply it.

diff --git a/sips/sip-023/sip-023-emergency-fix-traits.md b/sips/sip-023/sip-023-emergency-fix-traits.md
index 303c17e..8a783f6 100644
--- a/sips/sip-023/sip-023-emergency-fix-traits.md
+++ b/sips/sip-023/sip-023-emergency-fix-traits.md
@@ -25,23 +25,29 @@ Discussions-To: https://github.com/stacksgov/sips

 # Abstract

-On 1 May 2023, it was discovered that pre-2.1 contracts exposing public methods with
+On 1 May 2023, it was discovered that smart contracts deployed prior to Stacks 2.1
+that exposed public methods with
 trait arguments could not be invoked with previously working trait-implementing
 contract arguments.

 This bug was caused by the activation of Stacks Epoch 2.2 (https://github.com/stacksgov/sips/blob/main/sips/sip-022/sip-022-emergency-pox-fix.md).

 This SIP proposes an **immediate consensus-breaking change** to
-introduce a new Stacks Epoch 2.3 that corrects this regression.
+introduce a new Stacks epoch 2.3 that corrects this regression.

 **This SIP proposes a Bitcoin activation height of 788,287**

-# Epoch 2.2 Bug Behavior
+# Introduction

-Stacks Epoch 2.1 introduced a new type checker and type system which
+Clarity 2, introduced in Stacks 2.1, includes a new type checker and type system which
 impacts trait invocations. In order for existing contracts to remain
-compatible, their types must be _canonicalized_. The canonicalization
-method performed an exact check for the current epoch:
+compatible, their types must be _canonicalized_. In the context of traits,
+the type canonicalization rules implement the new trait semantics introduced in
+[SIP-015](./sips/sip-015/sip-015-network-upgrade.md).
+
+## Epoch 2.2 Bug Behavior
+
+The type canonicalization method performed an exact check for the current epoch:

```rust
     pub fn canonicalize(&self, epoch: &StacksEpochId) -> TypeSignature {
@@ -52,19 +58,44 @@ method performed an exact check for the current epoch:
     }

-Therefore, a pre-2.1 function with trait arguments that is invoked in Epoch 2.2 +Therefore, a pre-2.1 function with trait arguments that is invoked in Stacks 2.2 will fail to canonicalize its trait arguments, and abort with a -runtime error. Specifically, if a miner includes a contract call transaction in a block, it will be rejected by the mempool -with a type error, and a read-only call will fail with a runtime error. +runtime analysis error. Specifically: + + If a miner includes a contract call transaction with trait arguments in a block, the transaction will abort with a runtime error. + + If a user submits a contract call transaction with trait arguments to the

-The Epoch 2.3 hard fork will do the following: +This hard fork will do the following: + +* In epoch 2.2, the current buggy behavior will be preserved. All

-* Update the canonicalize() method to match on all epochs, setting

muneeb-ali commented 1 year ago

Thank you all for your work on this. Yes we should likely move forward with this fix. I would also humbly request that core-dev do a full feature freeze and spend however long (probably months) getting the testing coverage where it needs to be. These situations are bad, but they do serve as a guide post for where we are currently vs where we need to be from a testing perspective.

100% - we'll be addressing this soon afterwards and will communicate with the community about this

+1 to this. I think once the PoX-3 upgrade goes through, then the devs and community can review the test coverage to improve it and also brainstorm potential optimizations to testing, getting more test engineers involved etc. Thanks everyone for the hard work!

wileyj commented 1 year ago

Please apply my diff and then I'll accept.

looking now

wileyj commented 1 year ago

Please apply my diff and then I'll accept.

applied - @kantai @obycode there is some text in here around the network version

jcnelson commented 1 year ago

LGTM now. But @kantai @obycode please review the diff as well before signing off!

whoabuddy commented 1 year ago

The governance CAB approves SIP-023 with all members in favor. (ref: https://github.com/stacksgov/sips/pull/132)

philiphacks commented 1 year ago

Approach looks good to me

Can we activate at an earlier block height? 788,287 is still about 360 blocks away (at the time of writing). That is approx 2.5 days. As mentioned by others in this thread, most ecosystem apps are down. As an example for Arkadiko, people have open loans which they can't keep healthy (if STX price drops, they become undercollateralised and can get liquidated). They cannot access their funds. A fix needs to happen ideally today.

kantai commented 1 year ago

LGTM now. But @kantai @obycode please review the diff as well before signing off!

This diff looks good to me -- I can't approve the PR, because I'm the PR author.

obycode commented 1 year ago

For a workaround, did anyone try to deploy a contract with hardcoded principals (static dispatch) instead of trait references?

A workaround with hard-coded principals does indeed work. Workarounds wrapping the traits with a new contract in 2.2 do not work.

jiga commented 1 year ago

SIP Editor Update 6/30/2023: RICE Details: Reach - 500,000, Impact - 3x, Confidence - 95%, Effort - 3 person-months. Final RICE Score: 475,000

Status: Approved

Full details on this SIP can be seen in detail on the SIP impact assessment sheet where RICE scoring was conducted and tracked. Additional comments also available here: https://docs.google.com/spreadsheets/d/1DsdUkZ99c6m7U57m7RnzSd0nni3vGdMIT80k-_zQUb0/edit?usp=sharing