We currently believe the actual compat problem with https://github.com/Agoric/agoric-sdk/issues/8826 , or at least the problem that we actually need to fix, occurs when old contracts that were bundled with an endo preceding #1712 is used with a new liveslots, bundled with an endo including #1712 . The problem is that the old endo is producing old guards using the old klass format.
Fortunately, #1745 anticipated part of the compat issue that #1712 caused, by introducing get*GuardPayload functions, to encapsulate the actual guard representation so that other code using these guards could mostly consume them through these functions. It anticipated that we would switch from old to new guard formats, but not that they would need to co-exist across a version slippage boundary, like liveslots vs contracts.
This PR leverages the abstraction provided by #1745 to provide this version slippage tolerance only by modifying these getter functions to be tolerant of both formats on input, and to produce the corresponding modern payload as a result.
The nested guard adaptation provided by this PR is limited, for two reasons
I think it is adequate to solve the actual problem
it would make the non-legacy success case slower, and this is the case that must not get significantly slower.
Security Considerations
Technically, tolerating the old LegacyAwaitArgGuardShape is an exploitable bug, in that a record that matches this shape is also a valid parameter pattern that should allow an argument that matches that pattern, i.e., a copyRecord argument that at least contains a klass: 'awaitArgGuard' property. However, to be exploitable, the victim code would have had to accidentally write such a copyRecord pattern with a klass property and use it in as an arg guard. The chances of this happening accidentally are vanishingly small. Nevertheless, once we no longer need to support code preceding #1712 , we should remove all this extra complexity, and thus plug this in-theory exploitable bug.
Unlike LegacyAwaitArgGuardShape, tolerating the other legacy guard shapes does not seem like a currently exploitable bug, because there is
not currently any context where either a methodGuard or a copyRecord would both be meaningful, or
not currently any context where either an interfaceGuard or a copyRecord would both be meaningful.
Scaling Considerations
I tried to code this so that the non-legacy success case was the fast path, and paid minimal cost for being adaptable to this legacy.
Documentation Considerations
Other than fixing a surprising vexing problem, I don't think developers need to be aware of this change at all. Since the current https://github.com/Agoric/agoric-sdk/issues/8826 problem is surprising, this PR relieves us of the need to explain that surprise.
Testing Considerations
This PR is staged on #2039 so that the changes this PR makes to test-legacy-guard-tolerance.js demonstrate the relevant differences in behavior caused by this PR.
Compatibility Considerations
The whole point of this PR is for new code that consumes guards via the get*GuardPayload functions to become compatible with both old and new code that produces what were valid guards at that time. In particular, so that old contracts that bundle old endo preceding #1712 can be linked with new liveslots bundling an endo that includes #1712 .
This PR does not attempt to tolerate version slippage in the other direction. That would be hard because by assumption it would not be able to change the behavior of guard consuming code.
Upgrade Considerations
The problem of https://github.com/Agoric/agoric-sdk/issues/8826 manifested during an upgrade. This PR attempts to fix the problem that we think we actually need to fix to enable such upgrades. However, this PR and its predecessor #2039 only contain unit tests for this tolerance. We won't know if it fixes https://github.com/Agoric/agoric-sdk/issues/8826 until new relevant integration tests which are separately in progress (attn @mhofman ).
At such a time that we decide we no longer need to support code preceding https://github.com/endojs/endo/pull/1712 or guard data generated by that code, all the adaptation complexity in getGuardPayloads.js should be deleted. It can stay a separate file, but a much simpler one.
[ ] Includes *BREAKING*: in the commit message with migration instructions for any breaking change.
Staged on https://github.com/endojs/endo/pull/2039
closes: https://github.com/Agoric/agoric-sdk/issues/8826 refs: #2039 #1712 https://github.com/endojs/endo/pull/1745
Description
We currently believe the actual compat problem with https://github.com/Agoric/agoric-sdk/issues/8826 , or at least the problem that we actually need to fix, occurs when old contracts that were bundled with an endo preceding #1712 is used with a new liveslots, bundled with an endo including #1712 . The problem is that the old endo is producing old guards using the old
klass
format.Fortunately, #1745 anticipated part of the compat issue that #1712 caused, by introducing
get*GuardPayload
functions, to encapsulate the actual guard representation so that other code using these guards could mostly consume them through these functions. It anticipated that we would switch from old to new guard formats, but not that they would need to co-exist across a version slippage boundary, like liveslots vs contracts.This PR leverages the abstraction provided by #1745 to provide this version slippage tolerance only by modifying these getter functions to be tolerant of both formats on input, and to produce the corresponding modern payload as a result.
The nested guard adaptation provided by this PR is limited, for two reasons
Security Considerations
Technically, tolerating the old
LegacyAwaitArgGuardShape
is an exploitable bug, in that a record that matches this shape is also a valid parameter pattern that should allow an argument that matches that pattern, i.e., a copyRecord argument that at least contains aklass: 'awaitArgGuard'
property. However, to be exploitable, the victim code would have had to accidentally write such a copyRecord pattern with aklass
property and use it in as an arg guard. The chances of this happening accidentally are vanishingly small. Nevertheless, once we no longer need to support code preceding #1712 , we should remove all this extra complexity, and thus plug this in-theory exploitable bug.Unlike
LegacyAwaitArgGuardShape
, tolerating the other legacy guard shapes does not seem like a currently exploitable bug, because there isScaling Considerations
I tried to code this so that the non-legacy success case was the fast path, and paid minimal cost for being adaptable to this legacy.
Documentation Considerations
Other than fixing a surprising vexing problem, I don't think developers need to be aware of this change at all. Since the current https://github.com/Agoric/agoric-sdk/issues/8826 problem is surprising, this PR relieves us of the need to explain that surprise.
Testing Considerations
This PR is staged on #2039 so that the changes this PR makes to
test-legacy-guard-tolerance.js
demonstrate the relevant differences in behavior caused by this PR.Compatibility Considerations
The whole point of this PR is for new code that consumes guards via the
get*GuardPayload
functions to become compatible with both old and new code that produces what were valid guards at that time. In particular, so that old contracts that bundle old endo preceding #1712 can be linked with new liveslots bundling an endo that includes #1712 .This PR does not attempt to tolerate version slippage in the other direction. That would be hard because by assumption it would not be able to change the behavior of guard consuming code.
Upgrade Considerations
The problem of https://github.com/Agoric/agoric-sdk/issues/8826 manifested during an upgrade. This PR attempts to fix the problem that we think we actually need to fix to enable such upgrades. However, this PR and its predecessor #2039 only contain unit tests for this tolerance. We won't know if it fixes https://github.com/Agoric/agoric-sdk/issues/8826 until new relevant integration tests which are separately in progress (attn @mhofman ).
@mhofman has since done this integration testing and verified that this PR does fix https://github.com/Agoric/agoric-sdk/issues/8826. The test upgrade succeeded.
At such a time that we decide we no longer need to support code preceding https://github.com/endojs/endo/pull/1712 or guard data generated by that code, all the adaptation complexity in
getGuardPayloads.js
should be deleted. It can stay a separate file, but a much simpler one.*BREAKING*:
in the commit message with migration instructions for any breaking change.NEWS.md
for user-facing changes.