penumbra-zone / penumbra

Penumbra is a fully private proof-of-stake network and decentralized exchange for the Cosmos ecosystem.
https://penumbra.zone
Apache License 2.0

WitnessAndBuild RPC fails with high memory usage #2977

Closed conorsch closed 1 year ago

conorsch commented 1 year ago

Describe the bug

The WitnessAndBuild RPC is failing in interchaintest. After a penumbra.view.v1alpha1.ViewProtocolService.WitnessAndBuild request is submitted, pclientd's memory consumption balloons until the process is OOM-killed:

[Image: pclientd-witness-memleak]

Let's figure out why that failure occurs.

To Reproduce

Steps to reproduce the behavior:

  1. Clone https://github.com/strangelove-ventures/interchaintest/
  2. Check out andrew/penumbra_pclientd
  3. Run go test -v examples/penumbra/penumbra_chain_test.go
  4. See error

Expected behavior

The witness and build RPC returns successfully.

conorsch commented 1 year ago

Debugging

Opened a draft PR in Penumbra for a new container: https://github.com/penumbra-zone/penumbra/pull/2974. The container is built manually (locally on my machine) and pushed to the GHCR container registry as ghcr.io/penumbra-zone/penumbra:debug-witness-and-build. This is necessary because the interchaintest setup requires a Strangelove base container.

Results of testing

The draft PR contains liberal debugging statements to isolate where the logic is failing. We now know that the failure occurs while building the TransactionPlan, prior to calling .authorize() on the txp. Example logs from the failure:

        Sep 07 00:05:16.661 DEBUG witness_and_build{request=Request { metadata: MetadataMap { headers: {"content-type": "application/grpc", "user-agent": "grpc-go/1.57.0", "te": "trailers"} }, message: WitnessAndBuildRequest { transaction_plan: Some(TransactionPlan { actions: [ActionPlan { action: Some(Output(OutputPlan { value: Some(Value { amount: Some(Amount { lo: 1000, hi: 0 }), asset_id: Some(AssetId { inner: [41, 234, 156, 47, 51, 113, 246, 164, 135, 231, 233, 92, 3
6, 112, 65, 244, 163, 86, 249, 131, 235, 6, 78, 93, 43, 59, 207, 50, 44, 169, 106, 16], alt_bech32m: "", alt_base_denom: "" }) }), dest_address: Some(Address { inner: [221, 146, 87, 221, 50, 124, 201, 32, 147, 124, 132, 194, 214, 54, 6, 196, 171, 161, 40, 232, 53, 226, 191, 188, 158, 69, 169, 59, 43, 132, 69, 58, 126, 162, 143, 82, 89, 233, 121, 250, 62, 7, 60, 91, 142, 126, 133, 198, 70, 251, 196, 148, 25, 8, 18, 242, 210, 83, 206, 131, 255, 59, 50, 110, 103, 229, 84, 151
, 90, 226, 165, 100, 105, 240, 87, 229, 0, 133, 169, 64], alt_bech32m: "" }), rseed: b"6\xda\xe6\xfb*\x1e\x1b^\xa3L\xdf\xb4\xde+\x99\xbd?L\x99\xcc\xcb\xac\xa5Q\x8362\x8fY\xb61\x1c", value_blinding: b"\x02\x1f\xaac\x80\xb5\xe1\xd0\xdc\x19\x92\x9dv\xd2\xc7\xe8\xb4HA\xf6Sl)7\x99\xb7tk\t}\x94\x02", proof_blinding_r: b"$ \xf8\xe2\xd7\x8d\xc9\xba\x8a\xe3\x94MxG\xfa\x08P\xe6\xdd-\xc6=+\x19\xf7\x82\x16\x9c\x10\xe9\xa0\x0c", proof_blinding_s: b"\xd6\xd1@\x83\x8c&\xd0Fu\xcfx\x03j\x0
4\x82\xaaV\"\"\xa05\x14-2\xb1\xf1l\xcbe\xf4{\x03" })) }, ActionPlan { action: Some(Spend(SpendPlan { note: Some(Note { value: Some(Value { amount: Some(Amount { lo: 1000000000000, hi: 0 }), asset_id: Some(AssetId { inner: [41, 234, 156, 47, 51, 113, 246, 164, 135, 231, 233, 92, 36, 112, 65, 244, 163, 86, 249, 131, 235, 6, 78, 93, 43, 59, 207, 50, 44, 169, 106, 16], alt_bech32m: "", alt_base_denom: "" }) }), rseed: [65, 168, 111, 212, 62, 36, 73, 164, 112, 46, 186, 198, 209
, 151, 251, 178, 63, 82, 198, 141, 42, 84, 205, 166, 120, 255, 126, 26, 99, 65, 193, 83], address: Some(Address { inner: [221, 146, 87, 221, 50, 124, 201, 32, 147, 124, 132, 194, 214, 54, 6, 196, 171, 161, 40, 232, 53, 226, 191, 188, 158, 69, 169, 59, 43, 132, 69, 58, 126, 162, 143, 82, 89, 233, 121, 250, 62, 7, 60, 91, 142, 126, 133, 198, 70, 251, 196, 148, 25, 8, 18, 242, 210, 83, 206, 131, 255, 59, 50, 110, 103, 229, 84, 151, 90, 226, 165, 100, 105, 240, 87, 229, 0, 133
, 169, 64], alt_bech32m: "" }) }), position: 5, randomizer: b" \xc0\x90\x14\xd2zy\x06n&\xcd)4\x9d\xd42\x8e\x8a'\x9d\xd5\xb2D\xac\xcc.\xd97\x92\xcb\xbd\x02", value_blinding: b"\x98\xf5\x1a\x03\xa5JQ\xf7\xba+\x03\x07\x1f\x13\x04\xe0\xc9\xb7\xd2\x06\xf3\x9f]\x85&X\x95%\xf7\xa8\xd6\x01", proof_blinding_r: b"\xa3\xabA99\xa6J\x1c\xe0\x05Np>Z\xf1\xbe\xd0\xdf,\x9c9\xcc\x1c\xfc\xaeBs\xa9\x8b\xda<\x0e", proof_blinding_s: b"\x07\xbe\xca\x15\xd1`\xb9\x15\xf5o\xf4\xd8\xdfEt\xd0\xe4\x95
_\x9a*>1\xf5lx\xb5/n\xc0\xdf\t" })) }, ActionPlan { action: Some(Output(OutputPlan { value: Some(Value { amount: Some(Amount { lo: 999999999000, hi: 0 }), asset_id: Some(AssetId { inner: [41, 234, 156, 47, 51, 113, 246, 164, 135, 231, 233, 92, 36, 112, 65, 244, 163, 86, 249, 131, 235, 6, 78, 93, 43, 59, 207, 50, 44, 169, 106, 16], alt_bech32m: "", alt_base_denom: "" }) }), dest_address: Some(Address { inner: [221, 146, 87, 221, 50, 124, 201, 32, 147, 124, 132, 194, 214, 54
, 6, 196, 171, 161, 40, 232, 53, 226, 191, 188, 158, 69, 169, 59, 43, 132, 69, 58, 126, 162, 143, 82, 89, 233, 121, 250, 62, 7, 60, 91, 142, 126, 133, 198, 70, 251, 196, 148, 25, 8, 18, 242, 210, 83, 206, 131, 255, 59, 50, 110, 103, 229, 84, 151, 90, 226, 165, 100, 105, 240, 87, 229, 0, 133, 169, 64], alt_bech32m: "" }), rseed: b"|Hh\x98\xc4~\x9b\x0e\xa3\xe3\x13/\xbe\xfd\x90$\"\xe6\x1aZ\x14\xc5\xbe!\xdcd\x98c`~V\xb2", value_blinding: b"\x93W\x8c/\xa4x\x13\x8c\x06\xfe\xa2$P
\x9b\x94q]\xc3\xf3\\\x13\xae\x12/\xac\xc6\xebP\x95f\xb6\x01", proof_blinding_r: b" \\\xf5O\xf3\xfd\0\0\xe9a'WL!8\xcf)<\xc7\x8a\x86>\xa8\xe1\xe6o\n\x0b0\x03\xe1\x0c", proof_blinding_s: b"\xe2n?\xba\xd6=\x01\xc7\xcd\x82L\xfd\xea\x16\xdb\x86~Ug\x93=k\x02\xd4U\x91<\xe4\xc9\xac\x10\r" })) }], expiry_height: 0, chain_id: "penumbra-1", fee: Some(Fee { amount: Some(Amount { lo: 0, hi: 0 }), asset_id: None }), clue_plans: [CluePlan { address: Some(Address { inner: [221, 146, 87, 22
1, 50, 124, 201, 32, 147, 124, 132, 194, 214, 54, 6, 196, 171, 161, 40, 232, 53, 226, 191, 188, 158, 69, 169, 59, 43, 132, 69, 58, 126, 162, 143, 82, 89, 233, 121, 250, 62, 7, 60, 91, 142, 126, 133, 198, 70, 251, 196, 148, 25, 8, 18, 242, 210, 83, 206, 131, 255, 59, 50, 110, 103, 229, 84, 151, 90, 226, 165, 100, 105, 240, 87, 229, 0, 133, 169, 64], alt_bech32m: "" }), rseed: b"a@\xd4\x9c\xe1\x98\x8aly\xc8\xf1_\xb5\x8aqO\xad8H-\x04\xbb\xad\x0c{?\xc8\0\xd7Z_\xb4", precision_
bits: 0 }, CluePlan { address: Some(Address { inner: [221, 146, 87, 221, 50, 124, 201, 32, 147, 124, 132, 194, 214, 54, 6, 196, 171, 161, 40, 232, 53, 226, 191, 188, 158, 69, 169, 59, 43, 132, 69, 58, 126, 162, 143, 82, 89, 233, 121, 250, 62, 7, 60, 91, 142, 126, 133, 198, 70, 251, 196, 148, 25, 8, 18, 242, 210, 83, 206, 131, 255, 59, 50, 110, 103, 229, 84, 151, 90, 226, 165, 100, 105, 240, 87, 229, 0, 133, 169, 64], alt_bech32m: "" }), rseed: b"\xc6\xac\xd3\x7f\xff\x08\x9
0\x1d\xd49R\xf5\x99\x93a7I\xa4\xbc\x95\xba\xd9v\x05\xfd\xf1.R\xed\r~\xdb", precision_bits: 0 }], memo_plan: Some(MemoPlan { plaintext: Some(MemoPlaintext { sender: Some(Address { inner: [103, 87, 131, 60, 143, 135, 239, 104, 90, 51, 68, 155, 20, 78, 233, 161, 57, 110, 250, 90, 1, 236, 15, 220, 123, 51, 99, 35, 191, 222, 72, 152, 16, 74, 237, 98, 91, 72, 243, 41, 73, 204, 132, 132, 80, 206, 25, 99, 189, 39, 80, 13, 116, 157, 197, 11, 165, 95, 56, 153, 203, 183, 30, 246, 88,
 37, 159, 185, 159, 252, 37, 150, 196, 98, 25, 66, 102, 160, 253, 116], alt_bech32m: "" }), text: "" }), key: b"Zm\x81\xa2\x90\x03\x0f\x9c?_\xc4\xb4\x0ba\x11\xc3,\xa1\xe6T\x89Q\xe3\r]1\xb0\xfb\xebC\xf4\xda" }) }), authorization_data: Some(AuthorizationData { effect_hash: Some(EffectHash { inner: [156, 212, 138, 40, 247, 174, 213, 180, 219, 193, 34, 165, 147, 5, 111, 10, 132, 106, 206, 142, 185, 157, 211, 176, 88, 224, 12, 150, 18, 142, 86, 210, 36, 18, 114, 16, 2, 157, 49,
 221, 97, 63, 65, 128, 140, 67, 85, 126, 178, 195, 209, 188, 206, 30, 18, 65, 232, 72, 31, 139, 2, 246, 49, 204] }), spend_auths: [SpendAuthSignature { inner: [212, 127, 225, 194, 209, 121, 5, 18, 207, 200, 244, 31, 226, 84, 94, 169, 1, 255, 84, 221, 248, 253, 191, 104, 77, 210, 9, 176, 213, 56, 159, 13, 109, 7, 195, 55, 15, 33, 164, 115, 107, 178, 166, 40, 206, 111, 155, 21, 74, 203, 252, 98, 47, 143, 129, 183, 223, 65, 125, 169, 43, 201, 52, 0] }], delegator_vote_auths: 
[] }) }, extensions: Extensions }}: penumbra_view::service: Building txp

conorsch commented 1 year ago

The txp -> tx construction fails in this block:

https://github.com/penumbra-zone/penumbra/blob/39864c64fb7478ce255dd3e5a829c178933d06fb/crates/core/transaction/src/plan/build.rs#L46-L61

Specifically, it fails in the final statement, where the spendplan is created. I'll trace there next.

conorsch commented 1 year ago

The hang happens inside ark_groth16::Groth16::create_proof_with_reduction:

https://github.com/penumbra-zone/penumbra/blob/39864c64fb7478ce255dd3e5a829c178933d06fb/crates/core/component/shielded-pool/src/spend/proof.rs#L252-L254

That crate isn't maintained by us. While I could use a local-path override to keep adding debug statements in the arkworks code, I'm inclined instead to come up for air and check in about the approach.

conorsch commented 1 year ago

@hdevalence rightly remembered that r1cs=off was necessary to avoid ark-related kabooms in the past; see here: https://github.com/penumbra-zone/penumbra/blob/39864c64fb7478ce255dd3e5a829c178933d06fb/crates/bin/pcli/src/opt.rs#L60-L70

Re-running interchaintest with RUST_LOG=r1cs=off resolves the problem. We should hardcode that override in the tracing setup for pclientd, same as pcli.
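As a rough illustration of what "hardcoding the override" could look like at the filter-string level, here is a minimal std-only sketch. It is not the code from pcli or from the eventual fix: the helper name with_r1cs_off is hypothetical, and it assumes the filter is expressed as a comma-separated list of RUST_LOG-style directives. The idea is simply to force r1cs=off onto whatever the user supplied, so the r1cs target stays silenced regardless of RUST_LOG.

```rust
use std::env;

/// Directive that silences the `r1cs` tracing target (the value used in this thread).
const R1CS_OFF: &str = "r1cs=off";

/// Hypothetical helper, not part of pcli or pclientd: returns a filter string
/// that always ends with `r1cs=off`, dropping any user-supplied directive
/// aimed exactly at the `r1cs` target.
fn with_r1cs_off(user_filter: &str) -> String {
    let mut directives: Vec<&str> = user_filter
        .split(',')
        .map(str::trim)
        // Ignore empty segments, e.g. when RUST_LOG is unset or has stray commas.
        .filter(|d| !d.is_empty())
        // Remove directives for the `r1cs` target so the override cannot be undone.
        .filter(|d| *d != "r1cs" && !d.starts_with("r1cs="))
        .collect();
    directives.push(R1CS_OFF);
    directives.join(",")
}

fn main() {
    // Fall back to an empty filter when RUST_LOG is not set.
    let user_filter = env::var("RUST_LOG").unwrap_or_default();
    println!("{}", with_r1cs_off(&user_filter));
}
```

The resulting string would then be handed to whatever filter the tracing setup builds. In practice the override is more likely applied directly on the filter object, as in the pcli code linked above, rather than by string manipulation.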

conorsch commented 1 year ago

Considering this issue resolved, now that we have #2980 merged.