WebOfTrust / keria

KERI Agent in the cloud
https://keria.readthedocs.io/en/latest/
Apache License 2.0

Keria crashes when a multisig group member joins an event after the escrow timeout #312

Open lenkan opened 1 month ago

lenkan commented 1 month ago

version: Current stable release (0.1.3 and 0.1.4-dev0)

Keria crashes when a multisig member tries to join a multisig event after the partial signing escrow has timed out.

Steps to reproduce

  1. Create a multisig group with member A and member B, with signing threshold 2
  2. Member A creates a registry
  3. Wait until the partial signing escrow times out (default 3600s)
  4. Member B creates the registry

Actual result

Keria crashes with an uncaught keri.kering.MissingEntryError:

keria-1  | The Agency is loaded and waiting for requests...
keria-1  |   Agent: ECC938Wp_HZjxU8-qMJGuRbM651Ag4rQ2gtazX7RAX_e   Controller: EOznJiKar3REVwcoTStnEtyQfGcQZjqfGN6kpmeqqaUg
keria-1  |   Agent: EP1eFZqnSK1oqV61ItkQqDOWUX8FzU6aSaD9kODJYuAo   Controller: ELPphP_tZrZiBVshbC7a7w92pPQMo_bnz530KYy_y5Nc
keria-1  | Waiting for other signatures for EEK_4rgd_Gb0X87x0M4uQH_hjzoNl7xBCLTN7RhgX7Xu:0...
keria-1  | Waiting for other signatures for EEK_4rgd_Gb0X87x0M4uQH_hjzoNl7xBCLTN7RhgX7Xu:0...
keria-1  | We are the fully signed witnesser 0, sending to witnesses
keria-1  | Waiting for fully signed witness receipts for 0
keria-1  | Witness receipts complete, EEK_4rgd_Gb0X87x0M4uQH_hjzoNl7xBCLTN7RhgX7Xu confirmed.
keria-1  | Waiting for fully signed witness receipts for 0
keria-1  | Witness receipts complete, EEK_4rgd_Gb0X87x0M4uQH_hjzoNl7xBCLTN7RhgX7Xu confirmed.
keria-1  | Waiting for TEL registry vcp event mulisig anchoring event
keria-1  | Waiting for other signatures for EEK_4rgd_Gb0X87x0M4uQH_hjzoNl7xBCLTN7RhgX7Xu:1...
keria-1  | Waiting for TEL registry vcp event mulisig anchoring event
keria-1  | Traceback (most recent call last):
keria-1  |   File "/keripy/venv/bin/keria", line 8, in <module>
keria-1  |     sys.exit(main())
keria-1  |   File "/usr/local/var/keria/src/keria/app/cli/keria.py", line 31, in main
keria-1  |     raise ex
keria-1  |   File "/usr/local/var/keria/src/keria/app/cli/keria.py", line 25, in main
keria-1  |     doers = args.handler(args)
keria-1  |   File "/usr/local/var/keria/src/keria/app/cli/commands/start.py", line 20, in <lambda>
keria-1  |     parser.set_defaults(handler=lambda args: launch(args))
keria-1  |   File "/usr/local/var/keria/src/keria/app/cli/commands/start.py", line 74, in launch
keria-1  |     runAgent(name=args.name,
keria-1  |   File "/usr/local/var/keria/src/keria/app/cli/commands/start.py", line 107, in runAgent
keria-1  |     directing.runController(doers=doers, expire=expire)
keria-1  |   File "/keripy/src/keri/app/directing.py", line 665, in runController
keria-1  |     doist.do(doers=doers)
keria-1  |   File "/keripy/venv/lib/python3.10/site-packages/hio/base/doing.py", line 156, in do
keria-1  |     self.recur()  # increments .tyme runs recur context
keria-1  |   File "/keripy/venv/lib/python3.10/site-packages/hio/base/doing.py", line 275, in recur
keria-1  |     tock = dog.send(self.tyme)  # yielded tock == 0.0 means re-run asap
keria-1  |   File "/keripy/venv/lib/python3.10/site-packages/hio/base/doing.py", line 922, in do
keria-1  |     self.done = self.recur(tyme=tyme)  # equv of doist.recur
keria-1  |   File "/keripy/venv/lib/python3.10/site-packages/hio/base/doing.py", line 1026, in recur
keria-1  |     tock = dog.send(tyme)  # yielded tock == 0.0 means re-run asap
keria-1  |   File "/keripy/venv/lib/python3.10/site-packages/hio/base/doing.py", line 922, in do
keria-1  |     self.done = self.recur(tyme=tyme)  # equv of doist.recur
keria-1  |   File "/keripy/venv/lib/python3.10/site-packages/hio/base/doing.py", line 1026, in recur
keria-1  |     tock = dog.send(tyme)  # yielded tock == 0.0 means re-run asap
keria-1  |   File "/keripy/venv/lib/python3.10/site-packages/hio/base/doing.py", line 568, in do
keria-1  |     self.done = self.recur(tyme=tyme)
keria-1  |   File "/usr/local/var/keria/src/keria/app/agenting.py", line 687, in recur
keria-1  |     self.counselor.start(ghab=ghab, prefixer=prefixer, seqner=seqner, saider=saider)
keria-1  |   File "/keripy/src/keri/app/grouping.py", line 50, in start
keria-1  |     evt = ghab.makeOwnEvent(sn=seqner.sn, allowPartiallySigned=True)
keria-1  |   File "/keripy/src/keri/app/habbing.py", line 2052, in makeOwnEvent
keria-1  |     serder, sigs, couple = self.getOwnEvent(sn=sn,
keria-1  |   File "/keripy/src/keri/app/habbing.py", line 2025, in getOwnEvent
keria-1  |     raise kering.MissingEntryError("Missing event for pre={} at sn={}."
keria-1  | keri.kering.MissingEntryError: Missing event for pre=EEK_4rgd_Gb0X87x0M4uQH_hjzoNl7xBCLTN7RhgX7Xu at sn=1.
keria-1 exited with code 1
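
To make the failure mode easier to discuss, here is a toy model of what the traceback appears to show: an event waits in an escrow, the escrow drops it after a timeout, and a later lookup for the same prefix and sequence number raises. This is only a sketch under assumptions. `ToyEscrow`, `join`, the `EEK_example` prefix and the local `MissingEntryError` class are hypothetical stand-ins, not keripy or keria code, and the real lookup path may differ.

```python
# Toy model only: none of these names are keripy or keria APIs.


class MissingEntryError(Exception):
    """Local stand-in for keri.kering.MissingEntryError."""


class ToyEscrow:
    """Holds partially signed events and drops them after `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.events = {}  # (pre, sn) -> (event, time it was escrowed)

    def put(self, pre, sn, event, now):
        self.events[(pre, sn)] = (event, now)

    def expire(self, now):
        # Remove every entry that has waited longer than the timeout.
        stale = [key for key, (_, added) in self.events.items()
                 if now - added > self.timeout]
        for key in stale:
            del self.events[key]

    def get(self, pre, sn):
        try:
            return self.events[(pre, sn)][0]
        except KeyError:
            raise MissingEntryError(
                f"Missing event for pre={pre} at sn={sn}.") from None


def join(escrow, pre, sn):
    """Second member joins: needs the event the first member created."""
    return escrow.get(pre, sn)


escrow = ToyEscrow(timeout=3600)
escrow.put("EEK_example", 1, {"sn": 1}, now=0)  # member A creates the event

escrow.expire(now=100)
joined_in_time = join(escrow, "EEK_example", 1)  # still escrowed, lookup works

escrow.expire(now=3601)  # the timeout passes before member B joins
try:
    join(escrow, "EEK_example", 1)
    crashed = False
except MissingEntryError as ex:
    crashed = True
    message = str(ex)
```

In this model the late join only fails because nothing catches the error. In the traceback above the exception likewise propagates all the way up through the doer loop to `main()`, which is why the whole process exits instead of only the one request failing.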

Expected result

Not sure. In the current dev release, it seems like the registry creation goes through.

Notes

It is a bit convoluted to test this since the escrow timeouts are not configurable yet. However, I have pushed a Docker image to my Docker Hub account where the escrow timeouts are reduced to 10s.

I did this with a search and replace in the keripy code:

FROM weboftrust/keria:0.1.4-dev0

# For keria v0.1.4-dev0
ENV KERI_DIR="/keripy/src/keri"
# ENV KERI_DIR="/keripy/venv/lib/python3.12/site-packages/keri" # For keria 0.2.0-devX

RUN sed -i'' -E "s/Timeout([A-Z][A-Z][A-Z]) = [0-9]*/Timeout\1 = 10/g" "$KERI_DIR/core/eventing.py" \
    && sed -i'' -E "s/Timeout([A-Z][A-Z][A-Z]) = [0-9]*/Timeout\1 = 10/g" "$KERI_DIR/vdr/verifying.py" \
    && sed -i'' -E "s/Timeout([A-Z][A-Z][A-Z]) = [0-9]*/Timeout\1 = 10/g" "$KERI_DIR/vdr/eventing.py"

RUN grep -E -r "^\s+Timeout([A-Z])+ = ([0-9])+" "$KERI_DIR"
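
For anyone who wants to check what the substitution touches before patching an image, the same pattern can be tried in Python on a sample string. This is a minimal sketch: `shorten_timeouts` is a hypothetical helper, and the attribute names and values in `sample` are made up for illustration rather than copied from keripy.

```python
import re

# Same pattern as the sed expression: "Timeout" followed by exactly three
# uppercase letters, then " = " and a run of digits.
PATTERN = re.compile(r"Timeout([A-Z][A-Z][A-Z]) = [0-9]*")


def shorten_timeouts(source, seconds=10):
    """Return `source` with every matching timeout value set to `seconds`."""
    return PATTERN.sub(rf"Timeout\g<1> = {seconds}", source)


# Illustrative input, not real keripy source.
sample = (
    "class Example:\n"
    "    TimeoutPSE = 3600  # illustrative value\n"
    "    TimeoutOOE = 1200\n"
    "    Timeout = 5\n"
)

patched = shorten_timeouts(sample)
print(patched)
```

Only names with a three-letter uppercase suffix are rewritten, so a plain `Timeout = 5` is left alone, and anything after the digits (such as a trailing comment) is preserved.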

You could also use an entrypoint script:

#!/bin/bash

KERI_DIR="/keripy/src/keri" # For keria v0.1.4-dev0
# KERI_DIR="/keripy/venv/lib/python3.12/site-packages/keri" # For keria 0.2.0-devX

sed -i'' -E "s/Timeout([A-Z][A-Z][A-Z]) = [0-9]*/Timeout\1 = 10/g" "$KERI_DIR/core/eventing.py"
sed -i'' -E "s/Timeout([A-Z][A-Z][A-Z]) = [0-9]*/Timeout\1 = 10/g" "$KERI_DIR/vdr/verifying.py"
sed -i'' -E "s/Timeout([A-Z][A-Z][A-Z]) = [0-9]*/Timeout\1 = 10/g" "$KERI_DIR/vdr/eventing.py"

grep -E -r "^\s+Timeout([A-Z])+ = ([0-9])+" "$KERI_DIR"

keria start --config-file keria --name agent

Automated reproduction:

git clone git@github.com:nordlei/vlei-sandbox.git
cd vlei-sandbox
git checkout multisig-join-timeout-crash
docker compose build
docker compose run --rm test src/issues/multisig-join-after-timeout-crash.test.ts