WebOfTrust / keripy

Key Event Receipt Infrastructure - the spec and implementation of the KERI protocol
https://keripy.readthedocs.io/en/latest/
Apache License 2.0

Communication with witnesses hangs if one witness is not responding #814

Open rodolfomiranda opened 1 month ago

rodolfomiranda commented 1 month ago

Version

all

Environment

Witness Deployment

Expected behavior

When Receiptor sends events to the list of witnesses, it should try to communicate with all witnesses even if one of them is unresponsive.

Actual behavior

The current logic in Receiptor (https://github.com/WebOfTrust/keripy/blob/6b6970be92d851f94e58b7a9f4225a81582d2235/src/keri/app/agenting.py#L90-L110) stays in an infinite loop at line 96 if one witness is unresponsive, which prevents the loop over witnesses (line 90) from iterating to the rest. Ideally the client should time out; instead it retries forever, at least when using the demo witnesses. One option is to add a timeout at line 96.
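As a rough illustration of the suggested timeout, the sketch below models a cooperative wait loop that gives up after a deadline and lets the outer loop move on to the next witness. It uses only the standard library and is not keripy code: `FakeClient`, `wait_for_response`, `collect_receipts`, and `run` are hypothetical names, and the real loop in agenting.py may be structured differently.

```python
import time
from collections import deque


class FakeClient:
    """Hypothetical stand-in for a witness HTTP client (not a keripy class)."""

    def __init__(self, responsive):
        self.responsive = responsive
        self.responses = deque()

    def service(self):
        # A responsive witness eventually produces one response.
        if self.responsive and not self.responses:
            self.responses.append("receipt")


def wait_for_response(client, timeout, tock=0.0, clock=time.monotonic):
    """Yield until a response arrives or `timeout` seconds pass.

    Returns the response, or None if the deadline was reached first.
    """
    deadline = clock() + timeout
    while not client.responses:
        if clock() >= deadline:
            return None  # give up on this witness instead of looping forever
        client.service()
        yield tock  # hand control back to the scheduler
    return client.responses.popleft()


def collect_receipts(clients, timeout):
    """Visit every witness in turn, skipping the ones that time out."""
    receipts = {}
    for name, client in clients.items():
        rep = yield from wait_for_response(client, timeout)
        if rep is None:
            continue  # unresponsive witness: move on to the next one
        receipts[name] = rep
    return receipts


def run(gen):
    """Drive a generator to completion and return its return value."""
    try:
        while True:
            next(gen)
    except StopIteration as stop:
        return stop.value
```

With this shape, an unresponsive witness costs at most `timeout` seconds, and the remaining witnesses are still contacted regardless of their position in the list.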

Steps to reproduce

Create a single-sig AID with toad=1 and two witnesses, with one of them disconnected.

Using kli incept with the --receipt-endpoint flag forces the use of the Receiptor. The issue can be reproduced with the following steps:

1- start the demo witnesses

kli witness demo

2- init and resolve two witness oobis, one from the demo witnesses and one in the cloud

kli init --name local  --nopasscode
kli oobi resolve --name local --oobi http://127.0.0.1:5642/oobi/BBilc4-L3tFUnfM_wJr4S4OJanAv_VmF_dJNN6vkf2Ha/controller --oobi-alias witdemo
kli oobi resolve --name local --oobi http://witness1.dev.provenant.net:5631/oobi/BCf29L_7oQtU8WUXEV2Bi5sf7WoxnGyX7sgJSym-p4Pp/controller --oobi-alias witcloud

3- test incepting with toad=1 and both witnesses up

kli incept --name local --alias aid1  -w BCf29L_7oQtU8WUXEV2Bi5sf7WoxnGyX7sgJSym-p4Pp -w BBilc4-L3tFUnfM_wJr4S4OJanAv_VmF_dJNN6vkf2Ha   --toad 1 --icount 1 --isith 1 --ncount 1 --nsith 1 --transferable --receipt-endpoint
kli status --name local --alias aid1
Alias:  aid1
Identifier: EC_vFyYXNLRTlXiTU8ON5m6g4QF1xQxkuejpShFiZbRT
Seq No: 0

Witnesses:
Count:      2
Receipts:   2
Threshold:  1

Public Keys:    
    1. DBELZgD9ZEn9wRwJ3XRxyFo6mFMqAb4poQ1fEDznXNuH

4- stop the demo witnesses (ctrl-c)

5- incept a new aid

kli incept --name local --alias aid2  -w BBilc4-L3tFUnfM_wJr4S4OJanAv_VmF_dJNN6vkf2Ha   -w BCf29L_7oQtU8WUXEV2Bi5sf7WoxnGyX7sgJSym-p4Pp --toad 1 --icount 1 --isith 1 --ncount 1 --nsith 1 --transferable --receipt-endpoint
(the command hangs; you need to press ctrl-c to stop the loop)
kli status --name local --alias aid2
Alias:  aid2
Identifier:EKWiV3XUA90-0UJLbRuqPNiomjbIu57imJqHhmJy1req
Seq No: 0

Witnesses:
Count:      2
Receipts:   0
Threshold:  1

Note that no receipts were received even though one witness is up and toad=1.

6- incept a new aid with the witnesses in a different order

kli incept --name local --alias aid3  -w BCf29L_7oQtU8WUXEV2Bi5sf7WoxnGyX7sgJSym-p4Pp -w BBilc4-L3tFUnfM_wJr4S4OJanAv_VmF_dJNN6vkf2Ha   --toad 1 --icount 1 --isith 1 --ncount 1 --nsith 1 --transferable --receipt-endpoint
(the command hangs; you need to press ctrl-c to stop the loop)
kli status --name local --alias aid3
Alias:  aid3
Identifier:EENZlZ9DNhNFlga_BXIOv2NQ5KVbnxzow4aNGTupizin
Seq No: 0

Witnesses:
Count:      2
Receipts:   1
Threshold:  1

Note that now 1 receipt is received because the witness that is up is first in the loop.
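The difference between aid2 and aid3 can be reproduced with a small model of a sequential loop that waits on each witness with no timeout. This is only a sketch under assumptions: `receiptor` and `run_until_interrupted` are hypothetical names, a witness is reduced to a `(name, is_up)` pair, and `max_steps` stands in for the user pressing ctrl-c.

```python
from itertools import islice


def receiptor(witnesses, receipts):
    """Hypothetical model: wait on each witness in turn, with no timeout."""
    for name, is_up in witnesses:
        while not is_up:  # never becomes true for a witness that is down
            yield  # hand control back to the scheduler, forever
        receipts.append(name)
        yield


def run_until_interrupted(witnesses, max_steps=1000):
    """Run the loop for at most `max_steps` scheduler turns (the ctrl-c stand-in)."""
    receipts = []
    for _ in islice(receiptor(witnesses, receipts), max_steps):
        pass
    return receipts


# Down witness first (as with aid2): the loop never reaches the live witness.
print(run_until_interrupted([("demo", False), ("cloud", True)]))  # prints []

# Live witness first (as with aid3): one receipt is collected before the hang.
print(run_until_interrupted([("cloud", True), ("demo", False)]))  # prints ['cloud']
```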

rodolfomiranda commented 1 month ago

The same infinite loop may also occur in:

Receiptor
WitnessReceiptor
WitnessInquisitor
WitnessPublisher
TCPMessenger
TCPStreamMessenger
HTTPMessenger

One option may be to pass a timeout value to those classes so that their loops are terminated after a while.
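One possible shape for that option is an optional `timeout` constructor argument, where the default of `None` keeps the current wait-forever behavior. The sketch below is an assumption for discussion only: `Deadline` and `TimedMessenger` are hypothetical names and do not mirror the constructors of the classes listed above.

```python
import time


class Deadline:
    """Hypothetical helper: tracks whether an optional timeout has elapsed."""

    def __init__(self, timeout, clock=time.monotonic):
        self.clock = clock
        # timeout=None means "never expire", i.e. today's behavior.
        self.expires = None if timeout is None else clock() + timeout

    @property
    def expired(self):
        return self.expires is not None and self.clock() >= self.expires


class TimedMessenger:
    """Hypothetical messenger whose wait loop ends once `timeout` elapses."""

    def __init__(self, timeout=None, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.sent = False
        self.timed_out = False

    def do(self, is_delivered):
        """Generator: yield until `is_delivered()` is true or the deadline passes."""
        deadline = Deadline(self.timeout, clock=self.clock)
        while not is_delivered():
            if deadline.expired:
                self.timed_out = True
                return False
            yield  # hand control back to the scheduler
        self.sent = True
        return True
```

The caller can then inspect `timed_out` to decide whether to retry, report the witness as unreachable, or continue with the remaining witnesses.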