Closed etschelp closed 3 years ago
I'll look into this
OK there's an issue going on but I don't think its related to web hooks.
(As an aside there is a potential to erase the webhook api key, but it's related to the admin config endpoint, not message re-tries. See PR: https://github.com/hyperledger/aries-cloudagent-python/pull/1420 So ... I don't think this is what's causing the BPA issue.)
When I added some debugging and ran a test (connect BPA1 and BPA2 via a connection invitation) it looks like the authentication is failing on the acapy-to-acapy messages, not the webhook messages. See the following log - http://localhost:8040
is the agent and http://localhost:8080
is the web hook:
bpa-agent1_1 | >>>>> Queue webhook for: http://bpa1:8080/log#changeme
bpa-agent1_1 | >>> queue endpoint/key: http://bpa1:8080/log/topic/connections/ changeme
bpa-agent1_1 | >>> completed endpoint/key: http://bpa1:8080/log/topic/connections/ changeme
bpa1_1 | 22:16:36.623 [io-executor-thread-20] INFO AriesWebhookController - Webhook received, type: connections
bpa1_1 | 22:16:36.624 [io-executor-thread-20] TRACE AriesEventHandler - {"invitation_msg_id": "247f65d8-0947-4a2c-9c4e-4d4261efecba", "invitation_mode": "once", "rfc23_state": "request-sent", "updated_at": "2021-09-20 22:16:36.591441Z", "their_label": "Business Partner Agent 2", "created_at": "2021-09-20 22:16:36.549361Z", "accept": "auto", "my_did": "SeMx6RFs35QMJy58EyTZFq", "state": "request", "request_id": "f8cb57da-8b0e-4e41-a336-528099b25173", "routing_state": "none", "alias": "Business Partner Agent 2", "invitation_key": "23UAGCZWPoAU6C8qwpVPLNm4LcHfXbi5FQAoibv3rsyX", "their_role": "inviter", "connection_protocol": "connections/1.0", "connection_id": "5f29f00c-bb5b-4530-a207-eb9d982188e3"}
bpa1_1 | 22:16:36.624 [io-executor-thread-20] DEBUG AriesEventHandler - Connection Event: ConnectionRecord(accept=AUTO, alias=Business Partner Agent 2, connectionId=5f29f00c-bb5b-4530-a207-eb9d982188e3, connectionProtocol=CONNECTION_V1, createdAt=2021-09-20 22:16:36.549361Z, errorMsg=null, inboundConnectionId=null, invitationKey=23UAGCZWPoAU6C8qwpVPLNm4LcHfXbi5FQAoibv3rsyX, invitationMode=ONCE, invitationMsgId=247f65d8-0947-4a2c-9c4e-4d4261efecba, myDid=SeMx6RFs35QMJy58EyTZFq, requestId=f8cb57da-8b0e-4e41-a336-528099b25173, rfc23Sate=null, routingState=NONE, state=REQUEST, theirDid=null, theirLabel=Business Partner Agent 2, theirPublicDid=null, theirRole=INVITER, updatedAt=2021-09-20 22:16:36.591441Z)
bpa-agent1_1 | >>> completed endpoint/key: http://bpa1:8080/log/topic/connections/ changeme
bpa-agent1_1 | 2021-09-20 22:16:36,660 aiohttp.access INFO 172.18.0.7 [20/Sep/2021:22:16:36 +0000] "POST /connections/receive-invitation?alias=Business%20Partner%20Agent%202&auto_accept=true HTTP/1.1" 200 796 "-" "okhttp/4.9.1"
bpa-agent1_1 | >>> queue endpoint/key: http://localhost:8040 None
bpa-agent1_1 | >>> retry endpoint/key: http://localhost:8040 None
bpa-agent1_1 | 2021-09-20 22:16:36,667 aries_cloudagent.transport.outbound.manager ERROR >>> Error when posting to: http://localhost:8040; Error: (<class 'aiohttp.client_exceptions.ClientConnectorError'>, ClientConnectorError(ConnectionKey(host='localhost', port=8040, is_ssl=False, ssl=None, proxy=None, proxy_auth=None, proxy_headers_hash=None), ConnectionRefusedError(111, "Connect call failed ('127.0.0.1', 8040)")), <traceback object at 0x7f380a456788>); Re-queue failed message ...
bpa-agent2_1 | 2021-09-20 22:16:41,857 aiohttp.access INFO 172.18.0.6 [20/Sep/2021:22:16:41 +0000] "GET /status/ready HTTP/1.1" 200 178 "-" "okhttp/4.9.1"
bpa-agent1_1 | 2021-09-20 22:16:42,341 aiohttp.access INFO 172.18.0.7 [20/Sep/2021:22:16:42 +0000] "GET /status/ready HTTP/1.1" 200 178 "-" "okhttp/4.9.1"
bpa-agent1_1 | >>> queue endpoint/key: http://localhost:8040 None
bpa-agent1_1 | 2021-09-20 22:16:46,652 aries_cloudagent.transport.outbound.manager ERROR >>> Error when posting to: http://localhost:8040; Error: (<class 'aiohttp.client_exceptions.ClientConnectorError'>, ClientConnectorError(ConnectionKey(host='localhost', port=8040, is_ssl=False, ssl=None, proxy=None, proxy_auth=None, proxy_headers_hash=None), ConnectionRefusedError(111, "Connect call failed ('127.0.0.1', 8040)")), <traceback object at 0x7f380a456f88>); Re-queue failed message ...
bpa-agent1_1 | >>> retry endpoint/key: http://localhost:8040 None
bpa-agent1_1 | >>> queue endpoint/key: http://localhost:8040 None
bpa-agent1_1 | 2021-09-20 22:16:56,687 aries_cloudagent.transport.outbound.manager ERROR >>> Error when posting to: http://localhost:8040; Error: (<class 'aiohttp.client_exceptions.ClientConnectorError'>, ClientConnectorError(ConnectionKey(host='localhost', port=8040, is_ssl=False, ssl=None, proxy=None, proxy_auth=None, proxy_headers_hash=None), ConnectionRefusedError(111, "Connect call failed ('127.0.0.1', 8040)")), <traceback object at 0x7f380a436148>); Re-queue failed message ...
bpa-agent1_1 | >>> retry endpoint/key: http://localhost:8040 None
bpa-agent1_1 | >>> queue endpoint/key: http://localhost:8040 None
bpa-agent1_1 | >>> retry endpoint/key: http://localhost:8040 None
bpa-agent1_1 | 2021-09-20 22:17:06,719 aries_cloudagent.transport.outbound.manager ERROR >>> Error when posting to: http://localhost:8040; Error: (<class 'aiohttp.client_exceptions.ClientConnectorError'>, ClientConnectorError(ConnectionKey(host='localhost', port=8040, is_ssl=False, ssl=None, proxy=None, proxy_auth=None, proxy_headers_hash=None), ConnectionRefusedError(111, "Connect call failed ('127.0.0.1', 8040)")), <traceback object at 0x7f380a436988>); Re-queue failed message ...
bpa-agent1_1 | >>> queue endpoint/key: http://localhost:8040 None
bpa-agent1_1 | 2021-09-20 22:17:16,714 aries_cloudagent.transport.outbound.manager ERROR >>> Outbound message failed to deliver, NOT Re-queued.
bpa-agent1_1 | Traceback (most recent call last):
It looks like (in the logs above) the agents are advertising their endpoints as 127.0.0.1
rather than their "public facing" service names
... when I fixed the aca-py agent endpoints, everything seems to work fine:
ACAPY_ENDPOINT=http://bpa-agent1:8030
ACAPY2_ENDPOINT=http://bpa-agent2:8040
I do not think this is what is causing the issue. I have the first aca-py connected to a BPA in my IDE like BPA_WEBHOOK_URL=http://192.168.1.5:8080/log
also I'm running with security enabled BPA_SECURITY_ENABLED=true
.
I can produce this behaviour by simply shutting down the BPA in the IDE. As soon as there is a connection between two agents there is a ping job running, so another agent is constantly sending messages that aca-py tries to send via the webhook. This fails now with a connection error:
bpa-agent1_1 | 2021-09-21 09:55:04,294 aries_cloudagent.transport.outbound.http DEBUG Posting to http://192.168.1.5:8080/log/topic/ping/; Data: {"comment": "c0d9712f-7d0e-4f68-a7e5-eea21b615650", "connection_id": null, "responded": true, "state": "received", "thread_id": "1a458355-0075-44d1-8f12-d913bb3b9d1e"}; Headers: {'x-api-key': 'changeme', 'Content-Type': 'application/json'}
Here the header is still set, but as soon as the BPA is started again the header is gone. So after restart the same logging statement:
bpa-agent1_1 | 2021-09-21 09:56:38,404 aries_cloudagent.transport.outbound.http DEBUG Posting to http://192.168.1.5:8080/log/topic/ping/; Data: {"comment": "c0d9712f-7d0e-4f68-a7e5-eea21b615650", "connection_id": null, "responded": true, "state": "received", "thread_id": "6e47002e-8cfc-4abf-a495-86e8f8380e5d"}; Headers: {'Content-Type': 'application/json'}
Without scrolling, before
Headers: {'x-api-key': 'changeme', 'Content-Type': 'application/json'}
After:
Headers: {'Content-Type': 'application/json'}
@etschelp thanks for the info.
Can you upload the aca-py log for this test (with --log-level debug
)?
This is the issue:
bpa-agent1_1 | 2021-09-21 14:09:33,426 aiohttp.access INFO 172.21.0.1 [21/Sep/2021:14:09:33 +0000] "GET /status/config HTTP/1.1" 200 4940 "-" "okhttp/4.9.1"
bpa-agent1_1 | 2021-09-21 14:09:33,431 aries_cloudagent.admin.server DEBUG Incoming request: GET /status/config
See my comment above re PR # 1420
Thanks a lot for figuring that out. This makes a lot of sense, as the BPA loads the admin config directly after startup hence erasing the api key.
Fixed with version > 0.7.2-rc0
I have aca-py deployed with the api key set in the webhook url, for example
--webhook-url http://localhost:8080/something#mykey
. This causes aca-py to sent an x-api-key header when calling the webhook. Since 0.7.0 this still works but as soon as the controller becomes unavailable for a short time e.g. restart, or is not available when aca-py starts up the header is not set any more. To be more specific the first call on the webhook after the restart still has the header, the second hasn't.I'm not an aca-py expert and maybe this is the wrong place but manager.py overwrites the endpoint variable with the result of the splitting action. If now for some reason the handler gets recreated it would continue without the api key.