Closed (loneil closed this 5 months ago)
I tried pulling from ACA-Py 0.11.0 tag (to see if something was recently introduced) and get the same error too
Thinking this is probably Windows related. I've been using the demo all the time without any issues. I'm in a Linux and devcontainer environment.
Maybe try on http://test.bcovrin.vonx.io/ and see if it replicates. Might be local networking coming back to the agent.
Yeah the demo starts up fine with bcovrin test, it's the VON part that I was wondering about.
It's working for me on VON if I try in WSL instead of Git Bash on Windows. Unless it's something specific to my setup (everything else in the ecosystem I use seems to work fine), maybe it's just worth changing the instructions here to recommend WSL: https://aca-py.org/latest/demo/#running-in-docker
@loneil, I'd like to review this with you. It should work just fine on Windows. I find WSL introduces a myriad of other issues for Windows users. So it's best to figure out what is happening here.
It looks like it might be a networking issue. If you start the demo without von-network running, it fails immediately, indicating it can't connect to host.docker.internal:9000 (von-network's IP and port), as expected.
Based on the output of your logs, von-network is using an explicit IP address in its genesis file. When running on Windows I'd expect that to be listed as host.docker.internal. At some point did you start von-network using the command ./manage start <IP_Address>?
When I start von-network and run the demo, it runs until it sits waiting for a connection after outputting a QR code.
On Windows and Mac you cannot access the Docker host IP address directly; it needs to be accessed through host.docker.internal, so on those platforms the Docker host address gets resolved to host.docker.internal. The Linux version of Docker still has not caught up with the same convention (I believe), so there the Docker host gets resolved to the IP address of the Docker host rather than host.docker.internal (which does not exist on that platform). These differences can explain why it seems to work on Linux and not on Windows/Mac in many cases. I have a feeling these networking nuances are what's causing the issues here.
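To see which side of that platform difference a given environment falls on, a small check like the following could be run inside a container. This is only a sketch: the function name `resolves` and its `lookup` parameter are made up for illustration and are not part of ACA-Py or von-network.

```python
import socket


def resolves(hostname, lookup=socket.getaddrinfo):
    """Return True if `hostname` resolves to an address from where this runs.

    `lookup` defaults to the real resolver; it is a parameter only so the
    check can be exercised without touching the network.
    """
    try:
        lookup(hostname, None)
        return True
    except socket.gaierror:
        # The name is unknown in this environment.
        return False
```

Calling `resolves("host.docker.internal")` from inside a container indicates whether that name is available there. If it is not, the Docker host has to be reached by IP address instead, which matches the Linux behaviour described above.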
@loneil, try resetting your von-network instance: ./manage rm, ./manage start, and then run the demos.
@WadeBarnes yeah, I've done remove/start, deleted all Docker images/volumes, restarted, etc., and always get the same error starting the demo. Just starting VON with ./manage start --logs
On startup of the Alice agent demo, I do see (in the VON logs):
webserver-1 | INFO:aiohttp.access:172.24.0.1 [10/May/2024:16:14:08 +0000] "GET /genesis HTTP/1.1" 200 3244 "-" "Python/3.9 aiohttp/3.9.5"
On startup of Faber I see that above as well as
webserver-1 | INFO:aiohttp.access:172.24.0.1 [10/May/2024:16:14:48 +0000] "POST /register HTTP/1.1" 200 332 "-" "Python/3.9 aiohttp/3.9.5"
and, as in the screenshot above, I do see transactions in the ledger browser from the Faber agent startups.
So it looks like it must be able to make requests in some way in this networking setup?
But then it still runs into:
Faber | indy_vdr.error.VdrError: Pool timeout: Request was interrupted
.
.
.
Exception: Timed out waiting for agent process to start (status=None). Admin URL: http://host.docker.internal:8021/status
If I do not start up the VON network at all then I get the obvious Error retrieving ledger genesis transactions instead on demo startup.
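The observations above (HTTP requests to the VON web server succeed, yet the ledger pool request times out) suggest checking each address separately rather than treating "the network" as one thing. A minimal sketch of such a probe follows. The helper names `can_connect` and `probe` are hypothetical, and which host and port pairs to test is left to the reader.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def probe(targets):
    """Map each (host, port) pair to whether it is reachable."""
    return {(host, port): can_connect(host, port) for host, port in targets}
```

For example, `probe([("host.docker.internal", 9000)])` run from inside the agent container tests the web server address mentioned earlier in this thread. Adding the node address and port pairs listed in the genesis file (assuming it lists them) would show whether the ledger nodes themselves are reachable from the same place.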
Worked through with Wade, looks like I had been keeping around settings from starting up VON on WSL.
Can see in a WSL startup that the genesis url has node_ips with the resolved IP addresses. In Windows startup, need those node_ips to be host.docker.internal.
Looks like I hadn't been properly pruning volumes when trying a blank slate start, so the VON one must have been hanging around. Going and doing manage rm and then starting up VON (on Windows first, or else it will get the WSL node IPs) does solve this.
So for troubleshooting this case, if anyone else comes across it, a first place to look is the genesis file and the node_ip values in there.
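As a rough illustration of that check, the sketch below pulls the node_ip values out of genesis text and reports whether each one is a literal IP address or a hostname such as host.docker.internal. It assumes the genesis file holds one JSON object per line with a node_ip field nested somewhere inside. The function names are made up for this example.

```python
import ipaddress
import json


def find_values(obj, key):
    """Yield every value stored under `key` anywhere in nested JSON data."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key:
                yield v
            else:
                yield from find_values(v, key)
    elif isinstance(obj, list):
        for item in obj:
            yield from find_values(item, key)


def node_ips(genesis_text):
    """Collect node_ip values, assuming one JSON object per non-empty line."""
    ips = []
    for line in genesis_text.splitlines():
        if line.strip():
            ips.extend(find_values(json.loads(line), "node_ip"))
    return ips


def is_literal_ip(value):
    """Return True for a literal IP address, False for a hostname."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False
```

The genesis text can be fetched from the /genesis endpoint of the VON web server seen in the logs above (port 9000 in this thread; the host name to use depends on where the check runs). On Windows, a literal IP address in node_ip where host.docker.internal is expected would point to the leftover-volume situation described in this thread.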
Anything worth putting in a document in the repo?
This may be an esoteric case? (Opinions on this?) I run VON network for various things (usually the endorser service) and had generally used WSL, so the genesis file had the IP settings needed for WSL. Starting things up in Windows will still use that genesis file, and that's where the problem is. The main issue, for me at least, was that I was never properly checking that I was pruning the volumes in Docker (thought I was), so someone with an actual fresh start on Windows probably would not hit this.
Maybe worth noting this specific troubleshooting case (check the genesis file's node_ip format) somewhere, but maybe it's a bit in the weeds to have in the demo instructions themselves. Not sure if the VON Network or ACA-Py repository would be the best place to note that (unless it's already noted somewhere).
I'm not sure if this is a Windows-specific thing (see system notes below), but I'm unable to run the demo at the Faber step any more (fairly sure I've done this on the same system in the past).
This is following the steps from the documentation here: https://aca-py.org/latest/demo/#running-in-docker
Steps to Reproduce
main
./run_demo faber
Get error "indy_vdr.error.VdrError: Pool timeout: Request was interrupted" as below
However I can see (in the ledger browser) Faber agent related transactions making it there:
System details
Windows 11 23H2 22631.3447
Git Bash shell
Docker version 24.0.7, build afdd53b
Docker Compose version v2.23.3-desktop.2