bazelbuild / reclient

transport: Error while dialing dial unix /out/.temp/reproxy_789.sock: connect: connection refused #42

Open DarkMatterV opened 2 months ago

DarkMatterV commented 2 months ago

Hi, I'm using AOSP RBE 0.57.0.4865132 and buildfarm 2.4.0 for my builds, with 500 concurrent actions.

Lately, builds fail occasionally with a log like this:

F0410 19:58:46.227496  552492 main.go:143] Command failed: rpc error: code = Unavailable desc = retry budget exhausted (10 attempts): connection error: desc = "transport: Error while dialing dial unix /out/.temp/reproxy_789.sock: connect: connection refused"
goroutine 1 [running]:
github.com/golang/glog.stacks(0x0)
    external/com_github_golang_glog/glog.go:769 +0x8a
github.com/golang/glog.(*loggingT).output(0xc296a0, 0x3, 0xc00034e1c0, {0x9bc8da, 0xc00040df20}, 0x1, 0x0)
    external/com_github_golang_glog/glog.go:720 +0x46e
--
    external/com_github_golang_glog/glog.go:1148
main.main()
    cmd/rewrapper/main.go:143 +0x775

This seems to be caused by rewrapper failing to connect to reproxy, but I'm not sure.

I also noticed that rewrapper and reproxy communicate over RPC even though they run on the same machine. Is this a waste of resources (for example, occupying socket ports), or not stable enough? rewrapper wraps each command and passes it to reproxy. Have you considered switching to a goroutine+channel model for the part where reproxy executes tasks and interacts with the server?

Can someone help me?

gkousik commented 1 month ago

Hi - rewrapper and reproxy communicate through UDS (Unix Domain Sockets) on Linux. The communication happens through a special socket file on disk.

Android builds start reproxy by default when run with the USE_RBE=true flag. I'm not fully aware of what your Android setup looks like, but it seems that at least your socket address (set through RBE_server_address) might be incorrect, since /out/.temp/reproxy_789.sock is unlikely to be the correct absolute path (I'm assuming the output directory is NOT at / and is instead within your source directory).

> rewrapper is used to encapsulate commands and then pass them to reproxy. Is it considered to switch to the goroutine+channel mode to implement the reproxy function when the reproxy executes tasks and interacts with the server?

With rewrapper and reproxy, each build action is its own separate process - so there are multiple rewrapper processes that are triggered by the Android build system, and they all talk to a single reproxy process (whereas channels are used for communication between goroutines in a single process).