Closed by mbauman 4 years ago
I wonder whether this might be specific to the buildbot environment: when I run the Distributed tests repeatedly on my FreeBSD 11.3 machine with current Julia master, they consistently pass.
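For anyone who wants to try the same kind of repeated local run, here is a minimal shell sketch. The helper name `repeat_until_fail` is made up for illustration and is not part of Julia or the buildbot setup; it just reruns whatever command it is given and stops at the first failure.

```shell
# Hypothetical helper (not part of Julia or the buildbots):
# run a command up to N times and stop at the first failing run.
repeat_until_fail() {
  n=$1
  shift
  i=1
  while [ "$i" -le "$n" ]; do
    # "$@" is the command under test, passed through unchanged.
    if ! "$@"; then
      echo "failed on run $i"
      return 1
    fi
    i=$((i + 1))
  done
  echo "all $n runs passed"
}

# Example invocation, assuming it is run from the root of a built Julia checkout:
#   repeat_until_fail 20 ./julia test/runtests.jl Distributed
```

A passing loop on one machine does not rule out a timing-dependent failure on the buildbot, so this only helps to narrow down whether the timeout is environment-specific.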
Error text was printed a bit earlier:
From worker 9: ERROR: LoadError: TaskFailedException:
From worker 9: Timed out waiting to read host:port string from worker.
From worker 9: Stacktrace:
From worker 9: [1] worker_from_id(::Distributed.ProcessGroup, ::Int64) at /usr/home/julia/buildbot/worker/package_freebsd64/build/usr/share/julia/stdlib/v1.4/Distributed/src/cluster.jl:1059
From worker 9: [2] worker_from_id at /usr/home/julia/buildbot/worker/package_freebsd64/build/usr/share/julia/stdlib/v1.4/Distributed/src/cluster.jl:1056 [inlined]
From worker 9: [3] #remote_do#156 at /usr/home/julia/buildbot/worker/package_freebsd64/build/usr/share/julia/stdlib/v1.4/Distributed/src/remotecall.jl:482 [inlined]
From worker 9: [4] remote_do at /usr/home/julia/buildbot/worker/package_freebsd64/build/usr/share/julia/stdlib/v1.4/Distributed/src/remotecall.jl:482 [inlined]
From worker 9: [5] kill at /usr/home/julia/buildbot/worker/package_freebsd64/build/usr/share/julia/stdlib/v1.4/Distributed/src/managers.jl:534 [inlined]
From worker 9: [6] create_worker(::Distributed.LocalManager, ::WorkerConfig) at /usr/home/julia/buildbot/worker/package_freebsd64/build/usr/share/julia/stdlib/v1.4/Distributed/src/cluster.jl:581
From worker 9: [7] setup_launched_worker(::Distributed.LocalManager, ::WorkerConfig, ::Array{Int64,1}) at /usr/home/julia/buildbot/worker/package_freebsd64/build/usr/share/julia/stdlib/v1.4/Distributed/src/cluster.jl:523
From worker 9: [8] (::Distributed.var"##43#46"{Distributed.LocalManager,Array{Int64,1},WorkerConfig})() at ./task.jl:333
From worker 9: Stacktrace:
From worker 9: [1] sync_end(::Array{Any,1}) at ./task.jl:300
From worker 9: [2] macro expansion at ./task.jl:319 [inlined]
From worker 9: [3] #addprocs_locked#40(::Base.Iterators.Pairs{Symbol,Any,Tuple{Symbol,Symbol},NamedTuple{(:exename, :exeflags),Tuple{String,Cmd}}}, ::typeof(Distributed.addprocs_locked), ::Distributed.LocalManager) at /usr/home/julia/buildbot/worker/package_freebsd64/build/usr/share/julia/stdlib/v1.4/Distributed/src/cluster.jl:477
From worker 9: [4] #addprocs_locked at ./none:0 [inlined]
From worker 9: [5] #addprocs#39(::Base.Iterators.Pairs{Symbol,Any,Tuple{Symbol,Symbol},NamedTuple{(:exename, :exeflags),Tuple{String,Cmd}}}, ::typeof(addprocs), ::Distributed.LocalManager) at /usr/home/julia/buildbot/worker/package_freebsd64/build/usr/share/julia/stdlib/v1.4/Distributed/src/cluster.jl:441
From worker 9: [6] #addprocs at ./none:0 [inlined]
From worker 9: [7] #addprocs#247 at /usr/home/julia/buildbot/worker/package_freebsd64/build/usr/share/julia/stdlib/v1.4/Distributed/src/managers.jl:316 [inlined]
From worker 9: [8] #addprocs at ./none:0 [inlined]
From worker 9: [9] #addprocs_with_testenv#4 at /usr/home/julia/buildbot/worker-tabularasa/tester_freebsd64/build/share/julia/test/testenv.jl:29 [inlined]
From worker 9: [10] addprocs_with_testenv(::Int64) at /usr/home/julia/buildbot/worker-tabularasa/tester_freebsd64/build/share/julia/test/testenv.jl:29
From worker 9: [11] top-level scope at /usr/home/julia/buildbot/worker-tabularasa/tester_freebsd64/build/share/julia/stdlib/v1.4/Distributed/test/distributed_exec.jl:1079
From worker 9: [12] include at ./boot.jl:328 [inlined]
From worker 9: [13] include_relative(::Module, ::String) at ./loading.jl:1105
From worker 9: [14] include(::Module, ::String) at ./Base.jl:31
From worker 9: [15] exec_options(::Base.JLOptions) at ./client.jl:295
From worker 9: [16] _start() at ./client.jl:468
From worker 9: in expression starting at /usr/home/julia/buildbot/worker-tabularasa/tester_freebsd64/build/share/julia/stdlib/v1.4/Distributed/test/distributed_exec.jl:1078
I don't think we have an issue for this, and it seems to be consistently happening these days. E.g., https://build.julialang.org/#/builders/29/builds/4925/steps/2/logs/stdio: