Closed glennmoy closed 10 months ago
Merging #211 (9bb6921) into main (caddc0e) will increase coverage by
0.46%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## main #211 +/- ##
==========================================
+ Coverage 97.65% 98.12% +0.46%
==========================================
Files 12 12
Lines 640 639 -1
==========================================
+ Hits 625 627 +2
+ Misses 15 12 -3
Flag | Coverage Δ | |
---|---|---|
Ray.jl | 98.12% <100.00%> (+0.46%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
Files | Coverage Δ | |
---|---|---|
src/function_manager.jl | 97.77% <100.00%> (+5.47%) |
:arrow_up: |
src/ray_julia_jll/common.jl | 95.30% <100.00%> (+0.19%) |
:arrow_up: |
:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more
Failures definitely look related:
[ Info: Connecting function manager to GCS at 10.0.0.105:6379...
terminate called after throwing an instance of 'std::runtime_error'
what(): NotFound: Failed to find the key
signal (6): Aborted
in expression starting at /home/runner/work/Ray.jl/Ray.jl/test/ray_serializer.jl:1
pthread_kill at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
raise at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
[ Info: Connecting function manager to GCS at 127.0.0.1:6379...
libc++abi: terminating due to uncaught exception of type std::runtime_error: NotFound: Failed to find the key
signal (6): Abort trap: 6
in expression starting at /Users/runner/work/Ray.jl/Ray.jl/test/ray_serializer.jl:81
__pthread_kill at /usr/lib/system/libsystem_kernel.dylib (unknown line)
Allocations: 44751822 (Pool: 44725417; Big: 26405); GC: 51
I can manage to reproduce this on my local machine:
┌ Debug: Skipping registering ownership for ObjectRef("00ffffffffffffffffffffffffffffffffffffff4300000013000000") as object has known owner
└ @ Ray ~/.julia/dev/Ray/src/object_ref.jl:128
libc++abi: terminating due to uncaught exception of type std::runtime_error: NotFound: Failed to find the key
[1307] signal (6): Abort trap: 6
in expression starting at /Users/cvogt/.julia/dev/Ray/test/runtests.jl:19
__pthread_kill at /usr/lib/system/libsystem_kernel.dylib (unknown line)
Allocations: 30846806 (Pool: 30821062; Big: 25744); GC: 44
ERROR: Package Ray errored during testing (received signal: 6)
The issue noticed above was just due to us failing to Disconnect
the JuliaGcsClient
used in the function_manager.jl
tests. I've added a destructor to the JuliaGcsClient
to automatically disconnect the client to avoid the SIGABRT failure. However, since users of JuliaGcsClient
should really be handling disconnecting themselves I added in a STDERR warning from C++ indicating that a connected JuliaGcsClient
failed to disconnect on their own. As Ray.jl users aren't exposed to this class mostly this is just a reminder for Ray.jl developers to be explicit about disconnecting.
The issue noticed above was just due to us failing to
Disconnect
theJuliaGcsClient
used in thefunction_manager.jl
tests. I've added a destructor to theJuliaGcsClient
to automatically disconnect the client to avoid the SIGABRT failure. However, since users ofJuliaGcsClient
should really be handling disconnecting themselves I added in a STDERR warning from C++ indicating that a connectedJuliaGcsClient
failed to disconnect on their own. As Ray.jl users aren't exposed to this class mostly this is just a reminder for Ray.jl developers to be explicit about disconnecting.
yeah when I removed the finalizer
from the inner constructor as per your suggestion above I should have done so by reverting https://github.com/beacon-biosignals/Ray.jl/pull/211/commits/6b98717caa778604f27918461dc30f6cc7b3d9bf as I missed adding back the Disconnect
Break off of https://github.com/beacon-biosignals/Ray.jl/pull/202
Closes #76
Two things worth calling out:
PythonGcsClient
theGcsClient
has no direct interface for specifying atimeout
. This is instead set throughRayConfig
parameters which can be overriden via environment variables.Connect
to unintiated GCS Servers issues a warning message before killing the driver process after a 1-min timeout.