jamiebrynes7 / spatialos-sdk-rs

Rust integration of the SpatialOS C API bindings
Apache License 2.0
21 stars 6 forks source link

Connecting with the same worker id twice can segfault or stack overflow. #81

Closed jamiebrynes7 closed 5 years ago

jamiebrynes7 commented 5 years ago

Oddly this is not repro with 100% accuracy:

Jamie Brynes@Jamie-Desktop /c/Workspace/spatialos-sdk-rs/project-example (feature/complete-snapshot-api)
λ cargo run -- --worker-id RustWorker998 --worker-type RustWorker receptionist
    Finished dev [unoptimized + debuginfo] target(s) in 0.16s
     Running `C:\Workspace\spatialos-sdk-rs\target\debug\project-example.exe --worker-id RustWorker998 --worker-type RustWorker receptionist`
error: process didn't exit successfully: `C:\Workspace\spatialos-sdk-rs\target\debug\project-example.exe --worker-id RustWorker998 --worker-type RustWorker receptionist` (exit code: 0xc0000005, STATUS_ACCESS_VIOLATION)
Segmentation fault
Jamie Brynes@Jamie-Desktop /c/Workspace/spatialos-sdk-rs/project-example (feature/complete-snapshot-api)
λ cargo run -- --worker-id RustWorker998 --worker-type RustWorker receptionist
    Finished dev [unoptimized + debuginfo] target(s) in 0.16s
     Running `C:\Workspace\spatialos-sdk-rs\target\debug\project-example.exe --worker-id RustWorker998 --worker-type RustWorker receptionist`
memory allocation of 27086784456 bytes failederror: process didn't exit successfully: `C:\Workspace\spatialos-sdk-rs\target\debug\project-example.exe --worker-id RustWorker998 --worker-type RustWorker receptionist` (exit code: 0xc0000409, STATUS_STACK_BUFFER_OVERRUN)
Jamie Brynes@Jamie-Desktop /c/Workspace/spatialos-sdk-rs/project-example (feature/complete-snapshot-api)
λ cargo run -- --worker-id RustWorker998 --worker-type RustWorker receptionist
    Finished dev [unoptimized + debuginfo] target(s) in 0.15s
     Running `C:\Workspace\spatialos-sdk-rs\target\debug\project-example.exe --worker-id RustWorker998 --worker-type RustWorker receptionist`
thread 'main' panicked at 'Log in via receptionist failed: gRPC error INVALID_ARGUMENT: Worker ID RustWorker998 has already been used. Please use a different one.', project-example\src\main.rs:30:19
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
error: process didn't exit successfully: `C:\Workspace\spatialos-sdk-rs\target\debug\project-example.exe --worker-id RustWorker998 --worker-type RustWorker receptionist` (exit code: 101)

My guess is some assumption is being broken when this errors.

jamiebrynes7 commented 5 years ago

@davedissian and found the root cause for this, upon creating the WorkerConnection object, we proactively grab the worker attributes to cache them:

            let sdk_attr = Worker_Connection_GetWorkerAttributes(connection_ptr);
            let attributes = ::std::slice::from_raw_parts(
                (*sdk_attr).attributes,
                (*sdk_attr).attribute_count as usize,
            )
            .iter()
            .map(|s| CStr::from_ptr(*s).to_string_lossy().to_string())
            .collect();

However, in the case where the connection attempt fails, Worker_Connection_GetWorkerAttributes() will return an un-initialised struct meaning that (*sdk_attr).attributes is a null pointer!

This should be a quick and easy fix however :smile:

dgavedissian commented 5 years ago

Actually, the returned object is initialized correctly, however it is default/zero initialized, so the returned const char** is NULL, so the from_raw_parts needs a null check.

jamiebrynes7 commented 5 years ago

Yeah what he said :stuck_out_tongue: