containerd / ttrpc-rust

Rust implementation of ttrpc (GRPC for low-memory environments)
Apache License 2.0
195 stars 45 forks source link

[Windows] Closing connections properly on Windows #185

Closed jsturtevant closed 1 year ago

jsturtevant commented 1 year ago

The current closing logic was causing a deadlock becuase an unused pipe was being returned when the close was called on the listenering. This sets a flag so we can detect that we should not use that pipe connection which wasn't actually connected to a client but returned from the async call becuase we told it too.

more info

I was using the skeleton example to test rust-extensions and found that there was a race condition when shutting the shim down.

# start shim
cargo run --example skeleton -- -namespace default -id 1234 -address "\\.\pipe\containerd-containerd"  -publish-binary ./bin/containerd start

# send disconnect
cargo run --example shim-proto-connect \\.\pipe\containerd-shim-17630016127144989388-pipe
Connecting to \\.\pipe\containerd-shim-17630016127144989388-pipe...
Sending `Connect` request...
Connect response: ConnectResponse { shim_pid: 0, task_pid: 0, version: "example", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }
Sending `Shutdown` request...
Shutdown response: Empty { special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }

# shim is still running
ps *skeleton*

Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName
-------  ------    -----      -----     ------     --  -- -----------
     71       5      812       4396       0.00   5932   1 skeleton

Logs (with added logging) show that there are two (?!) clients but the code only connects once.

 Get-Content log -Tail 200 -Wait
[INFO] server listen started
[INFO] server started
[INFO] Shim successfully started, waiting for exit signal...
[INFO] Got new client
[INFO] thread start for pipe 168 ThreadId(8)
[INFO] entered while loop 168
[INFO] thread start for pipe 168 ThreadId(9)
[INFO] entered while loop 168
[INFO] thread start for pipe 168 ThreadId(10)
[INFO] entered while loop 168
[INFO] waiting for control call from handler threads
[INFO] thread start for pipe 168 ThreadId(11)
[INFO] entered while loop 168
[INFO] thread start for pipe 168 ThreadId(13)
[INFO] entered while loop 168
[INFO] waiting for control call from handler threads
[INFO] Connect request
[INFO] thread start for pipe 168 ThreadId(12)
[INFO] entered while loop 168
[INFO] entered while loop 168
[INFO] thread exit for connection 168 ThreadId(8)
[INFO] Shutdown request
[INFO] Shutting down shim instance
[INFO] close monitor
[INFO] entered while loop 168
[ERROR] listener accept got Custom { kind: Other, error: "listener shutdown for quit flag" }
[INFO] ttrpc server listener stopped
[INFO] Got new client
[INFO] listener thread stopped
[INFO] begin to shutdown connection
[INFO] shutdown pip 168
[INFO] closing pipe 168
[INFO] shutdown pip 280
[INFO] waiting for control call from handler threads
[INFO] thread start for pipe 280 ThreadId(17)
[INFO] thread start for pipe 280 ThreadId(16)
[INFO] thread exit for connection 280 ThreadId(16)
[INFO] closing pipe 280
[INFO] thread exit for connection 280 ThreadId(17)
[INFO] thread exit for connection 168 ThreadId(10)
[INFO] quitting client thread.
[INFO] thread start for pipe 280 ThreadId(18)
[INFO] thread exit for connection 280 ThreadId(18)
[INFO] thread start for pipe 280 ThreadId(19)
[INFO] thread exit for connection 280 ThreadId(19)
[INFO] thread exit for connection 168 ThreadId(13)
[INFO] thread exit for connection 168 ThreadId(12)
[INFO] thread exit for connection 168 ThreadId(9)
[INFO] connections closed
[INFO] thread exit for connection 168 ThreadId(11)
[INFO] reaper recieved 168
[INFO] closing pipe 168
[INFO] closing pipe 168

you can see there are two (!?) connections being handled, this is because the accept was returning an invalid handle when closing up causing the race condition on shutdown.

With this new logic we can see there is only ever one client (268):

[INFO] Shim successfully started, waiting for exit signal...
[INFO] listening...
[INFO] Got new client: 268
[INFO] waiting for control call from handler threads
[INFO] start handler for connection id 268 on thread ThreadId(8)
[INFO] entered while loop 268
[INFO] Connect request
[INFO] start handler for connection id 268 on thread ThreadId(9)
[INFO] entered while loop 268
[INFO] start handler for connection id 268 on thread ThreadId(15)
[INFO] entered while loop 268
[INFO] entered while loop 268
[INFO] waiting for control call from handler threads
[INFO] start handler for connection id 268 on thread ThreadId(13)
[INFO] entered while loop 268
[INFO] start handler for connection id 268 on thread ThreadId(12)
[INFO] entered while loop 268
[INFO] start handler for connection id 268 on thread ThreadId(10)
[INFO] entered while loop 268
[INFO] start handler for connection id 268 on thread ThreadId(11)
[INFO] entered while loop 268
[INFO] exit handler for connection id 268 on thread ThreadId(11)
[INFO] start handler for connection id 268 on thread ThreadId(14)
[INFO] entered while loop 268
[INFO] exit handler for connection id 268 on thread ThreadId(14)
[INFO] start handler for connection id 268 on thread ThreadId(16)
[INFO] entered while loop 268
[INFO] exit handler for connection id 268 on thread ThreadId(16)
[INFO] Shutdown request
[INFO] Shutting down shim instance
[INFO] close monitor
[INFO] entered while loop 268
[ERROR] listener accept got Custom { kind: Other, error: "listener shutdown for quit flag" }
[INFO] ttrpc server listener stopped
[INFO] exit handler for connection id 268 on thread ThreadId(9)
[INFO] listener thread stopped
[INFO] begin to shutdown connection
[INFO] connections closed
[INFO] exit handler for connection id 268 on thread ThreadId(15)
[INFO] quitting client thread.
[INFO] exit handler for connection id 268 on thread ThreadId(8)
[INFO] exit handler for connection id 268 on thread ThreadId(13)
[INFO] exit handler for connection id 268 on thread ThreadId(12)
[INFO] exit handler for connection id 268 on thread ThreadId(10)
[INFO] reaper recieved message for 268
[INFO] reaper thread exited
[INFO] reaper thread stopped
[INFO] server listen started
[INFO] server started
[INFO] Shim successfully started, waiting for exit signal...
[INFO] Connect request
[INFO] Shutdown request
[INFO] Shutting down shim instance
[INFO] close monitor
[ERROR] listener accept got Custom { kind: Other, error: "closing pipe" }
[INFO] ttrpc server listener stopped
[INFO] listener thread stopped
[INFO] begin to shutdown connection
[INFO] connections closed
[INFO] reaper thread exited
[INFO] reaper thread stopped
codecov[bot] commented 1 year ago

Codecov Report

Patch and project coverage have no change.

Comparison is base (4661bd1) 24.39% compared to head (c9f67f5) 24.39%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #185 +/- ## ======================================= Coverage 24.39% 24.39% ======================================= Files 17 17 Lines 2529 2529 ======================================= Hits 617 617 Misses 1912 1912 ``` | [Impacted Files](https://app.codecov.io/gh/containerd/ttrpc-rust/pull/185?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=containerd) | Coverage Δ | | |---|---|---| | [src/error.rs](https://app.codecov.io/gh/containerd/ttrpc-rust/pull/185?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=containerd#diff-c3JjL2Vycm9yLnJz) | `55.55% <0.00%> (ø)` | |

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.