Closed Profpatsch closed 3 years ago
Okay we really need to get rid of the “always up to date clippy in CI” thing, it’s aggravating.
lorri internal stream-events --kind snapshot
just hangs for me.
I'm running the daemon with systemd, and paid attention that the lorri command and the lorri daemon are both from this PR.
lorri internal ping
works, though.
lorri internal stream-events --kind snapshot
just hangs for me.
have you been using this before? The logic around this is very confused, I copied it verbatim. I don’t think I broke anything, but it’s rather hard to figure out. I’d say this will need a rewrite to clean up in the future, so let’s ignore it for now.
Okay, again, the only CI failure is coming from the newer clippy warnings, which happens because CI always fetches the nightly clippy, which we should remove because they are not easy to reproduce locally.
LGTM
have you been using this before? The logic around this is very confused, I copied it verbatim. I don’t think I broke anything, but it’s rather hard to figure out.
Yes, since 2020-04-25 modulo command renaming (it used to be lorri internal__stream_events --kind snapshot
)
Here is at least half the fix #25
The command is still hanging, I don't remember if it used to.
I think I found the source of the problem: when thread
in ops::stream_events is dropped, we wait on it, that is forever. So the function cannot return.
I tried to fix it with
diff --git a/src/ops.rs b/src/ops.rs
index a7f3ca0..95ca37f 100644
--- a/src/ops.rs
+++ b/src/ops.rs
@@ -606,6 +606,12 @@ pub fn stream_events(kind: EventKind) -> OpResult {
})
};
+ // when running in snapshot mode, we should exit as soon as we print the answer
+ // of the daemon. However, when dropping `thread` at the end of the function,
+ // we will wait for it and hang indefinitely.
+ // Let's leak it, anyway we are going to exit the process as a whole very soon.
+ let thread = std::mem::ManuallyDrop::new(thread);
+
let mut snapshot_done = false;
loop {
select! {
but then lorri internal stream-events
fails with a timeout after 0.5 second:
$ target/debug/lorri internal stream-events
{"Completed":{"nix_file":"/home/symphorien/src/lorri/shell.nix","rooted_output_paths":{"shell_gc_root":"/home/symphorien/.cache/lorri/gc_roots/07d6e569120731e7ee9a3297613c71b7/gc_root/shell_gc_root"}}}
Mar 28 17:36:23.148 ERRO Message(R(Timeout(D(Millis(500)))))
Maybe it always used to timeout but when trying to return the error it would hang as explained above?
Mmh this leaks file descriptors very fast...
file descriptors in the server*
Ah, it’s a regression then, async will block until the thread inside finishes, since everything else is just sloppy.
@symphorien I think the solution here is to allow Async
to not require a thread to join itself when it goes out of scope. There is another solution, but that requires a bunch of backwards-pointing shutdown channels and I don’t think it’s worth the overhead.
Okay, I have a fix.
Fixes https://github.com/nix-community/lorri/issues/4
Please see commit messages.
The
lorri internal stream-events
output probably changes a lot, I didn’t spend any time keeping it backwards compatible. But since it’s experimental and internal we don’t have to care as much. It’s still using serde Serialize instances, but some care was taken to split it from the internal representation.release.nix
(seerelease.nix
for instructions)