Closed fleimgruber closed 2 years ago
Hi @fleimgruber , thanks for reporting this!
From the logs you provided, it looks like a handful of player processes (itk
, ypb
and zxc
) recorded that they were in the starting
state and then they apparently died before they were able to log anything. :thinking: In the log, there are only entries for 2 player processes, clp
and tta
, and no error messages for those 2 processes.
I do think that Alda can self-correct to recover from a situation like this, but it takes 2 minutes. Every time you run an alda
command, it looks for "stale" player state files (which means any files in ~/.cache/alda/state/players/<VERSION>
that were last modified more than 2 minutes ago) and deletes them. Alda will then start new player processes to replace the stale/dead ones.
Would you mind running watch -n 0.25 alda ps
and keep an eye on the output, and observe what happens when your player processes get stuck? If you wait 2 minutes, do the starting
processes disappear from the alda ps
output?
Another thing that might be helpful is to monitor the actual processes on your system by running watch -n 0.25 pgrep -af alda-player
. If the number of player processes reported by alda ps
is more than the number of java
processes in the pgrep
output, that would confirm my theory above about the player processes dying without cleaning up their state files.
Thanks for taking a look at this!
watch -n 0.25 alda ps
Every 0.2s: alda ps
id port state expiry type
itk 44295 starting 4 hours ago player
ypb 37647 starting 4 hours ago player
zxc 41825 starting 4 hours ago player
alda play --code "midi-bass-pop: c6 d12 e6 g12~4"
does not change anything in alda ps
and prints the mentioned
Starting player processes...
It looks like Alda is having trouble starting player processes in the
background. This could happen for a number of reasons.
To troubleshoot:
...
watch -n 0.25 pgrep -af alda-player
9411 java -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -Xmx1024m -Xms256m -DlogPath=tmplog -jar /nix/store/04p5x5y0lrs4ppz9
9clnhx5ddkhvvv9q-alda-2.0.6/bin/.alda-player-wrapped run
9497 java -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -Xmx1024m -Xms256m -DlogPath=tmplog -jar /nix/store/04p5x5y0lrs4ppz9
9clnhx5ddkhvvv9q-alda-2.0.6/bin/.alda-player-wrapped run
9499 java -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -Xmx1024m -Xms256m -DlogPath=tmplog -jar /nix/store/04p5x5y0lrs4ppz9
9clnhx5ddkhvvv9q-alda-2.0.6/bin/.alda-player-wrapped run
it looks for "stale" player state files (which means any files in ~/.cache/alda/state/players/
that were last modified more than 2 minutes ago) and deletes them.
The associated player state files seem to get updated somehow and thus not deleted by the logic you mentioned:
ls -la ~/.cache/alda/state/players/2.0.6/itk.json
-rw-r--r-- 1 fps users 64 Mar 20 19:23 /home/fps/.cache/alda/state/players/2.0.6/itk.json
Interesting! So it looks like those 3 processes are not dying, rather, they are recording that they're in the starting
state and then apparently not getting past that point.
I'm not sure at the moment how that could happen :thinking:
Just to confirm: the first few times you ran alda play ...
, the playback worked?
Another thing worth trying: run alda shutdown
to send a "shutdown" message to each of the stuck player processes. I'm curious if that works. If it does, it means that the player processes are actually running, but they aren't updating their state files.
Just to confirm: the first few times you ran alda play ..., the playback worked?
It is not reliable, sometimes the playback only works for one or two of those and subsequent ones are silent. Sometimes they work a couple of times (usually if I leave ample time between them).
Another thing worth trying: run alda shutdown to send a "shutdown" message to each of the stuck player processes.
alda shutdown
does not shut down the "starting" processes.
It might have to do with NixOS, the way it wraps alda or some weird sound system interaction? See also https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/interpreters/alda/default.nix. If we don't find a lead or a problem with alda itself, I will open an issue there as well just to confirm it is not a NixOS bug...
alda shutdown
does not shut down the "starting" processes.
Is there any error or warning message? Can you try it with alda -v3 shutdown
?
It might have to do with NixOS, the way it wraps alda or some weird sound system interaction? See also https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/interpreters/alda/default.nix. If we don't find a lead or a problem with alda itself, I will open an issue there as well just to confirm it is not a NixOS bug...
I don't really understand NixOS. It could very well be the case that something weird is happening involving the way the Alda Nix package is set up. I figure it wouldn't hurt to at least open an issue with Nix, link to this issue and see if they have any ideas.
Is there any error or warning message? Can you try it with alda -v3 shutdown?
$ alda -v3 shutdown
Shutting down player processes.
Mar 21 18:42:26 INF cmd/shutdown.go:68 > Sent "shutdown" message to player process. player={"Expiry":1647885032161,"ID":"naj","Port":40575,"ReadError":null,"State":"starting"}
Mar 21 18:42:26 DBG cmd/root.go:53 > Waiting for background activity to complete. activity="clean up renamed executables"
$ alda ps
id port state expiry type
naj 40575 starting 7 minutes from now player
OK, so that looks like the client successfully sent the player the "shutdown" message, which means the player was listening and received the message. :thinking:
But then the player didn't shut down, and it remained in the starting state?
But then the player didn't shut down, and it remained in the starting state?
Yep:
$ alda ps
id port state expiry type
naj 40575 starting 5 minutes from now player
Might this give a hint activity="clean up renamed executables"
combined with NixOS handling of binary paths (basically symlinks to derivations, which roughly correspond to a system installation state, comparable to a git commit)?
That part is unrelated. It's an implementation detail of the way that alda update
works -- it renames the current executable (old version) of Alda and downloads the new version in its place. Then the next time you run Alda, it deletes the old, renamed executables.
I noticed another thing now: When I let it sit for a while, the timer counts backwards and beyond, for example:
$ alda ps
id port state expiry type
naj 40575 starting 50 seconds ago player
when a new shutdown is sent afterwards via alda -v3 shutdown
(the command output is the same) then the timer is reset to 8 minutes:
$ alda ps
id port state expiry type
naj 40575 starting 8 minutes from now player
maybe this gives an indication of which code path was hit.
Good find! Every time a player process receives any message, it starts the timer over. This is an inactivity timer designed to prevent player processes from running forever in the background and wasting your CPU.
Once the timer reaches 0, the process should shut itself down and when you run alda ps
again, you shouldn't see it there.
It seems like for some reason, the player process is having trouble getting past the starting
state and is not responding to "shutdown" messages. :thinking:
I tried updating to 2.2.0 and can't reproduce the issue there. I sent a PR to NixOS upstream so this might have been solved in newer alda along the way already.
Weird, but I'm happy that the issue is fixed!
player processes get stuck in "starting" state and no further playback is possible
๐ Bug report ๐
Description
After one or two
alda play -c ...
the processes get stuck in the starting state and no further playback is possible.Steps to Reproduce
Yes, but the number of stuck processes varies from 2-3.
Exact score does not matter, I tried with
alda -v2 play -c "piano: c12 e g > c4"
andalda play -c "midi-bass-pop: c6 d12 e6 g12~4"
Expected Behavior
Sound plays consistently after each call to
alda play ...
Actual Behavior
Environment
Operating system and version:
GNU/Linux NixOS 21.11
Alda version:
Health check:
Logs: