mafredri / zsh-async

Because your terminal should be able to perform tasks asynchronously without external tools!
MIT License
766 stars 34 forks source link

stderr stays mapped to /dev/null #35

Closed rgouicem closed 4 years ago

rgouicem commented 5 years ago

I'm using zsh-async through alien (https://github.com/eendroroy/alien). alien is a prompt theme that uses zsh-async to update the git part of the prompt asynchronously.

Sometimes, I noticed that stderr was no longer printed in my terminal (with echo foo >&2). After some investigation, I realized that stderr was actually mapped to /dev/null (with ls -l /proc/self/fd).

While searching for what caused this, I found this line: https://github.com/mafredri/zsh-async/blob/95c2b1577f455728ec01cec001a86c216d0af2bd/async.zsh#L71

Commenting it resolved my problem. I don't know if it is a good way to resolve my issue, this actual errors that should be hidden would be printed.

mafredri commented 4 years ago

@rgouicem what version of zsh and operating system did you experience this on? It might be a zsh bug.

The async worker is a clone of your shell but it shouldn't be able to affect its file descriptors. I haven't seen this happen on any of my systems.

rgouicem commented 4 years ago

I'm currently using zsh 5.7.1 on Arch Linux (might have been updated since my post).

I'm also thinking it might be a zsh bug since it happened again a few times (less frequently than before my fix, but still).

I'm gonna try to investigate a little bit more and maybe open an issue for zsh.

fd0 commented 4 years ago

I observe the same issue, zsh 5.7.1 on Debian stable and zsh 5.8 on Arch. I'm using the sorin prompt though, zsh-async is included via the prezto framework. The current commit for zsh-async is 95c2b1577f455728ec01cec001a86c216d0af2bd.

$ ls -al /proc/$$/fd
total 0
dr-x------ 2 fd0 users  0 Mai  2 11:22 .
dr-xr-xr-x 9 fd0 users  0 Mai  2 11:22 ..
lrwx------ 1 fd0 users 64 Mai  2 11:22 0 -> /dev/pts/2
lrwx------ 1 fd0 users 64 Mai  2 11:22 1 -> /dev/pts/2
l-wx------ 1 fd0 users 64 Mai  2 11:22 2 -> /dev/null

Any ideas on how to debug this further? It is very annoying :)

mafredri commented 4 years ago

@fd0 oh yeah, that would be super annoying. I'd like to be able to reproduce this somehow, but so far I haven't been able to. I'm wondering if it can be something specific to your .zshrc that is causing it because this really shouldn't happen. We're only redirecting stderr in one place, and that's happening in a forked process. I.e. https://github.com/mafredri/zsh-async/blob/95c2b1577f455728ec01cec001a86c216d0af2bd/async.zsh#L71

I have two ideas on what we could try though. The first is to modify this section:

https://github.com/mafredri/zsh-async/blob/95c2b1577f455728ec01cec001a86c216d0af2bd/async.zsh#L489-L492

With:

    exec {errfd}>&2
    zpty -b $worker _async_worker -p $$ $args 2>&$errfd || {
        async_stop_worker $worker
        return 1
    }
    exec {errfd}>& -

The idea being that we're creating a temporary fd that hopefully is active when the zpty is created, and thus will not reference back to our main stderr fd.

The second is to just add a random sleep N before the exec 2>/dev/null. The idea then being that there might be a race when the zpty is created, and some statements have time to affect the parent process.

rgouicem commented 4 years ago

Thanks for the fix @mafredri ! I have been trying it for 10 days now, and I don't think I encountered the problem since. I'm not 100% sure because I still have the reflex to run exec 2>$TTY from time to time so I might have done this unconsciously.

mafredri commented 4 years ago

@rgouicem that's great news, thanks for reporting. Which fix were you using btw?

rgouicem commented 4 years ago

The first one, not the sleep N.

mafredri commented 4 years ago

I went ahead and released this now as v1.8.1. Let's hope it works for others too, let me know if any issues pop up!

jamesoff commented 4 years ago

I have NOCLOBBER set and this change seems to cause a warning with it:

async_start_worker:38: can't clobber parameter errfd containing file descriptor 0
mafredri commented 4 years ago

@jamesoff I can't reproduce the issue (with noclobber enabled), could you provide reproduction steps?

jamesoff commented 4 years ago

Definitely doing setopt clobber in my zshrc before anything uses async fixes the issue. It's most notably showing up when my shell starts (I check for active tmux sessions async to remind me they're available to attach to), and that goes away when I change the option. Any hints on how to debug this appreciated, I'm happy to start digging in to my config and the code but I'm not sure where to start. Does this code only execute on the failure of a job (to help me figure out the path which may be leading to it)?

Oddly, it's also causing neovim to segfault as it starts, which I imagine is due to one of my plugins shelling out and being affected in some fashion. Again, setopt clobber fixes that. I'll have a go at building it with debug symbols so I can see where it's failing, but my C debugging skills are worse than my zsh ones ;)

(Edit: not a minimum reproducer, but in case something stands out to you, here's my config, pointing to where I'm using async: https://github.com/jamesoff/zsh/blob/master/.zshrc#L227)

mafredri commented 4 years ago

@jamesoff this kinda fell off my radar, sorry about that. But I believe I might have a fix for your issue in c7f35ec. Would you mind taking a look?

jamesoff commented 4 years ago

I've updated my install and have re-enabled noclobber in my config. So far so good, I don't see the warning on starting my shell now. I can also start neovim without segfault :)