mafredri / zsh-async

Because your terminal should be able to perform tasks asynchronously without external tools!
MIT License
756 stars 37 forks source link

async_job not starting job #33

Open Kronuz opened 5 years ago

Kronuz commented 5 years ago

I've been using zsh-async on a theme (based on zprezto's sorin prompt) but I've frequently noticed async stops working.

I only initialize the worker once, during precmd:

  # Initialize async worker. This needs to be done here and not in
  # prompt_kronuz_setup so the git formatting can be overridden by other prompts.
  if (( !${prompt_kronuz_async_init:-0} )); then
    async_start_worker "prompt_kronuz" -n
    async_register_callback "prompt_kronuz" prompt_kronuz_async_callback
    typeset -g prompt_kronuz_async_init=1
  fi

  # Kill the old process of slow commands if it is still running.
  async_flush_jobs "prompt_kronuz"

  # Compute slow commands in the background.
  async_job "prompt_kronuz" prompt_kronuz_async_git "$PWD"

Not knowing why, I added some logging and I'm now seeing async_job is being called, but the job function (prompt_kronuz_async_git) is not... as if the worker had died?

Is there a way I could list active workers? ...or how can I know if worker died? ...and why?

Edit: If I call zpty, I see this: (finished) prompt_kronuz: _async_worker -p 89022 -n... which sure enough shows the async worker as finished. Is there a way to know why, or if it's finished automatically restart it?

mafredri commented 5 years ago

I don't see anything peculiar or wrong about your usage, so it could be a bug or an issue with the zsh zpty module. What version of zsh are you using?

To debug this issue further, it might be good to add some logging, perhaps you could apply the following patch:

diff --git a/async.zsh b/async.zsh
index fb273b1..1906d7e 100644
--- a/async.zsh
+++ b/async.zsh
@@ -68,7 +68,7 @@ _async_worker() {
    # worker. For example: `fork failed: resource temporarily unavailable`.
    # Some older versions of zsh might also print malloc errors (know to happen
    # on at least zsh 5.0.2 and 5.0.8) likely due to kill signals.
-   exec 2>/dev/null
+   exec 2>>/tmp/async_worker-err.log

    # When a zpty is deleted (using -d) all the zpty instances created before
    # the one being deleted receive a SIGHUP, unless we catch it, the async
@@ -157,6 +157,8 @@ _async_worker() {
            # occurs, a failure to read probably indicates that stdin has
            # closed. This is why we propagate the signal to all children and
            # exit manually.
+           print Exit code: $? >>/tmp/async_worker_exit.log
+           typeset -p request &>>/tmp/async_worker_exit.log
            kill -HUP -$$  # Send SIGHUP to all jobs.
            exit 0
        }

And see if the log files produce anything of interest?

You might also want to add a "catch all" to your callback function and log it, as async can sometimes inform of errors that have occurred, like: async.zsh#L320 for example.

I wonder if this could be related to https://github.com/sindresorhus/pure/issues/455. Although, in that issue, restarting the worker does not help in my experience (only a new shell does).

Kronuz commented 5 years ago

I've added those prints, and a few others... but I still don't see the reason why after a while it stops working. In the logs I do see the Exit code: 1 and typeset request='', after adding the suggestion above. Partial log follows, after the Exit message, seems like async_job is requested, but it never gets to the other end (added more logs to print zpty commands):

prompt_kronuz_precmd
zpty -w prompt_kronuz _killjobs
async_job "prompt_kronuz" prompt_kronuz_async_git "/Users/kronuz/repository"
zpty -w prompt_kronuz prompt_kronuz_async_git /Users/kronuz/repository
prompt_kronuz_async_git /Users/kronuz/repository
prompt_kronuz_async_callback prompt_kronuz_async_git 0  %F{8}git%b%u%s%f%k%F{8}:%B%F{231}master%b%u%s%f%k%F{8}:%F{231}origin/master%b%u%s%f%k%F{8}(%b%u%s%f%k%F{94}✗%b%u%s%f%k%F{118}⇡%b%u%s%f%k%F{1}✴%b%u%s%f%k %F{8}⊖%b%u%s%f%k%F{8})%b%u%s%f%k 0.1490490437  0
Exit code: 1
typeset request=''
prompt_kronuz_precmd
async_job "prompt_kronuz" prompt_kronuz_async_git "/Users/kronuz/repository"
zpty -w prompt_kronuz prompt_kronuz_async_git /Users/kronuz/repository
prompt_kronuz_precmd
async_job "prompt_kronuz" prompt_kronuz_async_git "/Users/kronuz/repository"
zpty -w prompt_kronuz prompt_kronuz_async_git /Users/kronuz/repository

Running zpty in the same broken terminal shows:

(finished) prompt_kronuz: _async_worker -p 59805 -n
Kronuz commented 5 years ago

With debug logs enabled, I'm also experiencing async_flush_jobs a lot (when I run certain commands)... I can't tell if this is related, just wanted to point it out as it might.

mafredri commented 4 years ago

I believe this issue is caused by async_flush_jobs and might be related to https://github.com/sindresorhus/pure/issues/455.

My guess is that there are certain cases where the worker can't handle the kill -HUP -$$ (a kill to it's own process group) which leaves it in a broken state triggering an early exit. This should be handled by our TRAPHUP() which returns 0, meaning we handled it. And it does seem to handle it but something else breaks in the process, leading the next read to exit with a non-zero status.

For now, I've created a branch of zsh-async (#37) which can give you an indication when the worker has exited. This indication can be used to restart the worker, and in the future maybe async can restart the worker automatically.

For an example of how this is detected, you could check our the indestructible-pure branch