Closed pouar closed 4 years ago
Not sure if reverting it is the proper fix though, but that seems to be where the bug was introduced, at least according to git bisect
Yeah, I'm going to really exhaust all possible alternatives before I'm reverting that one.
Are you sure you're really cleaning up everything between each step of the bisection?
It looks really unrelated to Helm completion interfaces. Though it could very well be, that's for sure. Stranger things have happened...
Also, I'm really quite busy at the moment, so no time to analyse this, even less if Helm is a requirement.
João
On Fri, Jan 10, 2020 at 4:51 PM pouar notifications@github.com wrote:
Not sure if reverting it is the proper fix though, but that seems to be where the bug was introduced, at least according to git bisect
-- João Távora
so far. Also tried it on master with and without the commit reverted and recompiled the elisp files and restarted emacs each time.
Emacs 27/26/master?
On Fri, Jan 10, 2020 at 5:08 PM pouar notifications@github.com wrote:
so far. Also tried it on master with and without the commit reverted and recompiled the elisp files and restarted emacs each time.
-- João Távora
Emacs 27, haven't checked 26
or Emacs master
Just did some experimenting with the issue and changing
(while (sit-for 30))
(setq cancelled t)
to
(let ((inhibit-quit t))
  (while (sit-for 30))
  (setq cancelled t))
Seems to make the problem go away
Good concise description of a possible solution.
I don't have time to analyse right now, i.e. to begin to (re)understand why inhibit-quit in the sit-for should be needed at all, so help me Obi-Wan-@monnier, you're my only hope.
Sorry, but there's some interference in the force right now.
I think I'd need some more detailed explanation of the way the control flows when things "work" and the way it flows when things "don't work".
The only description I see of the problem is "matches sometimes show up and sometimes don't" which is fairly vague, so don't really know where to start.
BTW, while we're moving the chairs on the deck, have you tried instead of sit-for to use accept-process-output (passing a specific process object to it) and wrapping it inside a while-no-input. sit-for has various problems so I generally try to stay away from it.
Stefan
On Mon, Jan 20, 2020 at 6:25 PM monnier notifications@github.com wrote:
Sorry, but there's some interference in the force right now.
I think I'd need some more detailed explanation of the way the control flows when things "work" and the way it flows when things "don't work".
The only description I see of the problem is "matches sometimes show up and sometimes don't" which is fairly vague, so don't really know where to start.
BTW, while we're moving the chairs on the deck, have you tried instead of sit-for to use accept-process-output (passing a specific process object to it) and wrapping it inside a while-no-input. sit-for has various problems so I generally try to stay away from it.
That's funny I have the exact same experience with while-no-input. I think I've been going back and forth between the two alternatives in that function. Maybe not systematically though. Anyway, I'll try again, but not anytime soon.
@pouar if you'd like to test this out, be my guest. But I think giving Stefan or Helm's author a killer, from-scratch reproduction recipe is also a good move. If you do try the change Monnier suggests, make sure to also give it plenty of non-Helm, company-heavy usage to increase confidence in the change.
Thanks, João
You mean like this?
(cancel-on-input
 (while-no-input
   (accept-process-output (sly-connection) 30)
   (setq cancelled t)
   (funcall check-conn)))
I think so, but can you give me context? A diff would be fine.
diff --git a/sly.el b/sly.el
index baf18aa4..d174139a 100644
--- a/sly.el
+++ b/sly.el
@@ -2404,9 +2404,10 @@ wants to input, and return CANCEL-ON-INPUT-RETVAL."
(throw catch-tag
(list #'error "Synchronous Lisp Evaluation aborted")))))
(cond (cancel-on-input
- (while (sit-for 30))
- (setq cancelled t)
- (funcall check-conn))
+ (while-no-input
+ (accept-process-output (sly-connection) 30)
+ (setq cancelled t)
+ (funcall check-conn)))
(t
(while t
(funcall check-conn)
Thanks. I think it has to loop forever in accept-process-output, too.
It has to be (while (not (accept-process-output ... 30 or something to that effect, I think.
Inside the while-no-input, that is.
Like this?
diff --git a/sly.el b/sly.el
index baf18aa4..beb9a6dd 100644
--- a/sly.el
+++ b/sly.el
@@ -2404,9 +2404,10 @@ wants to input, and return CANCEL-ON-INPUT-RETVAL."
(throw catch-tag
(list #'error "Synchronous Lisp Evaluation aborted")))))
(cond (cancel-on-input
- (while (sit-for 30))
- (setq cancelled t)
- (funcall check-conn))
+ (while-no-input
+ (while (not (accept-process-output (sly-connection) 30)
+ (setq cancelled t)
+ (funcall check-conn)))))
(t
(while t
(funcall check-conn)
No, that doesn't make sense (not will only accept one arg). Don't worry, use this:
diff --git a/sly.el b/sly.el
index 0ff8c0e0..304fc7f3 100644
--- a/sly.el
+++ b/sly.el
@@ -2384,9 +2384,10 @@ wants to input, and return CANCEL-ON-INPUT-RETVAL."
(sly-continuation-counter))))
(sly--stack-eval-tags (cons catch-tag sly--stack-eval-tags))
(cancelled nil)
+ (connection (sly-connection))
(check-conn
(lambda ()
- (unless (eq (process-status (sly-connection)) 'open)
+ (unless (eq (process-status connection) 'open)
(error "Lisp connection closed unexpectedly"))))
(retval
(unwind-protect
@@ -2404,7 +2405,8 @@ wants to input, and return CANCEL-ON-INPUT-RETVAL."
(throw catch-tag
(list #'error "Synchronous Lisp Evaluation aborted")))))
(cond (cancel-on-input
- (while (sit-for 30))
+ (while-no-input
+ (while (not (accept-process-output connection 30))))
(setq cancelled t)
(funcall check-conn))
(t
Now give this a good beating with Helm, company, whatever and report back here.
I kinda meant to wrap that argument in a progn or whatever Emacs Lisp has as an equivalent
It has progn, but it doesn't make sense anyway. You want that inner loop to continue forever and ever until there is output from the Lisp process.
catch-tag.
while-no-input will break.
Notice that in 1, you could theoretically do it with a while t. The reason we don't is that I think I tried that before and theory somehow doesn't match practice in Emacs, because C-reasons. So we're basically just hunting in the dark here, following Stefan's heuristic.
your patch seems to be working so far, although I'm not sure what d4e52fe7fd31ed408bc60608416f785949b95133 was supposed to fix as I didn't run into anything, so I'm not sure what I'm looking for. is a "stale continuation" something like an infinite loop?
or did it drop the process on the Emacs side or something?
The only definition of continuation I'm aware of is the one from Scheme
although I'm not sure what d4e52fe was supposed to fix as I didn't run into anything
You probably didn't mean this as a criticism, but if you did, I would accept it. It wasn't fixing anything, it was just a "feel good" change that made SLY behave more like jsonrpc.el in Emacs. Until your Helm troubles I didn't experience any troubles.
A stale continuation is a request in Emacs that never seems to have gotten a reply from the server side, even an error response, including timeouts. It should never happen. An RPC request either succeeds or doesn't, by definition.
As for the nomenclature, continuations are not sophisticated as in Scheme's continuations, but they do point to the same effect: stop code execution and resume it later on seemingly magically. So when you
(sly-eval-async '(common-lisp-function-returning-foo-and-bar) (lambda (results) (cl-destructuring-bind (foo bar) ...)
the lambda is called a continuation.
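To make the terminology concrete, here is a minimal, self-contained sketch of callback-style continuations and of what a "stale" one would look like. It does not use SLY at all: every demo- name is hypothetical, and the "server" is simulated by calling demo-receive by hand, so treat it as an illustration of the idea rather than of SLY's actual bookkeeping.

```elisp
;;; -*- lexical-binding: t -*-
(require 'cl-lib)

(defvar demo-pending (make-hash-table)
  "Hypothetical table of request id -> continuation awaiting a reply.")

(defvar demo-next-id 0
  "Hypothetical counter used to label requests.")

(defun demo-send (_form continuation)
  "Pretend to send _FORM to a server; store CONTINUATION and return its id."
  (let ((id (cl-incf demo-next-id)))
    (puthash id continuation demo-pending)
    id))

(defun demo-receive (id results)
  "Simulate a reply to request ID: resume its continuation with RESULTS."
  (let ((continuation (gethash id demo-pending)))
    (remhash id demo-pending)
    (when continuation
      (funcall continuation results))))

;; One request gets its reply; the other never does and stays in the table.
(defvar demo-answered
  (demo-send '(foo-and-bar)
             (lambda (results)
               (cl-destructuring-bind (foo bar) results
                 (message "foo=%s bar=%s" foo bar)))))
(demo-send '(never-answered) #'ignore)
(demo-receive demo-answered '(1 2))
(message "continuations still pending: %d" (hash-table-count demo-pending))
```

In this toy model, the entry left behind by the request that never got a reply plays the role of the stale continuation described above.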
Emacs-lisp does have something very close to Scheme continuations I think, see generator.el. All of this is beside the issue of course. Please keep testing for a few days if possible with the latest code.
It wasn't really a criticism. I just didn't know what was going on.
It wasn't really a criticism.
I know. But if it had been, it would have been a good one :-)
ok, apparently it's still broken with this patch, but not as bad, as the problem now occurs less often
ok, maybe I didn't narrow it down to that last line, as it still shows up at about the same rate as using inhibit-quit
tbh, I have no idea what's going on
Thanks @pouar for your testing. As soon as I have some free time, I will re-focus on this. I will start by examining the commit sha in the subject of this issue very carefully, and possibly revert it.
@thierryvolpiatto writes in #303 that he can now reproduce this consistently. Thierry, can you cook up the smallest .emacs that demonstrates this bug, for those who don't have Helm installed (but may have a git clone of it somewhere)?
João Távora notifications@github.com writes:
@thierryvolpiatto writes in #303 that he can now reproduce this consistently. Thierry, can you cook up the smallest .emacs that demonstrates this bug, for those who don't have Helm installed (but may have a git clone of it somewhere)?
Sure.
1) Clone Sly.
git clone https://github.com/joaotavora/sly.git
2) Clone and install Async.
git clone https://github.com/jwiegley/emacs-async.git
cd emacs-async
make
3) Clone and install Helm.
git clone https://github.com/emacs-helm/helm.git
cd helm
make
3) Start Emacs
emacs -q
4) Configure helm
(add-to-list 'load-path "/path/to/async")
(add-to-list 'load-path "/path/to/helm")
(require 'helm-config)
(helm-mode 1)
5) Configure Sly
(add-to-list 'load-path "/path/to/sly")
(require 'sly-autoloads)
(setq inferior-lisp-program "/usr/bin/sbcl")
(add-hook 'sly-mode-hook (lambda () (sly-symbol-completion-mode -1)))
6) M-x sly
Enter something at the REPL prompt, e.g. (sly, and hit TAB. Emacs hangs for about 2 minutes and then silently fails to complete.
I used Emacs-27.1 on Linuxmint to reproduce this bug.
Thierry (Edited by @joaotavora)
Thanks very much @thierryvolpiatto for the thorough recipe.
NOTE: If you don't want to "make install" you will have to specify where async and helm are to load-path in 4).
Yes I think I prefer that. I'll edit your recipe, if you don't mind
I've started debugging this. The reproduction recipe that you gave originally had sudo, which I find a bit intrusive for Emacs stuff. I removed the sudo, but it's still not perfect and needs an edit to Helm's Makefile to add the load path for "emacs-async". After that, the command line:
emacs -Q -l <sly>/sly-autoloads.el -L <helm> -L <emacs-async> -l helm-config -f helm-mode -f sly -f sly-symbol-completion-mode
seems to start up an Emacs where the bug can be reproduced.
There's some news here: I can't reproduce this in Emacs 26.3: it seems to work fine there. Something happened starting Emacs 27.1, where the bug is reproducible, but I can apparently recover if I send SIGTERM to the process.
So it seems this has to be debugged at the C level, probably with the help of Eli Zaretskii, the Emacs HEAD maintainer and C specialist.
~I did find a bug in SLY's :exit-function but that is unrelated to the hang, just a bog-standard bug.~ Scratch that, there is no bug: I was loading Emacs 27.1 .elc's into Emacs 26.3, which brings some problems.
More progress. Even though this could be debugged at the C level and could be seen as an Emacs bug, I think it's also a Helm problem. Helm uses while-no-input, or rather its own specific version of it. When I remove it, things seem to work OK with SLY. It's worth noting that Helm apparently bypasses this while-no-input when talking to Tramp. Perhaps it should also do so when talking to Sly. Reading its source, it's got so many special cases that I guess another one wouldn't hurt.
I'm starting to lean towards the possibility that the problem is on Helm's side, since SLY works with bare Emacs, fido-mode, company, etc. Sly used to mess with inhibit-quit and quit-flag, and now it doesn't. Maybe Helm should follow suit? Anyway see https://github.com/emacs-helm/helm-sly/issues/2 for the possible beginnings of a patch for Helm.
João Távora notifications@github.com writes:
I'm starting to lean towards the possibility that the problem is on Helm's side, since SLY works with bare Emacs, fido-mode, company, etc.
No, the problem is not in helm, the problem is in sly-eval, just commenting the offending cond clause fixes the bug:
diff --git a/sly.el b/sly.el
index 020005dc..b947f1c6 100644
--- a/sly.el
+++ b/sly.el
@@ -2399,10 +2399,10 @@ wants to input, and return CANCEL-ON-INPUT-RETVAL."
(unless cancelled
(throw catch-tag
(list #'error "Synchronous Lisp Evaluation aborted")))))
- (cond (cancel-on-input
- (while (sit-for 30))
- (setq cancelled t)
- (funcall check-conn))
+ (cond ;; (cancel-on-input
+ ;; (while (sit-for 30))
+ ;; (setq cancelled t)
+ ;; (funcall check-conn))
(t
(while t
(funcall check-conn)
So the bug comes from there, using (while (sit-for 30)) seems really hacky and is probably the cause of the problem.
Sly used to mess with inhibit-quit and quit-flag, and now it doesn't. Maybe Helm should follow suit? Anyway see emacs-helm/helm-sly#2 for the possible beginnings of a patch for Helm.
So no, disabling while-no-input in helm is not a solution.
Thanks for working on this.
-- Thierry
This patch fixes the problem with helm, probably without affecting others (company etc... not tested):
diff --git a/sly.el b/sly.el
index 020005dc..adbcf61a 100644
--- a/sly.el
+++ b/sly.el
@@ -2380,6 +2380,7 @@ wants to input, and return CANCEL-ON-INPUT-RETVAL."
(sly-continuation-counter))))
(sly--stack-eval-tags (cons catch-tag sly--stack-eval-tags))
(cancelled nil)
+ (inhibit-quit t)
(check-conn
(lambda ()
(unless (eq (process-status (sly-connection)) 'open)
@@ -2399,7 +2400,8 @@ wants to input, and return CANCEL-ON-INPUT-RETVAL."
(unless cancelled
(throw catch-tag
(list #'error "Synchronous Lisp Evaluation aborted")))))
- (cond (cancel-on-input
+ (cond ((and cancel-on-input
+ (not (minibufferp (window-buffer))))
(while (sit-for 30))
(setq cancelled t)
(funcall check-conn))
No, the problem is not in helm, the problem is in sly-eval, just commenting the offending cond clause fixes the bug:
And breaks the rest of SLY obviously, so it won't work. Why is sit-for 30 really hacky?
This patch fixes the problem with helm, probably without affecting others (company etc... not tested):
Why should SLY mess with inhibit-quit when it doesn't need to? Why should Helm? My point is: let's both not mess with these Emacs internals unnecessarily. SLY already does not; Helm does that sometimes. Just expand the number of cases where Helm doesn't mess with it. Simple.
(Sorry, I closed by accident).
Can you at least explain to us what is happening? Why does the inhibit-quit fix it?
(company etc... not tested)
Obviously that's not practical
So no, disabling while-no-input in helm is not a solution.
But you already do, I used your macro helm--maybe-while-no-input in that patch I sent you. It seems you've disabled it for TRAMP.
Fiddling with sit-for, inhibit-quit, and while-no-input is like the whack-a-mole game (as well as a fair bit of back and forth over the years as one forgets past attempts and goes through them again), so I think it's important when doing that to try and record the reasons behind those, what was tried, what were the problems, etc...
Ideally, the better way to "record" is via regression tests, but since it's often difficult to make reproducible batch tests of those problems, the second best option are comments.
Why is sit-for 30 really hacky?
If you intend to wait for user input, it's fine, but if you're waiting for a process's response accept-process-output is the less-hacky way.
[ That's not to say that accept-process-output always works better for that, but if it doesn't work well, it's probably a sign that there's a bug in the C code. ]
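For readers following along, the general shape being discussed (wait on one specific process with accept-process-output, wrapped in while-no-input so that user input interrupts the wait) can be sketched as below. This is only an illustration: demo-wait-for-reply-or-input and its DONE-P argument are hypothetical names, not SLY or Helm code, and the thread itself reports that this shape has had problems of its own in practice.

```elisp
;;; -*- lexical-binding: t -*-

(defun demo-wait-for-reply-or-input (process done-p)
  "Block until DONE-P returns non-nil or the user provides input.
PROCESS is the process whose output we are waiting for.  DONE-P is a
function of no arguments that says whether the reply has arrived
\(both are hypothetical parameters for this sketch).
Return `replied' if DONE-P became true, `input' if the wait was
interrupted by user input or a quit.  Signal an error if PROCESS
is no longer live."
  (let ((result
         (while-no-input
           ;; Keep pumping output from PROCESS only, 30 seconds at a
           ;; time, until the reply has been seen.
           (while (not (funcall done-p))
             (unless (process-live-p process)
               (error "Connection closed unexpectedly"))
             (accept-process-output process 30))
           'replied)))
    ;; `while-no-input' returns t on input and nil on quit; anything
    ;; other than our own marker means the wait was interrupted.
    (if (eq result 'replied) 'replied 'input)))
```

A caller would typically have the process filter set some flag when the reply arrives and pass a DONE-P that reads that flag; the 30-second timeout only bounds each individual wait, not the whole loop.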
I think it's important when doing that to try and record the reasons behind those, what was tried
As to what was tried: a lot of stuff.
As to why sit-for is needed there: we need something that will block until the user types or does anything. But, while blocking, we want the network process to do its job.
I decided to not use inhibit-quit and while-no-input and such functions: sit-for has existed for a long time.
If you intend to wait for user input, it's fine, but if you're waiting for a process's response accept-process-output is the less-hacky way.
Right, I do need to wait for user input, so I can CANCEL-ON-INPUT as the function promises to. I tried while-no-input+accept-process-output but it turned out more problematic for other reasons (the whack-a-mole metaphor applies). So I settled on the simplest sit-for.
João Távora notifications@github.com writes:
Can you at least explain to us what is happening?
With (while (sit-for n)) you block the minibuffer, and when helm tries to start it fails at the initial update, with the computation being inside a while-no-input.
Why does the inhibit-quit fix it?
inhibit-quit makes with-local-quit behave differently and prevents quitting while helm is updating its candidates. But I don't understand the Emacs internals well enough to tell you the interaction with sit-for (and read-event).
Note that if you are afraid of using inhibit-quit, using (while (accept-process-output nil 30)) fixes the bug as well (it seems it doesn't block the minibuffer but blocks input). I see you are already using it in the next cond clause, perhaps you can use it in this clause as well? (But perhaps I'm missing something.)
(company etc... not tested)
Obviously that's not practical
What is not practical?
-- Thierry
João Távora notifications@github.com writes:
I think it's important when doing that to try and record the reasons behind those, what was tried
As to what was tried: a lot of stuff.
As to why sit-for is needed there: we need something that will block until the user types or does anything. But, while blocking, we want the network process to do its job.
I decided to not use inhibit-quit and while-no-input and such functions: sit-for has existed for a long time.
If you intend to wait for user input, it's fine, but if you're waiting for a process's response accept-process-output is the less-hacky way.
Right, I do need to wait for user input, so I can CANCEL-ON-INPUT as the function promises to. I tried while-no-input+accept-process-output but it turned out more problematic for other reasons (the whack-a-mole metaphor applies). So I settled on the simplest sit-for.
(while (accept-process-output nil 30)) is working fine with sly-symbol-completion-mode, helm-mode and company-mode and also regular emacs vanilla completion of course.
-- Thierry
Thierry Volpiatto notifications@github.com writes:
(while (accept-process-output nil 30)) is working fine with sly-symbol-completion-mode, helm-mode and company-mode and also regular emacs vanilla completion of course.
It's not, Thierry, it's not working "fine" because it will not return immediately when the user presses a key. And that's, pardon the pun, "key" for responsive behaviour.
Didn't you find it curious that by doing that, the prominently documented CANCEL-ON-INPUT in the function's docstring would be completely useless?
João
In case you aren't aware, flex completion in Sly when using Helm was working again, at least until this commit, I don't remember whether the fix was on Sly's side or Helm's side.
In d4e52fe7fd31ed408bc60608416f785949b95133, matches sometimes show up and sometimes don't. Reverting the commit seems to fix it.
EDIT BY SLY's AUTHOR: This problem has a solution in this comment in terms of a small ad-hoc fix to Helm's code. EDIT2: The problem now has a fix in SLY proper.