Closed CrossR closed 4 years ago
Strange! I just tested on Catalina (the latest release, and the downloaded from master
), and am not reproducing it.
It's interesting that it reproduces for both the released builds and dev builds. I wonder if there is a bug with node 13 and we happen to be jumping to the system one instead of the vendored version.
Some things that might help track it down:
ONI2_DEBUG=1
)--checkhealth
I'll also add some CLI commands to:
--no-syntax-highlighting
)--no-extensions
)--no-configuration
)so that we can at least narrow down whether it is syntax highlighting vs extension host, or possibly configuration.
I actually only get this behaviour in my dev build, though someone on discord had mentioned syntax was dead on master in the release build. Weirdly for me, the dev build doesn't work and the release build does.
Check health is as follows:
┌─[16:49:40 MacBook (master $) ~/git/oni2]
└─╼ esy run run -f --checkhealth
[DEBUG] +2ms Global : Setup: Looking for setupJson at: /Users/ryanc/git/oni2/_esy/default/store/i/oni2-5ca89004/bin/setup.json
[INFO] +0ms Global : RUNNING CHECK: Verify camomile:datadir
[INFO] +0ms Global : -- RESULT: PASS
[INFO] +0ms Global : RUNNING CHECK: Verify camomile:localedir
[INFO] +0ms Global : -- RESULT: PASS
[INFO] +0ms Global : RUNNING CHECK: Verify camomile:charmapdir
[INFO] +0ms Global : -- RESULT: PASS
[INFO] +0ms Global : RUNNING CHECK: Verify camomile:unimapdir
[INFO] +0ms Global : -- RESULT: PASS
[INFO] +0ms Global : RUNNING CHECK: Verify node executable
[INFO] +0ms Global : -- RESULT: PASS
[INFO] +0ms Global : RUNNING CHECK: Verify node executable can execute simple script
[INFO] +111ms Global : -- RESULT: PASS
[INFO] +0ms Global : RUNNING CHECK: Verify node dependencies
[INFO] +131ms Global : -- RESULT: PASS
[INFO] +0ms Global : RUNNING CHECK: Verify ripgrep (rg) executable
[INFO] +0ms Global : -- RESULT: PASS
[INFO] +0ms Global : RUNNING CHECK: Verify bundled extensions path exists
[INFO] +0ms Global : -- RESULT: PASS
[INFO] +0ms Global : RUNNING CHECK: Verify bundled font exists
[INFO] +0ms Global : -- RESULT: PASS
[INFO] +0ms Global : RUNNING CHECK: Verify bundled reason-language-server executable
[INFO] +24ms Global : -- RESULT: PASS
[INFO] +0ms Global : RUNNING CHECK: Verify bundled syntax server
[INFO] +0ms Oni2.SyntaxClient : Starting executable: /Users/ryanc/git/oni2/_esy/default/store/i/oni2-5ca89004/bin/Oni2_editor
[INFO] +1ms Oni2.SyntaxClient : started syntax client
[INFO] +61ms Oni2.SyntaxServer : Starting up server. Parent PID is: 32256
[INFO] +0ms Oni2.SyntaxServer : Waiting for incoming message...
[INFO] +0ms Oni2.SyntaxServer : Initialized!
[INFO] +0ms Oni2.SyntaxServer : No pending work.
[INFO] +0ms Oni2.SyntaxClient : Tokens applied
[INFO] +0ms Oni2.SyntaxServer : Token updates sent.
[INFO] +0ms Oni2.SyntaxServer : Waiting for incoming message...
[INFO] +185ms Oni2.SyntaxServer : Closing
[ERROR] +2ms Oni2.SyntaxClient : Syntax process closed with exit code: 32261
[INFO] +271ms Global : -- RESULT: PASS
[INFO] +0ms Global :
[INFO] +0ms Global : ** PASSED **
[INFO] +0ms Global :
[INFO] +0ms Global : All systems go.
(Not sure why the error and then the pass?)
Log File: https://gist.github.com/CrossR/1ff0fc350a19119795d870905ddfd488
This is for esy run run
and then just Cmd-P and opening Oni2.re
. I scroll down a few lines and then it crashes.
Both of these were with a rm -rf _esy
and node install-node-deps.js
beforehand.
Thanks for the details, @CrossR ! Health-check looks good, and nothing obvious from the log.
I actually only get this behaviour in my dev build
Do you happen to have any local changes - like any resolutions or other changes?
For the crash:
I opened #1100 to add some CLI options to disable syntax highlighting / extensions as a way to rule out some particular issues.
Do you happen to have any local changes - like any resolutions or other changes?
Nope this is with a clean build of Oni2 with no changes. I could try and rm -rf _esy ~/.esy && esy
as I've currently only removed the local cache dir. I could also try a full re-clone.
I'm getting a beach-ball and then a hang, where it seems to wait indefinately. I'm assuming its the syntax highlighting that isn't happy, since I get no highlights in either Reason or JSON files, which should both have it.
EDIT: Running with --disable-syntax-highlighting
means I don't crash/hang at least.
Interesting... I do see this show up in your log file:
[ERROR] Oni2.SyntaxClient Syntax process closed with exit code: 33240
In this block:
[INFO] Oni2.StoreThread.dispatch (FileExplorer
(FocusNodeLoaded ("/Users/ryanc/git/oni2/src/bin_launcher", <opaque>)))
[INFO] Oni2.StoreThread.dispatch (BufferSetIndentation (2, <opaque>))
[INFO] Oni2.StoreThread.dispatch (SetTitle "Oni2.re - oni2 - Onivim 2")
[DEBUG] Oni2.StoreThread Running effect: Batch:
syntax.bufferUpdate
exthost.bufferUpdate
[DEBUG] Global [PERF] exthost.bufferUpdate took 0.000108957290649s
[INFO] Oni2.SyntaxServer Buffer enter - id: 2 filetype: reason
[INFO] Oni2.SyntaxServer No pending work.
[INFO] Oni2.StoreThread.dispatch (BufferSyntaxHighlights <opaque>)
[INFO] Oni2.SyntaxClient Tokens applied
[INFO] Oni2.SyntaxServer Token updates sent.
[INFO] Oni2.SyntaxServer Waiting for incoming message...
[INFO] Revery UI: BEGIN: Render frame
[DEBUG] Global [33241|bootstrap-fork.js]: %c INFO color: #33f ExtensionService#_doActivateExtension vscode.reason-vscode {"startup":false,"activationEvent":"onLanguage:reason"}
[INFO] Revery UI: END: Render frame
[ERROR] Oni2.SyntaxClient Syntax process closed with exit code: 33240
Something is indeed causing the syntax server process to crash / close. It's right after we send it an update and it sends us back tokens, and is waiting for the next round of work.
Based on the log, it seems to be crashing in between these log messages:
log("Running unit of work...");
map(State.doPendingWork);
log("Unit of work completed.");
since we see "Running unit of work..." but not "Unit of work completed."
One thing that might help - which arguments are you specifying when you run the build, @CrossR ? Which arguments / environment variables do you specify?
Might help me be able to repro it (we should also log these out... should be able to get these from the log)
One thing that might help - which arguments are you specifying when you run the build, @CrossR ? Which arguments / environment variables do you specify?
I build with esy
/esy bootstrap
and run with esy run run -f
and I've got the ONI2_DEBUG
and ONI2_LOG_FILE
vars set.
Thanks @CrossR ! Does it reproduce without the ONI2_DEBUG
and ONI2_LOG_FILE
vars set?
I'm thinking that it might be an issue where the syntax server process is logging out to the same file as the editor process.
I'm still getting this crash on master unfortunately.
The crash consistently happens on line 16/17 (of multiple files), and seems to come after a Protocol.ClientToServer.VisibleRangesChanged
is sent. I then just hang there. I don't necessarily think that notification is the cause, as its sent many times before reaching line 16/17.
Adding additional logs around that (i.e. to confirm it send and to confirm it is picked up) has it fail here:
A "Starting flush" log is printed, but no "Finished flush" after. Again though, this passes many times before it, so not sure what happens after 17 updates. That is, it happens consistently after 16/17 updates so 17 j
or a mix of j/k
s etc.
I tested again with the new flags and it works with either disabling the syntax highlights or the extension host. Could this be due to a conflict between the two running processes? Its also somewhat odd that when I Ctrl-c
to stop the editor when running with -f
the final bits of the log are as follows:
[DEBUG] +0ms Oni2.StoreThread : Running effect: Batch:
syntax.visibilityChange
[INFO] +0ms Oni2.SyntaxClient : Sending visibleRangesChanged notification...
[INFO] +0ms Oni2.SyntaxClient : Starting write...
[INFO] +0ms Oni2.SyntaxClient : Making channel...
[INFO] +0ms Oni2.SyntaxClient : Starting flush...
[DEBUG] +0ms Global : [69561|bootstrap-fork.js]: %cTRACE color: #888 [nullExtensionDescription] INVOKE provider 'DefinitionAdapter'
[DEBUG] +1ms Global : [69561|bootstrap-fork.js]: %cTRACE color: #888 [nullExtensionDescription] INVOKE provider 'DocumentHighlightAdapter'
[DEBUG] +1415ms Global : [69561|bootstrap-fork.js]: %cTRACE color: #888 [nullExtensionDescription] provider DONE after 2157ms
^C
[INFO] +93.65s Oni2.ExtensionClientStoreConnector : ext host closed
Which implies we are in the extension client (I think, I'd normally get the revery can quit log out on a ctrl-c).
https://gist.github.com/CrossR/285feb1093490540a00543b2b52ee661 - Full new log from clean master build and just run with esy run run -f
. Are the bootstrap-fork.js
errors anything to worry about either?
A "Starting flush" log is printed, but no "Finished flush" after.
Good observation! The reason this is hanging, though, is because earlier - the syntax highlighting service crashed earlier:
1703 [ERROR] Oni2.SyntaxClient Syntax process stopped with signal: -10
The last log we see from the syntax server is:
[INFO] Oni2.SyntaxServer Grammar loaded successfully
...so something is crashing it after.
Unfortunately, I don't think we have any more clues on why this process is crashing. I'll open a PR for you to run where maybe we can get some more fine-grained logging to get clues about this in particular.
Again though, this passes many times before it, so not sure what happens after 17 updates. That is, it happens consistently after 16/17 updates so 17 j or a mix of j/ks etc.
This is really interesting... it sounds like maybe some buffer is getting filled up.
I tested again with the new flags and it works with either disabling the syntax highlights or the extension host. Could this be due to a conflict between the two running processes
Also interesting... it could be possible, but I'm not sure what conflict would cause this. Some sort of deadlock?
The way the process exited was a Unix signal - I'm not sure what -10
corresponds to, though! There's a list of signals here: https://caml.inria.fr/pub/docs/manual-ocaml/libref/Sys.html (but it doesn't have the numeric values...). It'd be useful to know which signal is aborting the process. There are also OCaml utilities for handling signals like this
EDIT: Looks like -10
is a sigsegv
- a seg fault. I suspect this is some issue loading the textmate or oniguruma libraries.
A few more points of clarification:
esy create-release
)?git bisect
'ing for us to know the 'last known good state'?One other idea - it would be helpful to see if this reproduces via an integration test.
We could create a new test like 'RegressionTest1097', based on: https://github.com/onivim/oni2/blob/master/integration_test/SyntaxHighlightTextMateTest.re - but do a bunch of key updates prior to the verification of syntax highlights:
input("j");
input("k");
in a loop. In theory... that should reproduce it (and it would be interesting to see if it manifests on CI, or in my machines).
I'll open a PR to start this test - but might need you to see if it reproduces it / modify it to get it reproducible
Does this reproduce for you in the packaged builds?
Nope! The packaged builds do work for me with syntax highlights.
Does this reproduce for you in a build you package yourself (ie, esy create-release)?
It does repro in builds I make myself, implying my build is going wrong somewhere.
Which terminal do you use on OSX?
I use kitty normally, though it reproduces with Terminal as well. Both use bash from brew.
Was there any progress on git bisect'ing for us to know the 'last known good state'?
I did test some bisection all the way back to the async additions, and it repros before that. I'm testing a pretty far back commit now, just to check that its not something very wrong with my machine.
Interesting!
I suspect now that there is something broken / corrupted in reason-textmate
or oniguruma
libraries. I'm going to add health-checks for those libs, and some extra logging in the textmate tokenizer...
I suspect now that there is something broken / corrupted in reason-textmate or oniguruma libraries.
That seems reasonable! A quick test of esy @test run
over in reason-textmate
does fail for me:
┌─[17:39:41 MacBook (master) ~/git/reason-textmate]
└─╼ esy @test run
Running 9 test suites
PASS Trie
PASS TokenTheme
PASS ThemeScopes
RUNS OneDark============ BEGIN ===============
============ END ===============
PASS OneDark
PASS RegExpFactory
PASS Pattern
RUNS Grammar┌─[17:39:45 MacBook (master) ~/git/reason-textmate]
└─╼
Seems to crash over in Grammar
.
EDIT: It does fail on an Oniguarama line, so must be that.
Interesting - how about this lib? https://github.com/onivim/reason-oniguruma
From Discord - @CrossR narrowed it down to an issue with oniguruma
- likely related to this one:
https://github.com/kkos/oniguruma/issues/159
https://discordapp.com/channels/417774914645262338/618478669282934806/660594843059355658
Was fixed with bringing in the new version of oniguruma.
Thanks for the great investigation + fix, @CrossR !
Currently, when you open a file with syntax highlighting (or it may be the language server?) we crash and don't recover.
I've tested
JSON
files andReason
files and in both you get no syntax highlights at all, and then crash moments later. This is on the latest master, macOS Catalina, node 13 and esy 0.6.0.I'm nipping out for a bit, but I can do some bisection to work out what commit caused it since it happens on the released builds and the dev builds so its not a release artifact issue.