haskell / haskell-language-server

Official haskell ide support via language server (LSP). Successor of ghcide & haskell-ide-engine.
Apache License 2.0

The case of the mysterious `segfault` loop #2314

Closed GavinRay97 closed 2 years ago

GavinRay97 commented 2 years ago

I do not know anything about/write Haskell, but I have been trying to make the tooling and experience better for contributors and folks on our team.

Part of our codebase is in Haskell. I wrote a development Dockerfile that reproducibly creates an environment with the deps needed for our Haskell app -- but I am unable to get HLS to function properly in it ☹️

Probably user error, but taking an informal poll shows:

CLICK TO SHOW IMAGE 👇 ![image](https://user-images.githubusercontent.com/26604994/139504577-28289dd5-d3c7-4fe6-bfd4-3789848c9408.png)

What happens is that it builds, and then gets stuck in a segfault loop.

To make this easy to reproduce, I've containerized everything -- you should be able to open it in your browser or use a VS Code Devcontainer locally to get an identical environment to the one that is broken.

Your environment

Output of haskell-language-server --probe-tools or haskell-language-server-wrapper --probe-tools:

@GavinRay97 ➜ /workspaces/graphql-engine (master ✗) $ ~/.ghcup/bin/haskell-language-server-wrapper --probe-tools
haskell-language-server version: 1.4.0.0 (GHC: 8.10.4) (PATH: /home/codespace/.ghcup/bin/haskell-language-server-wrapper-1.4.0) (GIT hash: 253547816ee216c53ee7dacc0ad3cac43e863d30)
Tool versions found on the $PATH
cabal:          3.6.2.0
stack:          Not found
ghc:            8.10.2

Which OS do you use:

Which lsp-client do you use:

Describe your project (alternative: link to the project):


Steps to reproduce

CLICK TO SHOW INSTRUCTIONS 👇

- [Without VS Code or Codespaces at all](#without-vs-code-or-codespaces-at-all)
- [In a browser](#in-a-browser)
- [In VS Code locally/offline](#in-vs-code-locallyoffline)
- [In VS Code locally, but connected to a remote Codespace ("Thin Client")](#in-vs-code-locally-but-connected-to-a-remote-codespace-thin-client)
- [In a text editor like `vim`, `emacs`, etc, connected to remote Codespace ("Thin client")](#in-a-text-editor-like-vim-emacs-etc-connected-to-remote-codespace-thin-client)

#### Without VS Code or Codespaces at all

- Use the Dockerfile at `.devcontainer/Dockerfile`
- Set up a bind mount over `/graphql-engine`, and run the Docker image (either manually or with Compose)

#### In a browser

1. Go here: https://github.com/GavinRay97/graphql-engine
2. Press "New Codespace", as in image below

![image](https://user-images.githubusercontent.com/26604994/139504842-8f22a253-8d4a-46f4-bf71-8bb92864e293.png)

#### In VS Code locally/offline

- Clone the repo from https://github.com/GavinRay97/graphql-engine
- Open the repo in VS Code
- Accept this prompt:
  - ![image](https://user-images.githubusercontent.com/26604994/139280467-7faaf126-4c07-4663-919b-6c890258d9d5.png)

#### In VS Code locally, but connected to a remote Codespace ("Thin Client")

- Install the `Codespaces` extension in VS Code
- On the navigation panel thing, click the Remote Explorer icon (circled in red), then from the dropdown at the top (circled in red) select `Codespaces`, and either press `+` to create a new one or click to connect to an existing one:
  - ![image](https://user-images.githubusercontent.com/26604994/139281133-3a9d6d4e-e567-4ddf-89fb-4ef3b278da0e.png)

#### In a text editor like `vim`, `emacs`, etc, connected to remote Codespace ("Thin client")

- Use the GitHub CLI's `ssh` command to connect to the codespace and then run it through your editor of choice
  - https://github.blog/changelog/2021-10-27-new-codespaces-features-launching-at-universe-2021/
  - ![image](https://user-images.githubusercontent.com/26604994/139281744-7a905afc-3ddd-405b-8bab-01fc6e473df4.png)

Expected behaviour

It doesn't segfault, or instead of segfaulting it prints helpful debug info before dying (I have tried turning verbose logging on, no dice 🙁)

Actual behaviour

It starts, segfaults at random (no pattern), and restarts itself, repeating the loop.

Include debug information

Execute in the root of your project the command haskell-language-server --debug . and paste the logs here:

Debug output: ``` ```

Paste the logs from the lsp-client, e.g. for VS Code

LSP logs: ``` ```

pepeiborra commented 2 years ago

Is HLS running inside the container or locally?

GavinRay97 commented 2 years ago

Hey, thanks for the response =D

HLS is running from inside the container -- everything has been kept in-container to be reproducible. The hope being once it works once, then it works forever 🤞 And everyone wanting a working setup can use that image or Devcontainer or Codespace.

pepeiborra commented 2 years ago

If the environment is reproducible, then I have no idea why HLS would segfault for only ~30% of the users.

GavinRay97 commented 2 years ago

Ahh -- maybe some miscommunication on my end there, sorry.

Traditionally, everyone has set up the project locally.

We don't have very extensive docs on how to do this. There are a lot of implicit apt libs needed, and specific versions, plus specific versions of GHC, etc.

So the onboarding/setup process for contributors and devs can be somewhat painful.

The Dev container I've posted here is an attempt to help that -- nobody is using it yet, though there is interest around it.

I am unable to get HLS working inside of the Dev container.

And I figured that starting from a reproducible container environment would make it much easier to talk about/debug this, since everyone can be on the same page 🙂

If it's possible to get HLS working in this/a container setup, then everyone who wishes to use it will have a 100% HLS success rate 🎉 🥳

GavinRay97 commented 2 years ago

Also feel free to tell me "sorry, you're on your own"/"not my problem", I wouldn't take it personally. I just felt like I had to at least give reaching out a shot, you know? 😅

pepeiborra commented 2 years ago

I don't have time to go through the contributing notes. Can you explain what is preventing HLS from working in the Dev container? Have you tried cabal install haskell-language-server?

GavinRay97 commented 2 years ago

> I don't have time to go through the contributing notes.

No worries, that was more just an attempt to point out the motivation behind the container

> Have you tried cabal install haskell-language-server?

I have haskell-language-server in ~/.ghcup/bin (not sure if this has the same effect? I know NOTHING about Haskell or its ecosystem/tooling):

@GavinRay97 ➜ /workspaces/graphql-engine (master) $ ls ~/.ghcup/bin
cabal          ghci-8.10.2     haddock-8.10.2                        haskell-language-server-8.10.5        haskell-language-server-8.6.4~1.4.0  haskell-language-server-9.0.1          hpc            runghc-8.10
cabal-3.6.2.0  ghc-pkg         haskell-language-server-8.10.2        haskell-language-server-8.10.5~1.4.0  haskell-language-server-8.6.5        haskell-language-server-9.0.1~1.4.0    hpc-8.10       runghc-8.10.2
ghc            ghc-pkg-8.10    haskell-language-server-8.10.2~1.4.0  haskell-language-server-8.10.6        haskell-language-server-8.6.5~1.4.0  haskell-language-server-wrapper        hpc-8.10.2     runhaskell
ghc-8.10       ghc-pkg-8.10.2  haskell-language-server-8.10.3        haskell-language-server-8.10.6~1.4.0  haskell-language-server-8.8.3        haskell-language-server-wrapper-1.4.0  hsc2hs         runhaskell-8.10
ghc-8.10.2     ghcup           haskell-language-server-8.10.3~1.4.0  haskell-language-server-8.10.7        haskell-language-server-8.8.3~1.4.0  hp2ps                                  hsc2hs-8.10    runhaskell-8.10.2
ghci           haddock         haskell-language-server-8.10.4        haskell-language-server-8.10.7~1.4.0  haskell-language-server-8.8.4        hp2ps-8.10                             hsc2hs-8.10.2
ghci-8.10      haddock-8.10    haskell-language-server-8.10.4~1.4.0  haskell-language-server-8.6.4         haskell-language-server-8.8.4~1.4.0  hp2ps-8.10.2                           runghc
@GavinRay97 ➜ /workspaces/graphql-engine (master ✗) $ ~/.ghcup/bin/haskell-language-server-wrapper --probe-tools
haskell-language-server version: 1.4.0.0 (GHC: 8.10.4) (PATH: /home/codespace/.ghcup/bin/haskell-language-server-wrapper-1.4.0) (GIT hash: 253547816ee216c53ee7dacc0ad3cac43e863d30)
Tool versions found on the $PATH
cabal:          3.6.2.0
stack:          Not found
ghc:            8.10.2

> Can you explain what is preventing HLS from working in the Dev container?

Sure:

I have run with --debug and maximum verbosity, there's no apparent pattern or specific file.

When it segfaults, no errors/warnings are printed beforehand. It just terminates. Let me collect some logfiles for both the LSP in VS Code and the HLS binary, will upload here.

Here are comments from teammates, one mentions something about building it from source with some flags fixing it for him:

image

GavinRay97 commented 2 years ago

CLICK TO EXPAND LOGFILE DOWNLOAD LINKS 👇

- Full HLS debug log
  - [haskell-language-server-logfile-debug-enabled.txt](https://github.com/haskell/haskell-language-server/files/7447221/haskell-language-server-logfile-debug-enabled.txt)
- Stdout output while running HLS debug log collector
  - [haskell-language-server-run-output.txt](https://github.com/haskell/haskell-language-server/files/7447222/haskell-language-server-run-output.txt)
- Output from first time running HLS VSCode LSP extension
  - [haskell-vscode-output-sample.txt](https://github.com/haskell/haskell-language-server/files/7447223/haskell-vscode-output-sample.txt)
- Output from HLS VSCode LSP extension with binary set to `haskell-language-server-8.10.2` but it ignores it
  - [haskell-vscode-output-with-manual-hls-binary.txt](https://github.com/haskell/haskell-language-server/files/7447225/haskell-vscode-output-with-manual-hls-binary.txt)

Okay, I have collected a lot of logs, and also noticed some behavior:

// .vscode/settings.json
{
    "haskell.logFile": "/workspaces/graphql-engine/haskell-vscode-logs.txt",
    "haskell.trace.client": "debug",
    "haskell.trace.server": "messages",
    "haskell.serverExecutablePath": "~/.ghcup/bin/haskell-language-server-8.10.2"
}

Here is relevant output from first startup of VS Code HLS in the container:

[client][INFO] Searching for server executables haskell-language-server-wrapper,haskell-language-server in $PATH
[client][INFO] Downloading haskell-language-server
[client][INFO] Fetching the latest release from GitHub or from cache
[client][INFO] The latest release is 1.4.0
[client][INFO] Figure out the ghc version to use or advertise an installation link for missing components
[client][INFO] Working out the project GHC version. This might take a while...
[client][INFO] Executing '/home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-wrapper-1.4.0-linux --project-ghc-version' in cwd '/workspaces/graphql-engine' to get the project or file ghc version
[client][INFO] Execution of '/home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-wrapper-1.4.0-linux --project-ghc-version' terminated with code 0
[client][INFO] The GHC version for the project or file: 8.6.5
[client][INFO] Search for binary haskell-language-server-Linux-8.6.5 in release assests
[client][INFO] Downloading haskell-language-server 1.4.0 for GHC 8.6.5
[client][INFO] Activating the language server in the workspace folder: /workspaces/graphql-engine
[client][INFO] run command: /home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-1.4.0-linux-8.6.5 --lsp -d -l ~/haskell-vscode-logs
[client][INFO] debug command: /home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-1.4.0-linux-8.6.5 --lsp -d -l ~/haskell-vscode-logs
[client][INFO] document selector patten: /workspaces/graphql-engine/**/*
[client][INFO] Starting language server
haskell-language-server version: 1.4.0.0 (GHC: 8.6.5) (PATH: /home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-1.4.0-linux-8.6.5) (GIT hash: 253547816ee216c53ee7dacc0ad3cac43e863d30)
Couldnt open log file ~/haskell-vscode-logs; falling back to stderr loggingStarting (haskell-language-server)LSP server...
  with arguments: GhcideArguments {argsCommand = LSP, argsCwd = Nothing, argsShakeProfiling = Nothing, argsTesting = False, argsExamplePlugin = False, argsDebugOn = True, argsLogFile = Just "~/haskell-vscode-logs", argsThreads = 0, argsProjectGhcVersion = False}
  with plugins: [PluginId "pragmas",PluginId "floskell",PluginId "fourmolu",PluginId "tactics",PluginId "ormolu",PluginId "stylish-haskell",PluginId "retrie",PluginId "brittany",PluginId "callHierarchy",PluginId "class",PluginId "haddockComments",PluginId "eval",PluginId "importLens",PluginId "refineImports",PluginId "moduleName",PluginId "hlint",PluginId "splice",PluginId "ghcide-hover-and-symbols",PluginId "ghcide-code-actions-imports-exports",PluginId "ghcide-code-actions-type-signatures",PluginId "ghcide-code-actions-bindings",PluginId "ghcide-code-actions-fill-holes",PluginId "ghcide-completions",PluginId "ghcide-type-lenses",PluginId "ghcide-core"]
  in directory: /workspaces/graphql-engine
 Starting LSP server...
If you are seeing this in a terminal, you probably should have run WITHOUT the --lsp option!
Started LSP server in 0.00s
setInitialDynFlags cradle: Cradle {cradleRootDir = "/workspaces/graphql-engine", cradleOptsProg = CradleAction: Default}
@GavinRay97 ➜ /workspaces/graphql-engine (master ✗) $ ls /bin | grep ghc
ghc
ghc-8.6.5
ghci
ghci-8.6.5
ghc-pkg
ghc-pkg-8.6.5
haddock-ghc-8.6.5
runghc
runghc-8.6.5

Here is the output of both HLS with --debug, and the LSP startup. (Important bit here seems to be this as the last line, but I'm not sure)

Output to stdout from running haskell-language-server-wrapper --debug --logfile <file> .:

CLICK TO EXPAND 👇

```ps1
@GavinRay97 ➜ /workspaces/graphql-engine (master ✗) $ haskell-language-server-wrapper --debug --logfile ./haskell-language-server-logfile-debug-enabled.txt .
No 'hie.yaml' found. Try to discover the project type!
Run entered for haskell-language-server-wrapper(haskell-language-server-wrapper) Version 1.4.0.0, Git revision 253547816ee216c53ee7dacc0ad3cac43e863d30 (dirty) x86_64 ghc-8.10.4
Current directory: /workspaces/graphql-engine
Operating system: linux
Arguments: ["--debug","--logfile","./haskell-language-server-logfile-debug-enabled.txt","."]
Cradle directory: /workspaces/graphql-engine
Cradle type: Cabal
Tool versions found on the $PATH
cabal: 3.6.2.0
stack: Not found
ghc: 8.10.2
Consulting the cradle to get project GHC version...
Project GHC version: 8.10.2
haskell-language-server exe candidates: ["haskell-language-server-8.10.2","haskell-language-server"]
Launching haskell-language-server exe at:/home/codespace/.ghcup/bin/haskell-language-server-8.10.2
haskell-language-server version: 1.4.0.0 (GHC: 8.10.2) (PATH: /home/codespace/.ghcup/bin/haskell-language-server-8.10.2~1.4.0) (GIT hash: 253547816ee216c53ee7dacc0ad3cac43e863d30)
ghcide setup tester in /workspaces/graphql-engine.
Report bugs at https://github.com/haskell/haskell-language-server/issues
Step 1/4: Finding files to test in /workspaces/graphql-engine
Found 391 files
Step 2/4: Looking for hie.yaml files that control setup
Found 1 cradle ()
Step 3/4: Initializing the IDE
Step 4/4: Type checking the files
Output from setting up the cradle Cradle {cradleRootDir = "/workspaces/graphql-engine", cradleOptsProg = CradleAction: Cabal}
COMMON symbol, size 96 name batch_point_buffer allocated at 0x419ea000
haskell-language-server-wrapper: callProcess: /home/codespace/.ghcup/bin/haskell-language-server-8.10.2 "--debug" "--logfile" "./haskell-language-server-logfile-debug-enabled.txt" "." (exit -11): failed
```
COMMON symbol, size 96 name batch_point_buffer allocated at 0x419ea000
haskell-language-server-wrapper: callProcess: /home/codespace/.ghcup/bin/haskell-language-server-8.10.2 "--debug" "--logfile" "./haskell-language-server-logfile-debug-enabled.txt" "." (exit -11): failed
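
A note on reading the `(exit -11)` in that last line: when a child process is killed by a signal on POSIX, Haskell's `callProcess` reports the negated signal number as the exit code, and signal 11 on Linux is SIGSEGV. So this line is consistent with the segfault, even though nothing else is printed. A minimal Python sketch of that decoding, where `describe_exit` is a made-up helper name used only for illustration:

```python
import signal


def describe_exit(code: int) -> str:
    """Describe an exit status where a negative value means 'killed by signal N'."""
    if code < 0:
        try:
            name = signal.Signals(-code).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown signal"
        return f"killed by signal {-code} ({name})"
    return f"exited with status {code}"


# The wrapper reported "(exit -11)" for the child HLS process.
print(describe_exit(-11))
```

On a Linux machine like the container here, signal 11 should come back as `SIGSEGV`; signal numbering can differ on other platforms.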

Also I get this, which I think again might be related to something about the VSC extension giving preference to /bin instead of the project GHC version? 🤔

image


image

@GavinRay97 ➜ /workspaces/graphql-engine (master ✗) $ cabal --version
cabal-install version 3.6.2.0
compiled using version 3.6.2.0 of the Cabal library

@GavinRay97 ➜ /workspaces/graphql-engine (master ✗) $ file /home/codespace/.ghcup/bin/cabal
/home/codespace/.ghcup/bin/cabal: symbolic link to cabal-3.6.2.0

@GavinRay97 ➜ /workspaces/graphql-engine (master ✗) $ file /home/codespace/.ghcup/bin/cabal-3.6.2.0
/home/codespace/.ghcup/bin/cabal-3.6.2.0: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped

@GavinRay97 ➜ /workspaces/graphql-engine (master ✗) $ whoami
codespace

@GavinRay97 ➜ /workspaces/graphql-engine (master ✗) $ stat /home/codespace/.ghcup/bin/cabal-3.6.2.0
  File: /home/codespace/.ghcup/bin/cabal-3.6.2.0
  Size: 31840440        Blocks: 62192      IO Block: 4096   regular file
Device: 32h/50d Inode: 1310957     Links: 1
Access: (0755/-rwxr-xr-x)  Uid: ( 1000/codespace)   Gid: ( 1000/codespace)
Access: 2021-10-30 15:29:55.000000000 +0000
Modify: 2021-10-30 15:29:55.000000000 +0000
Change: 2021-10-30 15:30:43.062521383 +0000
 Birth: -

setInitialDynFlags cradle: Cradle {cradleRootDir = "/workspaces/graphql-engine", cradleOptsProg = CradleAction: Default}

Couldnt load cradle for libdir: (CradleError {cradleErrorDependencies = [], cradleErrorExitCode = ExitSuccess, cradleErrorStderr = ["Couldn't execute ghc --print-libdir"]},"/workspaces/graphql-engine",Nothing,Cradle {cradleRootDir = "/workspaces/graphql-engine", cradleOptsProg = CradleAction: Default})
@GavinRay97 ➜ /workspaces/graphql-engine (master ✗) $ ghc --print-libdir
/home/codespace/.ghcup/ghc/8.10.2/lib/ghc-8.10.2

Example segfault when running haskell-language-server-wrapper:

File:     /workspaces/graphql-engine/server/src-lib/Hasura/Backends/Postgres/Connection.hs
Hidden:   no
Range:    382:26-382:54
Source:   hlint
Severity: DsInfo
Message:  Redundant bracket Found: (object ["from_env" .= var]) Why not: object ["from_env" .= var]
File:     /workspaces/graphql-engine/server/src-lib/Hasura/Backends/Postgres/Connection.hs
Hidden:   no
Range:    408:61-408:85
Source:   hlint
Severity: DsInfo
Message:  Redundant bracket Found: f <$> (pgcSslPassword pgCerts) Why not: f <$> pgcSslPassword pgCerts
Segmentation fault (core dumped)

jneira commented 2 years ago

Thanks for the detailed bug report. To help myself understand the issue, there are two problems:

About the first one: it seems to me that the environment for the VS Code extension and the environment in the shell are not the same. Maybe they are using different profile files to set the PATH. The CLI usually uses .bashrc, while the graphical environment where VS Code is launched may be using /etc/profile. So I would double-check that, and source .bashrc in /etc/profile if that is the problem. The extension just runs hls-wrapper --project-ghc-version as you did in the CLI, but that execution within the extension seems to return the default system GHC, without taking ghcup into account. That leads me to think ghcup is not in the PATH for the VS Code GUI. The fact that cabal is not being found points the same way.

The problem seems to be the one reported here: https://github.com/haskell/haskell-language-server/issues/236

I am gonna add debug statements about the env vars, especially the PATH, to the VS Code extension, to help trace those kinds of issues.
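
To make the PATH hypothesis above concrete, here is a small Python sketch that resolves the same tool names under two different PATH strings and reports where they differ. It is only an illustration: the function names are made up, and the "editor" PATH in the demo is simulated by dropping `~/.ghcup/bin` from the current PATH, which is an assumption about what the extension sees rather than something taken from its logs.

```python
import os
import shutil

TOOLS = ("ghc", "cabal", "haskell-language-server-wrapper")


def resolve_tools(path, tools=TOOLS):
    """Map each tool name to the first executable found on the given PATH string (or None)."""
    return {tool: shutil.which(tool, path=path) for tool in tools}


def diff_resolution(shell_path, editor_path, tools=TOOLS):
    """Return the tools that resolve to different locations under the two PATH values."""
    in_shell = resolve_tools(shell_path, tools)
    in_editor = resolve_tools(editor_path, tools)
    return {t: (in_shell[t], in_editor[t]) for t in tools if in_shell[t] != in_editor[t]}


if __name__ == "__main__":
    shell_path = os.environ.get("PATH", "")
    # ASSUMPTION for the demo: the editor's PATH is the shell's PATH minus ~/.ghcup/bin.
    ghcup_bin = os.path.expanduser("~/.ghcup/bin")
    editor_path = os.pathsep.join(
        p for p in shell_path.split(os.pathsep) if p != ghcup_bin
    )
    for tool, (shell_hit, editor_hit) in diff_resolution(shell_path, editor_path).items():
        print(f"{tool}: shell -> {shell_hit}, editor -> {editor_hit}")
```

In practice the second PATH would be whatever the extension actually reports once PATH logging is available. A `ghc` that resolves to `/bin/ghc` in one environment and to `~/.ghcup/bin/ghc` in the other would match the 8.6.5 versus 8.10.2 mismatch in the logs.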

About the second one: it is unfortunate that HLS crashes with no further info, and that is something we have to fix. But I would try disabling all plugins, especially hlint, as I see a lot of warnings emitted by that plugin. Other problematic plugins could be eval and tactics. Then, if it works without any plugin enabled, I would re-enable them until you find the offending one.

The full config to disable all plugins is here: https://github.com/haskell/haskell-language-server/issues/2151#issuecomment-911397030
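
As a rough sketch of that disable-then-re-enable procedure, the Python below generates one settings object per round: first with every plugin off, then with exactly one plugin on at a time. The plugin ids are copied from the "with plugins:" log line earlier in this thread, leaving out the `ghcide-*` entries. The key shape `haskell.plugin.<id>.globalOn` is an assumption here and should be checked against the linked comment; enabling one plugin per round rather than cumulatively is also just one way to run the search.

```python
import json

# Plugin ids from the "with plugins:" log line in this thread (ghcide-* entries left out).
PLUGINS = [
    "pragmas", "floskell", "fourmolu", "tactics", "ormolu", "stylish-haskell",
    "retrie", "brittany", "callHierarchy", "class", "haddockComments", "eval",
    "importLens", "refineImports", "moduleName", "hlint", "splice",
]


def plugin_settings(enabled=()):
    """Build settings that switch every listed plugin off except those in `enabled`.

    ASSUMPTION: the settings key has the shape "haskell.plugin.<id>.globalOn".
    """
    unknown = set(enabled) - set(PLUGINS)
    if unknown:
        raise ValueError(f"unknown plugin ids: {sorted(unknown)}")
    return {f"haskell.plugin.{p}.globalOn": p in enabled for p in PLUGINS}


def bisect_rounds():
    """Yield (label, settings) pairs: all plugins off first, then one plugin on per round."""
    yield "all-off", plugin_settings()
    for plugin in PLUGINS:
        yield plugin, plugin_settings(enabled=[plugin])


if __name__ == "__main__":
    # Print the settings for the round that re-enables only hlint.
    print(json.dumps(plugin_settings(enabled=["hlint"]), indent=4))
```

Each round's output would be merged into `.vscode/settings.json`, followed by a server restart and some time spent waiting to see whether the segfault comes back.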

jneira commented 2 years ago

> Approx ~30% of our Haskell devs are unable to get HLS working

I am curious about that: are the problems of those devs, who are not using Docker (I suppose), related to the problems you are getting with Docker? Do they get random crashes as well? That would be a signal that the project itself has some characteristic which triggers the bug.

jneira commented 2 years ago

Also, the use of Template Haskell is usually the cause of segfaults. Have you identified whether HLS crashes when loading modules that use it (or modules that depend on them)?

GavinRay97 commented 2 years ago

Apologies for the delayed response @jneira!

Thanks for the reply -- and you're absolutely right about it being two separate issues. The VS Code ENV/PATH thing makes a lot of sense, since this doesn't happen when running the wrapper binary directly.

> But I would try disabling all plugins, especially hlint, as I see a lot of warnings emitted by that plugin. Other problematic plugins could be eval and tactics. Then, if it works without any plugin enabled, I would re-enable them until you find the offending one.

> The full config to disable all plugins is here: #2151 (comment)

Got it -- I should have thought to try with plugins disabled (I noticed quite a number of them are enabled when the log starts) so that's a good idea. Can go through this systematically, disabling all, and seeing if your suspected extensions cause the crash after some time

> Also, the use of Template Haskell is usually the cause of segfaults. Have you identified whether HLS crashes when loading modules that use it (or modules that depend on them)?

Somewhat embarrassingly, I do not know enough about Haskell to be able to give you a great answer to this. I know more about setting up Haskell build tooling and dev environments than I do the language! 😅

Am pretty sure we DO use Template Haskell, have heard it mentioned before. I could tag some of my colleagues here as well if it would be helpful.

Quick search reveals (at least these):

-- | This module defines all basic Template Haskell functions we use in the rest
-- of this folder, to generate code that deals with all possible known
-- backends.
--
-- Those are all "normal" Haskell functions in the @Q@ monad: they deal with
-- values that represent Haskell code. Those functions are used, in other
-- modules, within Template Haskell splices.

And anecdotally, I believe HLS would segfault more often in areas related to our DB backend + SQL gen stuff. So that would line up with what you're saying.

> I am curious about that: are the problems of those devs, who are not using Docker (I suppose), related to the problems you are getting with Docker? Do they get random crashes as well? That would be a signal that the project itself has some characteristic which triggers the bug.

Yes, none of them use Docker-based environments AFAIK. The majority are on Linux, with some on MacBooks. The segfaults seem to be an issue primarily for the devs on Linux.

image

The distribution of the Haskell engineers OS-wise is something like: (Ref: https://user-images.githubusercontent.com/26604994/139504577-28289dd5-d3c7-4fe6-bfd4-3789848c9408.png)

| OS | Percentage |
|---|---|
| Ubuntu/Debian | 33% |
| MacOS | 22% |
| Arch | 16% |
| NixOS or Other | 28% |

What do you make of this line/what does this "mean"?

image

Not sure if it should impact anything, but we link/use a decent number of C libraries during the build.

Something like:

        libpq-dev libssl-dev postgresql-client-${postgres_ver}
        postgresql-client-common
        unixodbc-dev freetds-dev
        default-libmysqlclient-dev libpcre3-dev libkrb5-dev

jneira commented 2 years ago

> What do you make of this line/what does this "mean"?

It is referring to a workaround for Template Haskell problems, which consists of building a haskell-language-server binary from source instead of using a prebuilt one. The build should pass the -dynamic option to GHC, the Haskell compiler. There are several ways to do it, but you can consult this comment: https://github.com/haskell/haskell-language-server/issues/1431#issuecomment-948478415 The problem is not directly related to the C libraries used.

> Am pretty sure we DO use Template Haskell, have heard it mentioned before. I could tag some of my colleagues here as well if it would be helpful.

Template Haskell is a way to add "macros" to the language, to write code that generates code at compile time. I would bet that it is the direct cause of the segfaults in your environment. So using a custom HLS executable built with -dynamic might help.

o1lo01ol1o commented 2 years ago

The workaround using -dynamic is at least used to address a problem with the Darwin linker and template haskell builds on Catalina. It may be used for other issues as well, but that has been my experience.

(And, fwiw, simply using the latest HLS from nixpkgs-2105 on my Darwin machine fixed my template Haskell crashes. I don’t know what your infra is like, and doubt the suggestion of “use nix” is super helpful, but if there’s a project amenable to Haskell.nix or similar, it might help differential diagnostics. And if it works, I believe you can also build a docker container from that derivation fairly painlessly.)


GavinRay97 commented 2 years ago

> It is referring to a workaround for Template Haskell problems, which consists of building a haskell-language-server binary from source instead of using a prebuilt one. The build should pass the -dynamic option to GHC, the Haskell compiler. There are several ways to do it, but you can consult this comment: #1431 (comment)

Ahh okay, understood -- thank you! I will build a Linux AMD64 binary with that flag following the comments in the issue and see if that makes a difference, in addition to systematically working through enabled plugins.

> Template Haskell is a way to add "macros" to the language, to write code that generates code at compile time. I would bet that it is the direct cause of the segfaults in your environment. So using a custom HLS executable built with -dynamic might help.

Brilliant! Well, that's a great lead to follow. Any ideas from an implementor's point of view why HLS might be struggling with it -- or is that the Ten Million Dollar question we're all asking? 😅


@o1lo01ol1o Would seem like this -dynamic thing is certainly worth a shot then.

> (And, fwiw, simply using the latest HLS from nixpkgs-2105 on my Darwin machine fixed my template Haskell crashes. I don’t know what your infra is like, and doubt the suggestion of “use nix” is super helpful, but if there’s a project amenable to Haskell.nix or similar, it might help differential diagnostics. And if it works, I believe you can also build a docker container from that derivation fairly painlessly.)

I don't know much ABOUT Nix, but I am a fan in theory of Nix/Guix (Guix seems easier syntactically, Nix lang is a bit hard to follow IMO)

But a quick google leads to this:

And it turns out a colleague has also written this, which I found during the same google:

I think many folks on our team already use Nix, so it may be something worth investigating

jneira commented 2 years ago

Could we state that this is related to Template Haskell as well?

jneira commented 2 years ago

I am gonna close this issue as all compiler crashes seem to have the same root cause:

If any of you think the issue should not be included generically, feel free to reopen it (with a brief explanation if possible). Thanks all!