haskell / haskell-language-server

Official haskell ide support via language server (LSP). Successor of ghcide & haskell-ide-engine.
Apache License 2.0
2.66k stars 355 forks source link

Non-deterministic CI test failures #1430

Closed Ailrun closed 3 years ago

Ailrun commented 3 years ago

Recently, test CI starts to fail because of

Exception: Language server unexpectedly terminated

or

Exception: fd:6: hPutBuf: resource vanished (Broken pipe)

in a non-deterministic manner.

I suspect that this issue has been introduced with lsp-1.0 update (it started to happen since then, if my memory is correct), but haven't found any obvious clues.

jneira commented 3 years ago

It seems they are more frequent in windows, i remember to fix similar errors in lsp-test some time ago, which process handling was assuming a linux system

Ailrun commented 3 years ago

One more error (occurs in Ubuntu):

    test/src/Development/IDE/Test.hs:77:
    Got unexpected diagnostics for Uri {getUri = "file:///tmp/extra-dir-35157274894787/KnownNat.hs"} got List [Diagnostic {_range = Range {_start = Position {_line = 9, _character = 15}, _end = Position {_line = 9, _character = 16}}, _severity = Just DsError, _code = Just (InR "-Wdeferred-out-of-scope-variables"), _source = Just "typecheck", _message = "Variable not in scope: c :: Int", _tags = Nothing, _relatedInformation = Nothing}]
jneira commented 3 years ago

The test is

ghcide
  parsedResultAction plugin: FAIL (3.85s)
    test/src/Development/IDE/Test.hs:78:
    Got unexpected diagnostics for Uri {getUri = "file:///tmp/extra-dir-29972983178883/KnownNat.hs"} got List [Diagnostic {_range = Range {_start = Position {_line = 9, _character = 15}, _end = Position {_line = 9, _character = 16}}, _severity = Just DsError, _code = Just (InR "-Wdeferred-out-of-scope-variables"), _source = Just "typecheck", _message = "Variable not in scope: c :: Int", _tags = Nothing, _relatedInformation = Nothing}]
jneira commented 3 years ago

ci results are highly inestable lately, due to the mentioned false negatives and errors in the bench job :worried:

Ailrun commented 3 years ago

It looks like testSessionWithExtraFiles somehow doesn't use a clean new session. The failed test with KnownNat error does not use that file actually... CC: @wz1000

wz1000 commented 3 years ago

The cabal file for that test lists both KnownNat and RecordDot as part of the component.

library
  build-depends: base, ghc-typelits-knownnat, record-dot-preprocessor,
          record-hasfield
  exposed-modules: KnownNat, RecordDot
  hs-source-dirs: .

Obviously this leads to the non-deterministic failures observed. Someone should split this up into two cabal projects.

pepeiborra commented 3 years ago

What about this one:

    type-definition
      Saturated data con:                                                                                 FAIL
        Exception: fd:71: hPutBuf: resource vanished (Broken pipe)
      Polymorphic variable:                                                                               FAIL
        Exception: fd:107: hPutBuf: resource vanished (Broken pipe)
  simple plugin:                                                                                          ghcide-tests: Maybe.fromJust: Nothing
CallStack (from HasCallStack):
  error, called at libraries/base/Data/Maybe.hs:148:21 in base:Data.Maybe
  fromJust, called at src/Language/LSP/Test/Decoding.hs:87:15 in lsp-test-0.13.0.0-f38a659f3ed5e1ec1cde1ce88a73f023eeed40e148f97626f31f53fa01ea14e5:Language.LSP.Test.Decoding
wz1000 commented 3 years ago

Do you have the logs from that?

Ailrun commented 3 years ago

It's not fully resolved yet.

berberman commented 3 years ago

I observed a new one, which happens randomly in many test cases

https://paste.xinu.at/TtbZ/ https://github.com/haskell/haskell-language-server/runs/2127845970

Received an illegal message between the initialize request and response:
{
    "id": 0,
    "method": "window/workDoneProgress/create",
    "params": {
        "token": "5"
    },
    "jsonrpc": "2.0"
}
pepeiborra commented 3 years ago

After landing #1651 we now see some tests randomly failing with the error below, potentially impacting all tests and OS (although so far I have only observed it in Windows):

Error:  Fatal error in server thread: SQLite3 returned ErrorBusy while attempting to perform step: database is locked
3083
****

https://github.com/haskell/haskell-language-server/pull/1667/checks?check_run_id=2269170992

@wz1000 volunteered to look into this eventually. @wz1000 is there any workaround we can apply in the meantime?

jneira commented 3 years ago

It seems those failures are gone for now, we can open a new one if they strikes back