haskell / haskell-language-server

Official haskell ide support via language server (LSP). Successor of ghcide & haskell-ide-engine.
Apache License 2.0
2.66k stars 355 forks source link

More flaky tests (mainly in windows) #2153

Open jneira opened 3 years ago

jneira commented 3 years ago
[ 7 of 61] Compiling Development.IDE.Session.VersionCheck ( session-loader\Development\IDE\Session\VersionCheck.hs, D:\a\haskell-language-server\haskell-language-server\dist-newstyle\build\x86_64-windows\ghc-8.10.2\ghcide-1.4.1.0\build\Development\IDE\Session\VersionCheck.o )

Access violation in generated code when reading 0x31d0

 Attempting to reconstruct a stack trace...

   Frame    Code address
 * 0x48fda90    0x399be6d C:\ProgramData\chocolatey\lib\ghc.8.10.2.2\tools\ghc-8.10.2\bin\ghc.exe+0x359be6d
 * 0x48fda98    0x33c92ca C:\ProgramData\chocolatey\lib\ghc.8.10.2.2\tools\ghc-8.10.2\bin\ghc.exe+0x2fc92ca
 * 0x48fdaa0    0x16
 * 0x48fdaa8    0x8a507f9
 * 0x48fdab0    0x7ff81edfb7e0 C:\Windows\System32\KERNEL32.DLL+0x1b7e0
 * 0x48fdab8    0x90c
C:\Users\RUNNER~1\AppData\Local\Temp\ghcAD66.o.tmp: renameFile:renamePath:MoveFileEx "\\\\?\\C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\ghcAD66.o.tmp" Just "\\\\?\\C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\ghcAD66.o": does not exist (The system cannot find the file specified.)
Preprocessing library for hls-graph-1.4.0.0..
Building library for hls-graph-1.4.0.0..
cabal.exe: Failed to build hie-compat-0.2.1.0 (which is required by test:tests
from hls-eval-plugin-1.1.2.0). The failure occurred during the configure step.
jneira commented 3 years ago
* for windows and ghc-8.10.2.2 (i am gonna disable this one soon)

disabled with #2154

jneira commented 3 years ago

https://github.com/haskell/haskell-language-server/runs/3505220254?check_suite_focus=true

  cradle
    dependencies
      session-deps-are-picked-up:                                                                         FAIL
        Exception: Timed out waiting to receive a message from the server.
        Last message received:
        {
            "method": "$/progress",
            "params": {
                "value": {
                    "kind": "end"
                },
                "token": "14"
            },
            "jsonrpc": "2.0"
        }
        Use -p '/session-deps-are-picked-up/' to rerun this test only.
jneira commented 2 years ago

Lets see if the amazing work fixing flaky tests in #2243 fix some of those ones

jneira commented 2 years ago

This one is in ubuntu and hls-eva-plugin:

 FAIL (1.16s)
    Test output was different from 'test/testdata/THaddock.expected.hs'. Output of ["git","-c","core.fileMode=false","diff","--no-index","--text","--exit-code","test/testdata/THaddock.expected.hs","/tmp/THaddock.expected82807-25.actual"]:
    diff --git a/test/testdata/THaddock.expected.hs b/tmp/THaddock.expected82807-25.actual
    index 222bedf..b879e41 100644
    --- a/test/testdata/THaddock.expected.hs
    +++ b/tmp/THaddock.expected82807-25.actual
    @@ -22,7 +22,17 @@ double :: Num a => a -> a
     double a = a + a
     -- ^ Single line backward comments
     -- >>> double 11
    --- 22
    +-- <BLANKLINE>
    +-- GHC.ByteCode.Linker.lookupCE
    +-- During interactive linking, GHCi couldn't find the following symbol:
    +--   interactive_Ghci1_evalPrint_closure
    +-- This may be due to you not asking GHCi to load extra object files,
    +-- archives or DLLs needed by your current session.  Restart GHCi, specifying
    +-- the missing library using the -L/path/to/object/dir and -lmissinglibname
    +-- flags, or simply by naming the relevant files on the GHCi command line.
    +-- Alternatively, this link failure might indicate a bug in GHCi.
    +-- If you suspect the latter, please report this as a GHC bug:
    +--   https://www.haskell.org/ghc/reportabug

     twice :: [a] -> [a]
     twice a <truncated>
    Use --accept or increase --size-cutoff to see full output.
    Use -p '/Evaluate expressions in Haddock comments in both single line and multi line format/' to rerun this test only.
jneira commented 2 years ago
 Failed to build lsp-1.2.0.1. The failure occurred during the configure step.
Build log (
C:\Users\runneradmin\AppData\Roaming\cabal\logs\ghc-9.0.1\lsp-1.2.0.1-381c3d12187ac032fd2f3e5c5e55a4b28ee306b6.log
cabal.exe: Failed to build lsp-1.2.0.1 (which is required by test:wrapper-test
):
from hls-1.4.0.0 and test:func-test from hls-1.4.0.0). See the build log above
Configuring library for lsp-1.2.0.1..
for details.
C:\Users\RUNNER~1\AppData\Local\Temp\ghcAAE5.o.tmp: renameFile:renamePath:MoveFileEx "\\\\?\\C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\ghcAAE5.o.tmp" Just "\\\\?\\C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\ghcAAE5.o": does not exist (The system cannot find the file specified.)
jneira commented 2 years ago

From #2381

eddiemundo commented 2 years ago

Summary of the "lower-case drive" test on windows ghc 9.0.1 from reading test and logs:

So in the end we want to recompile A/A.hs but it doesn't actually exist on disk. Don't know why this happens only on windows, and why it only sometimes happens, but I guess the difference is ghcide/GC happening non-deterministically.

Maybe writing contentsA to A/A.hs in the filesystem in the test would solve this problem, but I don't know if that's necessary or right.

pepeiborra commented 2 years ago

Thanks for the analysis!

After closing A.hs the set of files of interest is [], therefore the next build should not be doing much.

But there can be still targets in the build queue from a previous build, which require A.hs. In that case, the GC is happening prematurely and should be delayed until the build queue is empty. I'll add some logging to confirm.

pepeiborra commented 2 years ago

I just found that the "lower-case drive" test can fail with 0 keys GC'ed, which suggests that the problem is not GC related.

Instead, I think the problem might be the "InitialLoad" build target that type checks all the project files at startup. The event sequence is as follows:

  1. A.hs is opened and added to the KnownTargets set
  2. A new component is created for A.hs and an "InitialLoad" build is fired
  3. A.hs is type checked as part of the "InitialLoad"
  4. A.hs is closed

The ordering of 3 and 4 is non deterministic. If 4 happens before 3, then the test will crash since A.hs does not exist on disk.

We should probably disable this option (checkStartup) in the whole testsuite

jneira commented 2 years ago

hope other tests don't rely in the initial load to work... could not we make 3 and 4 deterministic?, close the file should not interrupt whatever the server is doing with the file?

pepeiborra commented 2 years ago

If there are tests that rely on the initial load, we should stop them from doing so.

Do you have any suggestions on how to make 3 and 4 deterministic? I don't see how.

jneira commented 2 years ago

Do you have any suggestions on how to make 3 and 4 deterministic? I don't see how.

Are we not already cancelling processing using async exceptions? could we not cancell the typechecking when a close file event is send by the client or a file watcher? Or the client close effectively the file without waiting to any clean up in the server?

sorry if i am missing something obvious

pepeiborra commented 2 years ago

Do you have any suggestions on how to make 3 and 4 deterministic? I don't see how.

Are we not already cancelling processing using async exceptions? could we not cancell the typechecking when a close file event is send by the client or a file watcher? Or the client close effectively the file without waiting to any clean up in the server?

sorry if i am missing something obvious

It's a bit more subtle than that. Action computations are cancelled when the build is restarted yes, but then they are re-enqueued to run on the next build. This is by design and it means that there is no simple way to cancel the initial load. It would not be possible to filter the build queue when a file gets closed because the actions in the queue are not indexed by a NormalizedFilePath.

In any case the contract for checkProject is to check all the files on startup, regardless of whether a file is open or closed in the editor. So I'm not convinced that cancellation is the right approach here.

jneira commented 2 years ago

We have to analyze current behaviour, it seems tests are not so flaky nowadays

jneira commented 2 years ago

closing as those flaky tests are not present or they are not throwed repeteadly

soulomoon commented 4 months ago

Still seeing them