propensive / fury-old

A new build tool for JVM languages
Apache License 2.0

first run of Scala 3 compiler sometimes fails #819

Open propensive opened 4 years ago

propensive commented 4 years ago

Output on OSX revealed a stack trace:

java.util.concurrent.ExecutionException: org.eclipse.lsp4j.jsonrpc.ResponseErrorException: expected whitespace or eof got u (line 1, column 2316)
    at org.eclipse.lsp4j.jsonrpc.RemoteEndpoint.handleResponse(RemoteEndpoint.java:209)
    at org.eclipse.lsp4j.jsonrpc.RemoteEndpoint.consume(RemoteEndpoint.java:193)
    at org.eclipse.lsp4j.jsonrpc.json.StreamMessageProducer.handleMessage(StreamMessageProducer.java:192)
    at org.eclipse.lsp4j.jsonrpc.json.StreamMessageProducer.listen(StreamMessageProducer.java:94)
    at org.eclipse.lsp4j.jsonrpc.json.ConcurrentMessageProcessor.run(ConcurrentMessageProcessor.java:99)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
...

This suggests that additional output is sometimes sent after the JSON. Given that parsing reached character 2316, that this was the first parse error, and that the only further input expected from this point onwards is essentially termination, it appears that the JSON messages themselves are sent correctly and that something else (which always seems to begin with a u character) is sent afterwards.
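
To illustrate the kind of failure described above, here is a minimal Python sketch (not Fury, Bloop, or lsp4j code) that separates one complete JSON value from whatever follows it. The function name `split_trailing` and the sample message are made up for illustration; the only real API used is `json.JSONDecoder.raw_decode` from the Python standard library.

```python
import json


def split_trailing(payload: str):
    """Parse the single JSON value at the start of `payload`.

    Returns the parsed value and any text that follows it. A strict parser
    would reject that trailing text with an error similar to the
    "expected whitespace or eof" message in the stack trace.
    """
    decoder = json.JSONDecoder()
    # raw_decode returns the value and the index at which the value ends.
    value, end = decoder.raw_decode(payload)
    return value, payload[end:]


# Hypothetical message followed by stray, non-JSON output.
message = '{"jsonrpc": "2.0", "id": 1, "result": null}'
value, trailing = split_trailing(message + "unexpected output")
print(repr(trailing))
```

Logging the trailing text in this way would show exactly what follows the JSON, which may help identify where the stray output comes from.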

Update: We sometimes see a failure when running the Scala 3 integration test; it happens in about 10-20% of CI runs. CI runs in a clean environment, so when Bloop is asked to compile something for Scala 3, it first has to compile the "compiler interface" for that version of Scala 3. It usually completes this compilation in the background, caches the result, and then uses the compiled compiler interface to communicate with the Scala 3 compiler and compile whatever we actually asked it to compile.

Unfortunately, there seems to be a race condition in Bloop which results in an error. Perhaps the real compilation is triggered too soon after the compiler interface finishes compiling, and the compiler interface's class files are in the wrong place when Bloop attempts to start the requested compilation.

In any case, compiling twice works. We actually used this hack in the integration test, but it was removed because this ought to work. So the Scala 3 test should work every time after the first compilation (whether that first compilation succeeds or fails).

In order to diagnose the problem, we would need to set up a test rig which repeatedly runs the Scala 3 test, then deletes the compiler interface and restarts Bloop, before running it again... about 50 times to get a good idea about whether it's working.
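
A test rig along these lines could be sketched as follows. This is only an outline in Python: `measure_failure_rate`, `run_test` and `reset` are hypothetical names, and the commands that actually run the Scala 3 test, delete the compiler interface and restart Bloop are deliberately left to the caller, since they are not specified here.

```python
from typing import Callable


def measure_failure_rate(
    run_test: Callable[[], bool],
    reset: Callable[[], None],
    runs: int = 50,
) -> float:
    """Run `reset` then `run_test` repeatedly and return the fraction of failures.

    `reset` is a placeholder for "delete the compiler interface and restart
    Bloop"; `run_test` is a placeholder for "run the Scala 3 test" and should
    return True on success.
    """
    failures = 0
    for _ in range(runs):
        reset()  # return to a clean state so every run is a "first run"
        if not run_test():
            failures += 1
    return failures / runs
```

As a rough guide to why about 50 runs is a reasonable number: if the failure rate really is 10%, the chance of 50 consecutive clean runs passing purely by luck is 0.9^50, or roughly 0.5%, so a fully green batch would be fairly strong evidence that a fix works.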

The problem, I believe, lies in Bloop. We might be able to put a workaround into Fury, but it would be great to be able to diagnose the issue as a Bloop problem and have it fixed there. The Bloop configuration file at .bloop/something.json should be enough to test this without involving Fury.

propensive commented 4 years ago

The first thing to try would be to use the latest version of Scala 3.

propensive commented 4 years ago

The latest version of Scala 3 doesn't fix the problem, but the problem does appear to happen only on the first attempt to run the Scala 3 compiler. The test has been updated to attempt the build a second time, if necessary.
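
The actual change to the integration test is not shown in this thread. As a hypothetical sketch of the "try a second time if necessary" idea, in Python rather than the project's own test scripts, with `build` standing in for whatever invokes the compilation:

```python
from typing import Callable, Optional


def build_with_retry(build: Callable[[], bool], attempts: int = 2) -> Optional[int]:
    """Call `build` up to `attempts` times.

    Returns the 1-based number of the attempt that succeeded, or None if
    every attempt failed. `build` is a placeholder that returns True on success.
    """
    for attempt in range(1, attempts + 1):
        if build():
            return attempt
    return None
```

Returning the attempt number rather than a plain boolean keeps a record of how often the first attempt fails, so the workaround does not hide the underlying problem.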

anarmanafov1 commented 4 years ago

@propensive is this issue still up for grabs? I would like to take this on.

propensive commented 4 years ago

@anarmanafov1 Sorry, I only just saw your comment! And the answer is yes!

But I've since discovered a bit more about it: it doesn't just affect Scala 3. I think it affects every version of Scala, and only on the very first run of that version. It is probably not a coincidence that those are the times when a new compiler-interface gets built for that particular version of Scala.

I'm happy to devote some time to helping, if you're still interested?

anarmanafov1 commented 4 years ago

Hey @propensive, I am still interested in helping out. I was setting up the project and encountered an SSL certificate handshake error: curl: (60) SSL certificate problem: certificate has expired

I am assuming you need to perform the update?

odisseus commented 4 years ago

Can you attach the logs? They should be in ~/.cache/fury/logs.

Which site has the certificate problem?

anarmanafov1 commented 4 years ago

@odisseus I don't have a fury directory in my ~/.cache/ but I am able to reproduce that error by running this manually: curl https://gateway.pinata.cloud/ipfs/*************

I got the URL by adding an echo statement to the fury executable at the step where it fails (within installFury()).

odisseus commented 4 years ago

Works on my machine. That's the address of an IPFS gateway server managed by Pinata. I doubt they really have any problems with their SSL certificates, and if they really do, those are probably going to be fixed soon.

Can you access https://gateway.pinata.cloud with your browser?

anarmanafov1 commented 4 years ago

@odisseus strange that it works on your machine. After doing a bit of research I am led to believe that this is an issue on Pinata's side.

I got around this by running curl in insecure mode, adding the -k flag.

anarmanafov1 commented 4 years ago

The installation completes without errors, except for this log output at the end:

/Users/home/.local/share/fury/usr/0.16.1-19-ga1d8749 /Users/home/.local/share/fury/usr/0.16.1-19-ga1d8749/bin/fury
  0.210 Could not find a C compiler (cc) on the PATH
  0.215 Fury will work, but its process name will be java instead of fury and will use the slower Python Nailgun client, instead of a native client
  0.243 Updated /Users/home/.zshrc to include fury on the PATH
  0.246 Could not find the file /Users/home/.bashrc to install fury on the PATH
  0.247 Installing handler for fury:// URLs
  0.276 Installation for fish is not yet supported

After that, when I run the integration tests, most of them fail. I feel like the C compiler error might have something to do with it. @odisseus @propensive have you seen this error before: Could not find a C compiler (cc) on the PATH? I have tried adding /usr/bin/cc and /usr/bin/gcc to my path but I get the same error.

I will dig deeper again some time tomorrow or next week but was wondering if you have seen this before.

anarmanafov1 commented 4 years ago

Upon taking a closer look I was able to locate the source of the error within the doCCompilation block of Install.scala. When I log the actual exception in doCCompilation(env).recover { case e => I find procname.c:1:10: fatal error: 'sys/prctl.h' file not found.

sys/prctl.h does not exist on macOS, only on Linux, as explained here: https://github.com/sysown/proxysql/issues/1920

I will need to get Linux running to proceed.

anarmanafov1 commented 4 years ago

@propensive I was able to successfully install via make install without any errors. I am able to build fury with the fury command. Integration tests are failing, but I can work them out as I go along.

Are you able to provide some more detail regarding how to reproduce the error you described above?

odisseus commented 4 years ago

The C compiler is used during the installation to give the Fury process a nicer name. This step is entirely optional, and skipping it should not affect the integration tests. However, if our C code doesn't compile on Mac OS, that's a bug.

Some integration tests are known to be affected by the environment. For example, some tests fail if your environment locale has a comma instead of point as the decimal separator. When you run the same test a second time without restarting Fury and clearing the caches, some messages about Bloop connections and repository cloning might be missing. Such differences are normal.

I'd say the installation is successful if Fury is able to build itself.

anarmanafov1 commented 4 years ago

@odisseus thanks for the extra context. Upgrading to Scala 2.13.2 fixed most of the integration tests.

Regarding reproducing the bug: I am able to build fury from the fury directory with no errors. I do so by running the fury command. I have Scala 2.13.2 installed.

@propensive would you be able to provide steps to reproduce the error the ticket pertains to?

propensive commented 4 years ago

When you say "upgraded to Scala 2.13.2", what did you actually change? Fury will use its own version of Scala (version 2.12.10, I think), so I think that any change you see is likely to be a consequence of restarting something, or just running the tests a second time...

The issue wasn't clear from the ticket (sorry...), so I've just updated it.

anarmanafov1 commented 4 years ago

@propensive thanks for updating the description, that definitely clears things up.

Could you elaborate a bit on the step "then deletes the compiler interface and restarts Bloop, before running it again"? In particular, where is the compiler interface stored, and how do I restart the Bloop server?

Alternatively, I am thinking of simply Dockerizing the application, building it in a fresh Docker container, and running the Scala 3 test there. The Docker approach is proving to be on the slow side, though.

propensive commented 4 years ago

I'm not 100% sure where the compiled compiler-interface resides after Bloop creates it. The classfiles or a JAR file should appear for the first compilation of every new version of the Scala compiler (including nightlies). Hopefully that will give a hint about where to find it. I think compiler-interface should appear in the name, or maybe compiler-bridge. You might find something in the Bloop source files, if you search long enough.
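
One way to follow that hint is to search a directory tree for files whose names contain either string. The Python sketch below does that; `find_bridge_artifacts` is a made-up helper, and the directory to search is an assumption left to the caller, since the location Bloop uses is not known here.

```python
from pathlib import Path
from typing import Iterable, List


def find_bridge_artifacts(
    root: str,
    needles: Iterable[str] = ("compiler-interface", "compiler-bridge"),
) -> List[Path]:
    """Return every path under `root` whose file or directory name contains a needle.

    The default needles are the two names guessed at above; neither is
    confirmed to be what Bloop actually uses.
    """
    needles = tuple(needles)
    return sorted(
        path
        for path in Path(root).rglob("*")
        if any(needle in path.name for needle in needles)
    )
```

Running this before and after the first compilation with a new Scala version, and comparing the two results, should show which files that compilation created.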

Docker should work, ultimately, and might make a more repeatable test, given that it's environment-dependent. It may exacerbate or mitigate the problem, though, especially if it's a race condition. It would be great if we could make a repeatable Bloop test in the form of a Docker container, but that might be too much work: it may just be easier to read the Bloop source code where it happens and spot the race condition...