Closed slandelle closed 5 years ago
This is using sbt 1.3.0-RC2, right? Not sbt 1.2.8?
@dwijnand Hard too tell which milestone as there are lots of different versions for the modules:
[INFO] +- org.scala-sbt:zinc_2.12:jar:1.3.0-M7:compile
[INFO] | +- org.scala-lang:scala-library:jar:2.12.8:compile
[INFO] | +- org.scala-sbt:zinc-core_2.12:jar:1.3.0-M7:compile
[INFO] | | +- org.scala-sbt:zinc-apiinfo_2.12:jar:1.3.0-M7:compile
[INFO] | | | \- org.scala-sbt:compiler-bridge_2.12:jar:1.3.0-M7:compile
[INFO] | | +- org.scala-sbt:zinc-classpath_2.12:jar:1.3.0-M7:compile
[INFO] | | | \- org.scala-lang:scala-compiler:jar:2.12.8:compile
[INFO] | | +- org.scala-sbt:compiler-interface:jar:1.3.0-M7:compile
[INFO] | | | \- org.scala-sbt:util-interface:jar:1.3.0-M6:compile
[INFO] | | +- com.trueaccord.scalapb:scalapb-runtime_2.12:jar:0.6.0:compile
[INFO] | | | +- com.trueaccord.lenses:lenses_2.12:jar:0.4.12:compile
[INFO] | | | \- com.lihaoyi:fastparse_2.12:jar:0.4.2:compile
[INFO] | | | +- com.lihaoyi:fastparse-utils_2.12:jar:0.4.2:compile
[INFO] | | | \- com.lihaoyi:sourcecode_2.12:jar:0.1.3:compile
[INFO] | | +- org.scala-sbt:io_2.12:jar:1.3.0-M10:compile
[INFO] | | | +- com.swoval:file-tree-views:jar:2.1.1:compile
[INFO] | | | +- net.java.dev.jna:jna:jar:4.5.0:compile
[INFO] | | | \- net.java.dev.jna:jna-platform:jar:4.5.0:compile
[INFO] | | +- org.scala-sbt:util-logging_2.12:jar:1.3.0-M6:compile
[INFO] | | | +- com.eed3si9n:sjson-new-core_2.12:jar:0.8.2:compile
[INFO] | | | +- jline:jline:jar:2.14.6:compile
[INFO] | | | +- org.apache.logging.log4j:log4j-api:jar:2.11.2:compile
[INFO] | | | +- org.apache.logging.log4j:log4j-core:jar:2.11.2:compile
[INFO] | | | +- com.lmax:disruptor:jar:3.4.2:compile
[INFO] | | | +- com.eed3si9n:sjson-new-scalajson_2.12:jar:0.8.2:compile
[INFO] | | | | +- com.eed3si9n:shaded-scalajson_2.12:jar:1.0.0-M4:compile
[INFO] | | | | \- org.spire-math:jawn-parser_2.12:jar:0.10.4:compile
[INFO] | | | \- org.scala-lang:scala-reflect:jar:2.12.8:compile
[INFO] | | \- org.scala-sbt:util-relation_2.12:jar:1.3.0-M6:compile
[INFO] | +- org.scala-sbt:zinc-persist_2.12:jar:1.3.0-M7:compile
[INFO] | | \- org.scala-sbt:sbinary_2.12:jar:0.4.4:compile
[INFO] | | \- org.scala-lang.modules:scala-xml_2.12:jar:1.0.5:compile
[INFO] | +- org.scala-sbt:zinc-compile-core_2.12:jar:1.3.0-M7:compile
[INFO] | | +- org.scala-sbt:launcher-interface:jar:1.0.0:compile
[INFO] | | +- org.scala-lang.modules:scala-parser-combinators_2.12:jar:1.1.2:compile
[INFO] | | \- org.scala-sbt:util-control_2.12:jar:1.3.0-M6:compile
[INFO] | +- org.scala-sbt:zinc-classfile_2.12:jar:1.3.0-M7:compile
[INFO] | \- com.google.protobuf:protobuf-java:jar:3.7.0:compile
1.3.0-M6 maybe?
Ok, you're not using sbt at all, you're using Zinc.
@dwijnand Yes, but the crash happens in a dependency that's being pulled transitively from sbt modules.
Yes, you're right. From your zinc usage, to sbt/io, and then swoval.
From https://www.lightbend.com/blog/sbt-1.3.0-release#toc_4:
faster recursive directory listing -- sbt internally uses a native library, swoval, that provides a jni interface to native os apis that allow for faster recursive directory listing than the implementations in the java standard library.
Having a look at the native libs shipped in the file-tree-views jar, it ships binaries for i686 and x86_64. Could it be that the swoval-files0.dll
that was loaded was the one for x86_64 instead of i686?
@slandelle thank you for the report. It does look like the wrong DLL was loaded. I will look into fixing that upstream. In the meantime, there is a workaround (its omission from the release notes was an oversight): you can run whatever program with the -Dswoval.directory.lister=com.swoval.files.NioDirectoryLister
system property and the native code should not be used.
Could you please elaborate why sbt needs such native library in the first place? Listing directories, really?
Please keep your comments respectful. The second sentence was unnecessary and I found it insulting. Every change that we have made to sbt has been made to benefit users. But we are fallible and bugs inevitably arise.
The reason for the native library for directory listing is that it can be much faster than the jvm built ins. I ran this test just now in a project that contains 5000 source files:
scala> def time[R](iterations: Int)(f: => R): Unit = {
| val now = System.nanoTime
| (1 to iterations) foreach (_ => f)
| val elapsed = System.nanoTime - now
| println(s"Took ${elapsed / 1.0e6 / iterations} ms per iteration")
| }
time: [R](iterations: Int)(f: => R)Unit
scala> time(10)(file("").getCanonicalFile.allPaths.get)
Took 64.975042 ms per iteration
scala> time(10)(file("").getCanonicalFile.allPaths.get)
Took 44.3427132 ms per iteration
Then I ran again with sbt -Dswoval.directory.lister=com.swoval.files.NioDirectoryLister
scala> time(10)(file("").getCanonicalFile.allPaths.get)
Took 143.69171690000002 ms per iteration
scala> time(10)(file("").getCanonicalFile.allPaths.get)
Took 126.3960485 ms per iteration
That manifests as sbt 1.3.0 starting compilation 60 to 80ms faster than sbt 1.2.8 in this project. There are many small performance improvements like this in sbt 1.3.0 that cumulatively add up to something more significant.
@eatkins Sorry if I sounded rude, it wasn't my intent. If I did, I apologize.
My intent was to express my surprise: I would expect a native lib binding to be used for compression, encryption, or kernel metrics, but definitively not in a build tool, even less experiencing a JVM crash there (and the JetBrains folks neither as IDEA doesn't report that the compilation process crashed and just says the class could not be found).
Now, I do have some concerns:
Native bugs aren't forgiving at all: JVM crashes, memory or fd leaks. The portability the JVM provides is lost, so library must be implemented and tested for a very wide range of OS and configurations. There's a chance sbt is going to stop working on configuration it worked so far. 32 bits? ARM? Solaris? AIX? Alpine? Prod configurations where tmp dir is mounted with noexec?
From the Github stats, your library doesn't look trivial (+800 commits). Developers come and go. From the commit log, the truck factor seems to be 1. What will happen if you stop working on it? Will someone take over, or will the library die like Sigar?
I'm not sure I get your numbers. How much gain is experienced on a full sbt build? Do you have some numbers from the Scala community builds? If we're talking only 80ms on boot time, is the change worth the risks?
@slandelle I won't question your intention. If you had said, "I am surprised that a native library is used for this," then I would not have been offended. I accept the apology.
I appreciate your concerns and will address them below. I also want to reiterate that you found a bug, not a fundamental design flaw. We expect bugs during the RC cycle. The bug that you found is actually that the 32 bit library provided for windows didn't work. It wasn't that sbt tried to load the wrong dll. It loaded the right dll and it was broken. I made a mistake by including a binary in the swoval library that I hadn't tested. I had tested 32 bit and 64 bit binaries on linux and just assumed that if cross compilation worked in linux, it would also work on windows. That was wrong. I've removed the 32 bit windows binary from swoval which should resolve your issue since swoval falls back to a nio based implementation if no binary is available for the target platform.
Just so this is clear, we need to provide a native library for file system monitoring on osx because the built in jvm watch service doesn't work on most jdks on osx. Like it or not, swoval is implemented using jni, not jna, and this is not a design decision that we can back out of for 1.3.0. For this reason, sbt has to be able to dynamically load the swoval native library on osx. It is optional for windows and linux. Note that the swoval native library has been a part of sbt since 1.1.5 on osx and there haven't been any issues reported with it. I'd estimate that at least 90% of the work I put into the swoval library had to do with file watching, not directory listing. The directory listing optimizations are dead simple but require the use of posix (or windows) apis that aren't directly available in the jvm.
Now, I do have some concerns: Testing Native bugs aren't forgiving at all: JVM crashes, memory or fd leaks. The portability the JVM provides is lost, so library must be implemented and tested for a very wide range of OS and configurations.
You are absolutely right that native failures are problematic and can easily crash the jvm. Resource leaks can happen just as easily in the jvm so I'm less swayed by that concern.
The directory listing code is roughly 30 lines of code (excluding comments and blank lines). There is no way to implement fast directory listing without jni or jna (and jna comes with its own maintenance issues relative to jni).
There's a chance sbt is going to stop working on configuration it worked so far. 32 bits? ARM? Solaris? AIX? Alpine?
Breaking configurations that previously worked is a fact of life in software development but is something that we try to avoid whether or not we are using native libraries. That concern seems separate from the issue at hand.
I agree with you that we shouldn't distribute untested native binaries. We can't test every linux flavor, but the native directory lister exclusively uses posix apis and makes very similar native calls to the jdk.
I'd be surprised if more than 5% of sbt users are using anything other than 64 bit mac, windows or linux. I am perfectly fine with native directory listing not working on those other platforms as long as the fallback mechanism works. I have no evidence that it doesn't. If, say, after release an ARM user reports an issue, then we can fix it in a patch release. FWIW, intellij only supports the three major 64 bit platforms even though it's written primarily in java. That's not to say we shouldn't support other platforms, but just to illustrate how dominant those platforms are.
Prod configurations where tmp dir is mounted with noexec?
Are you assuming that swoval always extracts the native libraries into /tmp? By default, the swoval library does in fact extract the native library to the /tmp/swoval-jni directory, but that directory is configurable via a system property. sbt sets it to ~/.sbt/1.0/swoval-jni. At any rate, the noexec case is effectively the same as there being no library provided for the target platform because System.load throws an UnsatisfiedLinkError either way.
2) Maintenance From the Github stats, your library doesn't look trivial (+800 commits). Developers come and go. From the commit log, the truck factor seems to be 1. What will happen if you stop working on it? Will someone take over, or will the library die like Sigar?
The plan is to in-source swoval after the sbt 1.3.0 release. We would repackage all of its code under sbt.io so that it will no longer be a dependency. It shouldn't take more than a few days, but is lower priority than releasing sbt 1.3.0. For what it's worth, the swoval dependent code in sbt is very limited. I made a point of redefining all of the swoval apis used by sbt in the sbt.io or sbt.internal.io packages. The swoval library is just used by delegation for implementation. With the exception of the mac native file watching code, the swoval implementations could relatively easily be removed in the future if I was no longer contributing to the project.
3) Performance I'm not sure I get your numbers. How much gain is experienced on a full sbt build? Do you have some numbers from the Scala community builds?
The switch to the native directory lister is to benefit interactive sessions in which many tasks are run. In particular, many of the performance optimizations for 1.3.0 were specifically made with the ~
command in mind (e.g. ~test
). The native directory listing will have little benefit on sbt boot time and would not likely be measurable in the community build.
Whenever a user types compile
or a build is triggered by a file event in a ~
command, sbt must gather all of the relevant source files for the task. The benchmarks I posted in the previous comment were trying to demonstrate the performance difference between listing a directory with many files with and without the native directory lister. sbt cannot begin compilation until it has gathered all of the source files, so if I save 40ms there, then every time the command runs, it is 40ms faster. Whether or not this is noticeable depends on how many files are in the build, how sensitive you are and how long everything else takes.
I do not have benchmarks that test all of the specific sbt 1.3.0 performance optimizations in isolation, but I have generated some end to end benchmarks scala-build-watch-performance that incorporate all of the optimizations.
is the change worth the risks?
There are valid arguments in either direction. My opinion though is that, yes, it is. I am not personally convinced that the risks are all that high in spite of your unfortunate encounter with the win32 bug.
You raised a number of good points. Most of them though were a part of my thinking while implementing these changes. Because of my own paranoia, I carefully designed all of this functionality so that it could both be manipulated at runtime (e.g. -Dswoval.directory.lister=com.swoval.files.NioDirectoryLister
) or easily re-implemented if issues arise. As a result of this discussion, I am going to add a few more safeguards to swoval to try and ensure that the native library is only used on explicitly supported platforms.
I realize that in the issue, the OS was 64 bits:
OS: Windows 10.0 , 64 bit Build 17763 (10.0.17763.475)
I interpreted "Client VM" as a 32 bits JVM but I might be wrong (could be just the JVM heuristics, but I can't check as I don't have such set up at hand):
CPU:total 4 (initial active 4) (2 cores per cpu, 2 threads per core) family 6 model 69 stepping 1, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, avx, avx2, aes, clmul, erms, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2
Memory: 4k page, physical 3874620k(367196k free), swap 10428220k(4020732k free)
vm_info: Java HotSpot(TM) Client VM (25.211-b12) for windows-x86 JRE (1.8.0_211-b12), built on Apr 1 2019 20:53:53 by "java_re" with MS VC++ 10.0 (VS2010)
I'd be surprised if more than 5% of sbt users are using anything other than 64 bit mac, windows or linux.
Gatling uses Zinc and reaches far beyond the Scala developers. Some of our users can't afford a modern machine and are stuck with 32bits.
Then, there's also the case of ops who want to stick to 32bits for their CI machines because they don't want large pointers memory overhead, or whatever other reason.
And people who install a 32 bits JVM on a 64 bits host.
At any rate, the noexec case is effectively the same as there being no library provided for the target platform because System.load throws an UnsatisfiedLinkError either way.
Do you means that even binary is available for the target platform, if binary can't be loaded, sbt won't throw but will silently fallback?
The native directory listing will have little benefit on sbt boot time and would not likely be measurable in the community build. Whenever a user types compile or a build is triggered by a file event in a ~ command, sbt must gather all of the relevant source files for the task.
It looks like I should go with -Dswoval.directory.lister=com.swoval.files.NioDirectoryLister
for both scala-maven-plugin and Gatling then, as compilation is only one shot.
Please advertise if the System prop and its value is changed when you internalize and change package.
Thanks for help!
Fixed in https://github.com/sbt/io/pull/247.
@slandelle
Do you means that even binary is available for the target platform, if binary can't be loaded, sbt won't throw but will silently fallback?
There are no more 32 bit binaries bundled in swoval and it will fall back to nio based apis when run on a 32 bit platform. You should hopefully not need to set the system property with sbt 1.3.0-RC3/1.3.0.
@eatkins Thanks a lot! Do you have any ETA?
This coming Monday, hopefully.
Awesome ! We have a release in about one week so we should be able to ship it, thanks!
@slandelle https://github.com/sbt/zinc/releases/tag/v1.3.0-M8 is out.
@eed3si9n Thanks for pinging. I've upgraded Gatling's master and PR is waiting to be merged on scala-maven-plugin.
Side question: Do you have plans for cross building sbt and zinc against Scala 2.13?
steps
Using zinc 1.3.0-M7
problem
expectation
sbt and zinc should work fine on 32 bits JVM.
notes
com.swoval:file-tree-views
library