Open remexre opened 3 years ago
Cool!
Would finishing the Copper API work help at all with the ant and copper issues?
Not sure that I understand the issue with autocopy, though. Is this a problem with using reflection in general? If so, then the use of Silver's reflection library for interface file handling would also pose a problem. I'm not sure that we want to jump straight to deprecating autocopy everywhere just yet.
Oh, wow; just tried this on ableC; (after using the native-image agent) it "just works" with no code changes! (this is an unextended compiler)
$ time java -Xss6M -jar ableC.jar testing/tests/melt/positive/1.c
java -Xss6M -jar ableC.jar testing/tests/melt/positive/1.c 6.59s user 0.50s system 187% cpu 3.770 total
$ time ./ableC testing/tests/melt/positive/1.c
./ableC testing/tests/melt/positive/1.c 0.17s user 0.10s system 99% cpu 0.271 total
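To put the two `time` reports side by side, here is a quick sketch of the ratios. The numbers are copied from the output above; the `speedup` helper is only an illustrative name, and a single run of each is not a proper benchmark.

```python
# Figures copied from the two `time` reports above (seconds).
jvm = {"user": 6.59, "system": 0.50, "total": 3.770}
native = {"user": 0.17, "system": 0.10, "total": 0.271}

def speedup(before: float, after: float) -> float:
    """How many times smaller `after` is than `before`."""
    return before / after

# Wall-clock ratio, and ratio of total CPU time (user + system).
wall = speedup(jvm["total"], native["total"])
cpu = speedup(jvm["user"] + jvm["system"], native["user"] + native["system"])

print(f"wall-clock: {wall:.1f}x faster")
print(f"CPU time:   {cpu:.1f}x less")
```

That works out to roughly 14x less wall-clock time and roughly 26x less CPU time for this one input.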
Wow is right... This is very impressive.
On Mon, May 3, 2021 at 1:44 PM Nathan Ringo wrote:
> Oh, wow; just tried this on ableC; (after using the native-image agent) it "just works" with no code changes! (this is an unextended compiler)
> $ time java -Xss6M -jar ableC.jar testing/tests/melt/positive/1.c
> java -Xss6M -jar ableC.jar testing/tests/melt/positive/1.c 6.59s user 0.50s system 187% cpu 3.770 total
> $ time ./ableC testing/tests/melt/positive/1.c
> ./ableC testing/tests/melt/positive/1.c 0.17s user 0.10s system 99% cpu 0.271 total
And this is without removing the autocopy attributes in ableC?
I would be interested to see the performance numbers for an extension that uses object-language syntax (requiring reflection) - maybe ableC-rewriting or ableC-prolog. How hard is it to try this out right now?
> And this is without removing the autocopy attributes in ableC?
Yep; looks like the agent is good enough to resolve all that reflection.
> I would be interested to see the performance numbers for an extension that uses object-language syntax (requiring reflection) - maybe ableC-rewriting or ableC-prolog. How hard is it to try this out right now?
Right now I have this all strung together with my nascent + not-yet-separated-from-my-projects Nix setup for Silver, but if you don't have Nix, the steps are roughly:
- […] config-output-dir to config-merge-dir […]
- […] zip, not jar, because jar will blow away other stuff in META-INF
- -H:Name specifies the output filename
- […] native-image to use gigs of memory and take minutes; 8G and 5min for the unextended ableC
If this sounds like a pain, I can do it; lmk what JAR I should use.
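Since parts of the step list did not survive, here is a hedged sketch of one plausible reading of it, not the exact commands used. It only builds and prints the command lines; nothing is executed. Assumptions: the agent is attached with GraalVM's `-agentlib:native-image-agent` option (which accepts `config-output-dir` and `config-merge-dir`), the recorded config is stored under `META-INF/native-image` inside the JAR, `--no-fallback` is wanted, and the second input file is hypothetical. The function names are illustrative.

```python
import shlex

CONFIG_DIR = "META-INF/native-image"  # where native-image looks for config inside a JAR

def agent_run(jar, program_args, merge=False):
    """JVM run with the tracing agent attached; merge=True adds to an existing config."""
    mode = "config-merge-dir" if merge else "config-output-dir"
    agent = f"-agentlib:native-image-agent={mode}={CONFIG_DIR}"
    return ["java", agent, "-Xss6M", "-jar", jar, *program_args]

def add_config_to_jar(jar):
    """Add the recorded config with zip, which leaves the JAR's other entries alone."""
    return ["zip", "-r", jar, CONFIG_DIR]

def build_native(jar, name):
    """Ahead-of-time build; -H:Name sets the output filename."""
    return ["native-image", "--no-fallback", f"-H:Name={name}", "-jar", jar]

steps = [
    agent_run("ableC.jar", ["testing/tests/melt/positive/1.c"]),
    # Hypothetical second input, merged into the config from the first run.
    agent_run("ableC.jar", ["testing/tests/melt/positive/2.c"], merge=True),
    add_config_to_jar("ableC.jar"),
    build_native("ableC.jar", "ableC"),
]
for argv in steps:
    print(shlex.join(argv))
```

Running the agent over more inputs with `config-merge-dir` should widen the recorded reflection coverage, at the cost of more JVM runs before the (slow, memory-hungry) `native-image` step.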
Because of the last bullet, I think the perf gains here probably change the goal of silvir, not invalidate it -- it's still worth doing, because we should be able to get comparable perf without 8G+5min of figuring out info the compiler had anyway.
I think it does mean that someone should light a fire under my chair wrt the de-ant-ifying (#400); that's the only blocker I know of to doing this to Silver, which I expect would improve the developing-things-other-than-Silver-in-Silver experience a lot.
ableC-prolog didn't work with a few test cases; a proper solution there might be a late-in-compile phase to create a reflect-config.json file (example). I think this would have to be a whole-program analysis, but at least it would be a cheap one, and would only be done for native-image/SubstrateVM builds anyway.
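A minimal sketch of what such a phase might emit, assuming it has already collected the set of classes that get constructed reflectively. The class names below are hypothetical, and which access flags Silver's reflection actually needs is an assumption; `name`, `allDeclaredConstructors`, and `allPublicMethods` are keys from GraalVM's reflect-config.json format.

```python
import json

def reflect_config(class_names):
    """Build reflect-config.json entries, one per distinct class name."""
    return [
        {
            "name": name,
            "allDeclaredConstructors": True,  # allow reflective construction
            "allPublicMethods": True,         # allow reflective method lookup
        }
        for name in sorted(set(class_names))
    ]

# Hypothetical class names; a real pass would gather these from the whole program.
classes = [
    "example.ableC.PaddExpr",
    "example.ableC.PmulExpr",
    "example.ableC.PaddExpr",  # duplicates are collapsed
]

print(json.dumps(reflect_config(classes), indent=2))
```

The output would be written to a reflect-config.json that native-image can pick up, for example from META-INF/native-image inside the JAR.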
Okay, looks like the problem last time I tried this on silver was that I screwed up the scripts; I fixed them now. The results below are on silo, which has approximately the same hardware as foundry.
Building the native binary takes ~7m50s.
A native self-compile takes ~3m38s.
A JVM self-compile takes ~2m26s.
I need to go digging for why the performance is unexpectedly worse... perf didn't work on the binary, even with the -H:+PreserveFramePointer -H:-DeleteLocalSymbols options.
If someone else wants to try too, the patch I'm using is here.
The plot thickens; I rebuilt it with --initialize-at-build-time and got
00:00:00 Congrats, you're using Silver Native!
00:00:00 Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 34 out of bounds for length 34
00:00:00 at silver.compiler.analysis.warnings.Init.setupInheritedAttributes(Init.java:47)
00:00:00 at silver.compiler.analysis.warnings.Init.init(Init.java:23)
00:00:00 at silver.compiler.analysis.warnings.flow.Init.init(Init.java:57)
00:00:00 at silver.compiler.definition.flow.syntax.Init.init(Init.java:39)
00:00:00 at silver.compiler.definition.type.syntax.Init.init(Init.java:34)
00:00:00 at silver.compiler.modification.autocopyattr.Init.init(Init.java:34)
00:00:00 at silver.compiler.definition.flow.driver.Init.init(Init.java:45)
00:00:00 at silver.compiler.driver.util.Init.init(Init.java:41)
00:00:00 at silver.compiler.definition.env.Init.init(Init.java:31)
00:00:00 at silver.compiler.analysis.typechecking.core.Init.init(Init.java:33)
00:00:00 at silver.compiler.definition.flow.env.Init.init(Init.java:38)
00:00:00 at silver.compiler.modification.let_fix.java.Init.init(Init.java:34)
00:00:00 at silver.compiler.composed.Default.Init.init(Init.java:85)
00:00:00 at silver.compiler.composed.Default.Main.main(Main.java:10)
So maybe we're triggering some bugs somewhere...
Increased thickening continues: Chris Seaton pointed out on the GraalVM Slack that SubstrateVM has a worse GC than the standard JVM; passing -H:InitialCollectionPolicy='com.oracle.svm.core.genscavenge.CollectionPolicy$NeverCollect' to disable garbage collection sped up the native build to be on par with the standard JVM (timestamped build log).
Oh, wow, it's definitely the reflection; term_rewriting.jar from https://github.com/melt-umn/lambda-calculus takes ~8s to run on e4.lambda (in the same); it takes ~20s to run after compilation. Prodletons, here we come?
Okay, well, maybe something spookier is going on; the above 20s figure was with garbage collection off, since SubstrateVM's GC is supposed to be slower than the standard JVM's; using the default GC settings lowered runtime to ~12s; still slower, but not so absurdly so.
Spooks confirmed... I again see much worse performance with GC off; weirdly, time is spent in sys? From an strace of each, the mmaps are just more expensive? Need more analysis. gtg rn, results and scripts here
Latest batch of logs, from same scripts on top of #539. Highlights:
peak memory use with:
- jvm-gc-noop: 16 GiB
- jvm-gc-clean: 16 GiB
- native-gc-noop: 860 MiB
- native-gc-clean: 7.6 GiB
- native-epsilon-noop: 9.6 GiB
- native-epsilon-clean: 51 GiB
total time:
- jvm-gc-noop: 21s
- jvm-gc-clean: 4m10s
- native-gc-noop: 38s
- native-gc-clean: 4m31s
- native-epsilon-noop: 24s
- native-epsilon-clean: 2m34s
Will try to investigate some more tomorrow, but I suspect this means we're GC-limited? These results are "less clean" than the previous ones; they're on foundry while Jenkins was running, so they probably got some interference; would've tested on silo, but it's got unrelated Silver changes on its local checkout... will try reproing them tomorrow, since e.g. jvm-gc-clean took like 15% longer than it did last time.
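To make the trade-off easier to read, here is a small sketch that expresses each native configuration relative to the JVM run of the same workload. The figures are copied from the lists above; splitting the labels into (configuration, workload) pairs and the `relative_to_jvm` helper are my own framing, and given the Jenkins interference the ratios are only rough.

```python
GIB = 1024  # MiB per GiB

# (peak memory in MiB, total time in seconds), copied from the logs above.
runs = {
    ("jvm-gc", "noop"): (16 * GIB, 21),
    ("jvm-gc", "clean"): (16 * GIB, 4 * 60 + 10),
    ("native-gc", "noop"): (860, 38),
    ("native-gc", "clean"): (7.6 * GIB, 4 * 60 + 31),
    ("native-epsilon", "noop"): (9.6 * GIB, 24),
    ("native-epsilon", "clean"): (51 * GIB, 2 * 60 + 34),
}

def relative_to_jvm(config, workload):
    """Return (memory ratio, time ratio) against the JVM run of the same workload."""
    mem, secs = runs[(config, workload)]
    base_mem, base_secs = runs[("jvm-gc", workload)]
    return mem / base_mem, secs / base_secs

for workload in ("noop", "clean"):
    for config in ("native-gc", "native-epsilon"):
        mem, secs = relative_to_jvm(config, workload)
        print(f"{config}-{workload}: {mem:.2f}x memory, {secs:.2f}x time")
```

On these numbers, native-epsilon-clean takes about 0.62x the JVM's time but about 3.2x its peak memory, while native-gc-noop takes about 1.8x the time at roughly a twentieth of the memory.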
Oh, just realized the gains are being "muffled" by the javac run time; the times up until printing Buildfile: /home/nathan/melt/silver/build.xml are:
ugh, if the epsilon memory use weren't so terrible... maybe if silvir does its own native codegen, we could try out something clever with reference counting on the page level, and having a second thread concurrently collect dead pages; that ought to provide epsilon-like performance while lowering peak memory use dramatically? dunno, would need experiments.
At this point, the next task is probably running some profiler that supports both the Hotspot VM and SubstrateVM, and comparing the traces to see what's more expensive on native-epsilon-noop vs jvm-gc-noop; most Silver builds are closer to noop (i.e. all svi files) than clean, so a regression here is pretty tragic...
Looks like as of GraalVM 21.0.0, native-image can build Silver and Silver programs with JVM fallback; I don't think this is actually useful yet, but it's progress so I figure I'll open a tracking issue. (Note that this issue is wholly unrelated to SilvIR-on-Truffle or anything, and is just "can we do initialization and JIT warmup at build time.")
native-image'd Silver (with JVM fallback) currently isn't usable; it exits with 0 when it should call out to Ant.
On a small program that does not use autocopy attributes (after patching silver:langutil:pp to remove its use of them), native-image with fallback cuts about 7% of execution time off (NB: didn't do proper stats, just eyeballed 5 runs after a 5-run warmup of each).
On the same program, --no-fallback fails due to Copper's use of java.io.ObjectInputStream; it appears a config file is needed to statically list the classes this will be used with. Providing https://p.remexre.xyz/Xj6WvLaiMAA= (as generated by the GraalVM native-image agent) via a bit of zipfile surgery, I get an amazing speedup from ~1.56sec to ~0.02sec!
Unfortunately, said config files don't work for Silver, which still dies early on when doing stuff with autocopy:
I'll try replacing every autocopy attribute in Silver with inherited + propagate after today's meeting, I guess?