Closed: tkchia closed this 1 year ago
The JITter patches are still work in progress, but are already starting to show some real (even if small) improvements in running time. :slightly_smiling_face:
Hello @tkchia,
The JITter changes for AArch64 are interesting. I am trying to learn more about the ARM v8+ ISA myself; may I ask what reference materials you're using for the processor instruction decoding, and whether you've found any particularly nice descriptions of ARM v8? I've found a number of books, but many are older and oriented around 32-bit pre-v8 ARM.
Thank you!
Hello @ghaerr,
I do not know of any particularly "friendly" references on ARMv8, I am afraid. I am also looking for something like ref.x86asm.net for ARM instruction formats, but no luck so far.
The official ARM Architecture Reference Manual (DDI 0487) is available from arm.com, so for now I am using that. The Procedure Call Standard for the ARM 64-bit Architecture (IHI 0055), which describes the official AAPCS64 ABI, also used to be available, though it seems to be gone (paywalled?) now. There is still a Programmer's Guide which summarizes the ABI though. Also I recall that Apple macOS and iOS actually use a slightly different ABI.
Thank you!
On my AArch64 box, https://github.com/jart/blink/pull/145/commits/b3afd4863c0b048ee6c9246412ba5181a3a0236f improves the running time of o//blink/blink third_party/cosmo/2/test_suite_ecp.com by about 5%:
PASSED (130 / 130 tests (73 skipped))
RL: took 14,428,269µs wall time
RL: ballooned to 5,220kb in size
RL: needed 14,145,480µs cpu (0% kernel)
RL: caused 1,256 page faults (99% memcpy)
RL: 172 context switches (11% consensual)
RL: performed 1,016 reads and 8 write i/o operations
versus
PASSED (130 / 130 tests (73 skipped))
RL: took 15,051,129µs wall time
RL: ballooned to 5,200kb in size
RL: needed 14,786,259µs cpu (0% kernel)
RL: caused 1,257 page faults (100% memcpy)
RL: 251 context switches (2% consensual)
RL: performed 0 reads and 8 write i/o operations
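For anyone who wants to check the percentage, here is a minimal sketch that works it out from the two `took` and `needed` lines above. It assumes the first listing is the run with the commit applied and the second is the run without it; the helper names `parse_us` and `percent_less` are made up for this illustration and are not part of blink.

```python
def parse_us(text: str) -> int:
    """Turn a figure such as '14,428,269µs' into an integer microsecond count."""
    return int(text.replace(",", "").replace("µs", ""))


def percent_less(without_us: int, with_us: int) -> float:
    """Relative reduction in running time, as a percentage of the slower run."""
    return 100.0 * (without_us - with_us) / without_us


# Assumption: second listing = without the commit, first listing = with it.
wall = percent_less(parse_us("15,051,129µs"), parse_us("14,428,269µs"))
cpu = percent_less(parse_us("14,786,259µs"), parse_us("14,145,480µs"))

print(f"wall time: {wall:.1f}% less, cpu time: {cpu:.1f}% less")
```

Under that assumption the reduction comes out a little above 4% for both wall time and CPU time, which is in line with the rough "about 5%" figure.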
By the way o//blink/blink third_party/cosmo/2/test_suite_mpi.com took about 10 minutes to run on AArch64; on x86-64 it took about 18 seconds. I guess my AArch64 box is a bit under-powered. :neutral_face:
Thank you!
Hello @ghaerr, hello @jart,
Incidentally, I still find it a bit ... silly that test_suite_mpi.com needs about 10 minutes to run on my AArch64 box. I suspect though that the JITter will need some significant rearchitecting, if there are to be any major speed improvements.
Thank you!
Hello @tkchia,
Are you thinking that perhaps test_suite_mpi.com should be commented out for the time being? That test also failed during the CI run, unrelated to my last PR here, although I never figured out why.
Thank you!
@ghaerr : the test is OK, I think. It is just that it takes a long time to run on my box. Thank you!