Open ExtReMLapin opened 1 year ago
Here is the benchmark. I'm betting the difference comes from the data structure used and the % everywhere.
PS > Measure-Command { .\mingw64\luajit.exe .\mandel_fast.lua}
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 2
Milliseconds      : 206
Ticks             : 22065291
TotalDays         : 2,553853125E-05
TotalHours        : 0,00061292475
TotalMinutes      : 0,036775485
TotalSeconds      : 2,2065291
TotalMilliseconds : 2206,5291
PS > Measure-Command { .\mingw64\luajit.exe .\mandel.lua }
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 5
Milliseconds      : 139
Ticks             : 51396523
TotalDays         : 5,94867164351852E-05
TotalHours        : 0,00142768119444444
TotalMinutes      : 0,0856608716666667
TotalSeconds      : 5,1396523
TotalMilliseconds : 5139,6523
The _fast version is mine. Attachments: mandel.lua.txt, mandel_fast.lua.txt
And with the % removed and this added:
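local data = {}
local ptr = 1
if type(rawget(_G, "jit")) == 'table' then
    -- LuaJIT: use a zero-initialised FFI byte array as the tape
    ffi = require("ffi")
    data = ffi.new("char[32768]")
    jit.opt.start("loopunroll=100")
    ffi_fill = ffi.fill
else
    -- PUC Lua: fall back to a table pre-filled with zeros (0-based, like the FFI array)
    data = {}
    local i = 0
    while i < 32768 do
        data[i] = 0
        i = i + 1
    end
end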
It beats mine, at 1986 ms!
Interesting, considering I wrote this without really caring about performance; my goals were compatibility and primitive debugging support. I tried to reproduce your results on my machine (Ryzen 5950X, Arch Linux with kernel version 6.4.3):
Your mandel_fast.lua:
time luajit mandel_fast.lua
# 1.40s user 0.02s system 99% cpu 1.428 total
My transpiler with default options:
./bf2.lua -i mandel.b -o mandel.lua
time luajit mandel.lua
# 3.62s user 0.02s system 99% cpu 3.641 total
My transpiler with a function for each loop:
./bf2.lua -i mandel.b -o mandel.lua -f
time luajit mandel.lua
# 68.46s user 0.01s system 99% cpu 1:08.52 total
My transpiler with maximum optimizations:
./bf2.lua -i mandel.b -o mandel.lua -O 2
time luajit mandel.lua
# 2.92s user 0.02s system 99% cpu 2.942 total
My transpiler with maximum optimizations and your jit code:
./bf2.lua -i mandel.b -o mandel.lua -O 2
# manually edited mandel.lua
time luajit mandel.lua
# 4.38s user 0.25s system 99% cpu 4.647 total
My transpiler with maximum optimizations and without % max:
./bf2.lua -i mandel.b -o mandel.lua -O 2
# manually edited mandel.lua
time luajit mandel.lua
# 2.07s user 0.01s system 99% cpu 2.082 total
My transpiler with maximum optimizations, your jit code and without % max:
./bf2.lua -i mandel.b -o mandel.lua -O 2
# manually edited mandel.lua
time luajit mandel.lua
# 1.51s user 0.01s system 99% cpu 1.515 total
I am mostly happy with the performance, but I might add an option to disable the % max for programs that don't rely on wrapping cells. Edit: done in ad7a0ccdbf25281571175b44585dad099d20279b.
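To get a rough feel for what the modulo costs in isolation, a micro-benchmark along these lines can be run under both luajit and lua. It is only a sketch: `bench` is a hypothetical helper, the loop is far simpler than generated Brainfuck code, and the timings it prints will not match the mandelbrot numbers above.

```lua
-- Hypothetical micro-benchmark: increment one table cell with and without % 256.
local function bench(wrap, iterations)
  local m = { [0] = 0 }
  local t0 = os.clock()
  if wrap then
    for _ = 1, iterations do m[0] = (m[0] + 1) % 256 end
  else
    for _ = 1, iterations do m[0] = m[0] + 1 end
  end
  -- Return the elapsed CPU time and the final cell value.
  return os.clock() - t0, m[0]
end

local n = 5000000
print(("with %% 256:    %.3f s"):format((bench(true, n))))
print(("without %% 256: %.3f s"):format((bench(false, n))))
```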
The problems with your optimizations (and the reasons why I won't adopt them) are the lack of support for PUC Lua and for programs that rely on wrapping 8-bit cells:
lua mandel_fast.lua
lua: mandel_fast.lua:196: attempt to perform arithmetic on a table value (local 'data')
stack traceback:
mandel_fast.lua:196: in main chunk
[C]: in ?
For example, this "Hello world" program doesn't work with your transpiler:
-[>-<+++++++]>-.[<----->+]<---.+++++++..+++.+[>--<-]>.---[<--->+]<.+[>--<---]>-.+++.------.--------.-[<->+++]<.
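Why a leading `-` on a zero cell matters can be shown with a tiny interpreter. This is a minimal sketch, not code from either transpiler: `run`, its `wrap` flag and the `max_steps` guard are hypothetical, and I/O commands are left out. With wrapping, `-` turns 0 into 255 and the loop terminates; without it the cell goes negative and never reaches zero again.

```lua
-- Hypothetical minimal Brainfuck interpreter (no "." or "," support).
-- wrap: emulate 8-bit wrapping cells; max_steps: stop runaway programs.
local function run(src, wrap, max_steps)
  local tape, ptr, pc, steps = {}, 1, 1, 0
  -- Precompute matching bracket positions.
  local match, stack = {}, {}
  for i = 1, #src do
    local c = src:sub(i, i)
    if c == "[" then
      stack[#stack + 1] = i
    elseif c == "]" then
      local j = table.remove(stack)
      match[i], match[j] = j, i
    end
  end
  while pc <= #src do
    steps = steps + 1
    if steps > max_steps then return nil, "step limit reached" end
    local c = src:sub(pc, pc)
    if c == "+" or c == "-" then
      local v = (tape[ptr] or 0) + (c == "+" and 1 or -1)
      if wrap then v = v % 256 end  -- Lua's % is floored, so -1 % 256 == 255
      tape[ptr] = v
    elseif c == ">" then
      ptr = ptr + 1
    elseif c == "<" then
      ptr = ptr - 1
    elseif c == "[" then
      if (tape[ptr] or 0) == 0 then pc = match[pc] end
    elseif c == "]" then
      if (tape[ptr] or 0) ~= 0 then pc = match[pc] end
    end
    pc = pc + 1
  end
  return tape
end

-- Moves the value of cell 1 into cell 2, starting from 0 - 1.
local wrapped = run("-[>+<-]", true, 10000)
print(wrapped[2])
local unwrapped, err = run("-[>+<-]", false, 10000)
print(unwrapped, err)
```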
Just a note: on LuaJIT, the three lines below seem to compile to exactly zero JIT instructions. So if you can ensure that the %256 is inside the if, it has no impact on performance. In addition, if the type of the tape cells is a byte (char or uint8_t), the %256 is already done automatically.
local nj
nj = (type(jit) ~= 'table')
if nj then m[p+1] = m[p+1]%256 end
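Put together, the idea might look like the sketch below. It rests on assumptions: `new_tape` and `add` are hypothetical names, and `uint8_t` is used rather than `char` so that values read back in the range 0..255 (plain `char` may be signed depending on the platform). Under PUC Lua the table branch runs and the modulo is applied explicitly.

```lua
-- Decide once whether we are running under LuaJIT.
local has_jit = type(rawget(_G, "jit")) == "table"

-- Hypothetical helper: byte array under LuaJIT, zero-filled table otherwise.
local function new_tape(size)
  if has_jit then
    local ffi = require("ffi")
    -- ffi.new zero-fills the array; stores to uint8_t cells are narrowed to 8 bits.
    return ffi.new("uint8_t[?]", size)
  end
  local t = {}
  for i = 0, size - 1 do t[i] = 0 end  -- 0-based, to match the FFI array
  return t
end

local tape = new_tape(32768)

-- Hypothetical helper: add n to cell p, wrapping manually only when needed.
local function add(p, n)
  tape[p] = tape[p] + n
  -- Plain Lua numbers do not wrap, so reduce them explicitly.
  if not has_jit then tape[p] = tape[p] % 256 end
end

add(0, 255)
add(0, 2)
print(tape[0])
```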
Yes, this is why I set it to char in the VM settings.
@rdebath Thanks, it is good to know that luajit can optimize away that if statement.
I thought about adding an option to use a luajit array as memory, but have decided against it for now, for a few reasons:
However, with the -maximum 0 option no % max will be generated, and replacing the initial boilerplate of the output is relatively easy. If someone else cares about this, pull requests are always welcome.
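As an illustration of how such an option can work in a code generator, here is a minimal sketch. It is not the actual bf2.lua implementation: `emit_add` is a hypothetical function, and the assumption that `max` is the modulus (with 0 meaning "no wrapping") is mine.

```lua
-- Hypothetical emitter for a run of "+" or "-" commands.
-- delta: net change to the current cell; max: modulus, 0 disables wrapping.
local function emit_add(delta, max)
  local expr = ("m[p] + %d"):format(delta)
  if max > 0 then
    expr = ("(%s) %% %d"):format(expr, max)
  end
  return "m[p] = " .. expr
end

print(emit_add(3, 256))
print(emit_add(-1, 0))
```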
Hello. Interesting project! Did you run benchmarks, for example on Mandelbrot?
Here is my transpiler https://github.com/ExtReMLapin/fast_brainfuck.lua