Closed: joinr closed this issue 3 years ago
For some reason it looks like the expressions aren't compiled to the same bytecode. Decompiled top-level `areduce`:
```java
public final class bench$fn__19404 extends AFunction
{
    public static final Var const__0;

    public static Object invokeStatic() {
        final Object a__6487__auto__19406 = bench$fn__19404.const__0.getRawRoot();
        final Object l__6488__auto__19407 = Reflector.invokeStaticMethod(RT.classForName("clojure.lang.RT"), "alength", new Object[] { a__6487__auto__19406 });
        long idx = 0L;
        long acc = 0L;
        while (Numbers.lt(idx, l__6488__auto__19407)) {
            final long n = RT.intCast(idx) + 1;
            acc = ((long[])bench$fn__19404.const__0.getRawRoot())[RT.intCast(idx)];
            idx = n;
        }
        return Numbers.num(acc);
    }

    @Override
    public Object invoke() {
        return invokeStatic();
    }

    static {
        const__0 = RT.var("clj-fast.bench", "numarr");
    }
}
```
Decompiled `defn`:
```java
public final class bench$traverse_arr extends AFunction
{
    public static Object invokeStatic(final Object arr) {
        final Object a__6487__auto__19396 = arr;
        final int l__6488__auto__19397 = ((long[])a__6487__auto__19396).length;
        long idx = 0L;
        Object acc = null;
        while (idx < l__6488__auto__19397) {
            final long n = RT.intCast(idx) + 1;
            acc = Numbers.num(RT.aget((long[])arr, RT.intCast(idx)));
            idx = n;
        }
        return acc;
    }

    @Override
    public Object invoke(final Object arr) {
        return invokeStatic(arr);
    }
}
```
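For reference, the Clojure forms that presumably produced these two classes can be reconstructed from the decompiled output. This is a hedged sketch: the Var name `numarr` and the function name `traverse-arr` come from the decompilation above, but the exact `areduce` bodies and the array contents are assumptions.

```clojure
;; Sketch only: reconstructed from the decompiled classes above.
;; The array contents and exact areduce bodies are assumptions.
(def numarr (long-array (range 1000)))

;; Top-level form, evaluated directly at the REPL; the compiler
;; hoists `numarr` into a Var constant and calls getRawRoot inside the loop:
(areduce numarr i acc 0 (aget numarr i))

;; The same traversal wrapped in a defn; here the array is a local,
;; so the loop uses a direct length check and a plain aget:
(defn traverse-arr [arr]
  (areduce arr i acc nil (aget arr i)))
```

The shapes line up with the decompiled output: the top-level version keeps `acc` as a primitive `long` initialized to `0L`, while the `defn` version (with a non-primitive init) carries `acc` as a boxed `Object`.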
I think there's a slight risk in testing at the REPL.
I didn't think to decompile. These are definitely different implementations: one goes through the `Numbers` class (`Numbers.lt` vs `<`), and it even has a roundabout lookup for the long array (binding the Var to `const__0` and calling `getRawRoot` on every iteration). Very curious. I wonder why this doesn't happen in the `defn`...
Probably the compiler compiles a `defn` body differently than it does a dynamic top-level form.
Given what I've found here, I'm closing this issue. Feel free to reopen it whenever you think it's appropriate or when there are new findings.
It may not make a difference, but here's something I missed when doing some recent toy profiling at the REPL, comparing traversals of arrays and vectors:

When testing `areduce` over a primitive array against a call to `reduce` over a vector for an iteration comparison, the raw `areduce` expression ended up being confusingly slower than (or close to) the boxed HAMT vector reduction. This seemed very odd, since prior experience indicated primitive array traversal was blazingly fast. I then made sure the macroexpansion from `areduce` happened inside a wrapper function, like `traverse-arr`, and got my expected performance.
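A repeatable way to see the gap is to benchmark both shapes side by side. The sketch below uses criterium (an assumption about tooling; any timing harness works) together with the hypothetical `numarr` and `traverse-arr` definitions reconstructed above.

```clojure
;; Sketch: timing the raw top-level areduce vs the defn wrapper.
;; criterium, the array size, and the areduce bodies are assumptions.
(require '[criterium.core :refer [quick-bench]])

(def numarr (long-array (range 1000000)))

(defn traverse-arr [^longs arr]
  (areduce arr i acc 0 (aget arr i)))

;; Raw top-level expansion, as typed at the REPL:
(quick-bench (areduce numarr i acc 0 (aget numarr i)))

;; Same traversal behind a function call:
(quick-bench (traverse-arr numarr))
```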
It seems the JIT is kicking in on the tiny function, but not on the raw `areduce` call, which is a macroexpansion into a `loop/recur` form. Very interesting. It might be worth a look to make sure the JIT isn't being restricted in the inlined forms as well (I haven't looked hard).
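Rather than guessing which forms get the slow treatment, the compiled output can be inspected directly. The sketch below assumes the clj-java-decompiler library (the kind of tool that produces output like the classes quoted above); the form being decompiled is the hypothetical `areduce` from earlier.

```clojure
;; Sketch: decompiling a form to Java to spot Var getRawRoot lookups
;; and Numbers.* calls. clj-java-decompiler is an assumed dependency
;; (com.clojure-goes-fast/clj-java-decompiler).
(require '[clj-java-decompiler.core :refer [decompile]])

(def numarr (long-array (range 1000)))

;; Prints the decompiled Java for the fn the compiler wraps this form in:
(decompile (areduce numarr i acc 0 (aget numarr i)))
```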