ikvmnet / ikvm

A Java Virtual Machine and Bytecode-to-IL Converter for .NET
Other
1.17k stars 110 forks source link

Return same Lambda where Lambda doesn't have captures #376

Closed TheLastRar closed 1 year ago

TheLastRar commented 1 year ago

Attempts to resolve https://github.com/ikvmnet/ikvm/issues/342 by implementing the singleton pattern to lambda with no captures.

When running the test case in https://github.com/ikvmnet/ikvm/issues/342#issuecomment-1581194385

IKVM with this pr Function Ref Same Instance GiveObjectSupplier1() Same Instance GiveObjectSupplier2() Same Instance ObjectToStringSupplier Different Instance

OpenJDK Function Ref Different Instance GiveObjectSupplier1() Same Instance GiveObjectSupplier2() Same Instance ObjectToStringSupplier Different Instance

I know that this pr returns the same instance for the Function Ref test while OpenJDK returns different instances, it seems that IKVM uses the same class for this, where as OpenJDK generates different classes for each ref. Hopefully that difference doesn't cause issues.

CLAassistant commented 1 year ago

CLA assistant check
All committers have signed the CLA.

wasabii commented 1 year ago

So, having had a bit to take a look at this...

Thanks! Appreciate others getting into it.

I think storing the opcode isn't needed. At a minimum, it is known during the call to Emit whether the MethodBuilder stored on the LambdaMetafactory is a ctor of a type or not. The call to ReflectionUtil can simply be there, inline.

But, after thinking through this, and understanding the point of LambdaMetafactory, I have an alternate, probably more simple design: always call a static method. No need to even have a NewObj path in the main code emit pass. Just have a call to a method, and have the body of that method determine whether it's going to create a new instance, or retrieve an existing one. This should make the code a bit simpler: all CallSites produced by LambdaMetafactory would look the same, the calling convention for them shouldn't change. They would all be a call to __<GetInstance>. The implementation of __<GetInstance> then can be generated differently, deciding whether to initialize a cached instance, or return a new instance, depending on whether there are captured args.

The JIT would ultimately inline this anyways. __<GetInstance> would essentially be nargs + 2 instructions. And we'd be reducing the conceptual complexity of the output code a bit. It would always be calling __<GetInstance>, it's just that the implementation of __<GetInstance> would be different. The Castclass would end up inside __<GetInstance> as well.

Right now the code is a little weird to follow. We're invoking a method called CreateConstructorAndDispatch, which may or may not actually return a constructor. It has a local variable named ctor. Which starts out as a ctor, but then is later replaced by something which isn't a ctor, but which calls the original ctor. And then it returns that, and the code that called it has to redetermine whether it's a ctor or not (didn't it just know?)

For some background, I think, that might be useful to mention here. IKVM.Internal.LambdaMetafactory is ultimately an optimization, which catches specific and special uses of the invokedynamic instruction (those that call java/lang/invoke/LambdaMetafactory.metafactory as a bootstrap method), and instead of following the normal OpenJDK code path (of actually calling metafactory()), instead emit special IL. I have it in my mind some day to actually remove it, and replace it with the normal metafactory() implementation, which generates dynamic byte code. But that day isn't today.

TheLastRar commented 1 year ago

But, after thinking through this, and understanding the point of LambdaMetafactory, I have an alternate, probably more simple design: always call a static method. No need to even have a NewObj path in the main code emit pass. Just have a call to a method, and have the body of that method determine whether it's going to create a new instance, or retrieve an existing one. This should make the code a bit simpler: all CallSites produced by LambdaMetafactory would look the same, the calling convention for them shouldn't change. They would all be a call to __<GetInstance>. The implementation of __<GetInstance> then can be generated differently, deciding whether to initialize a cached instance, or return a new instance, depending on whether there are captured args.

Done

For some background, I think, that might be useful to mention here. IKVM.Internal.LambdaMetafactory is ultimately an optimization, which catches specific and special uses of the invokedynamic instruction (those that call java/lang/invoke/LambdaMetafactory.metafactory as a bootstrap method), and instead of following the normal OpenJDK code path (of actually calling metafactory()), instead emit special IL. I have it in my mind some day to actually remove it, and replace it with the normal metafactory() implementation, which generates dynamic byte code. But that day isn't today.

Will the normal metafactory() implementation behave as OpenJDK does with respect to https://github.com/ikvmnet/ikvm/issues/342?

wasabii commented 1 year ago

Will the normal metafactory() implementation behave as OpenJDK does with respect to https://github.com/ikvmnet/ikvm/issues/342?

It very well might. I don't think I know enough about it right now to answer that.

wasabii commented 1 year ago

https://github.com/openjdk/jdk8u/blob/0bec498fc945a49f1134e02eb477e0af15db3b5b/jdk/src/share/classes/java/lang/invoke/InnerClassLambdaMetafactory.java#L182-L184

I think so.

This is the standard Metafactory instance that generates the CallSite out of the box. It looks like if there are no arguments, it uses ConstantCallSite, set to a specific instance. So, yeah.

I haven't looked too deeply down the path of whether it's a good idea to replace LambdaMetafactory in our compiler. From the POV of static compilation, I think it'll end up converting what we currently emit as plain IL, as a call to an object that will be writing and converting bytecode at runtime. It'll be slower. But how much slower? Is it a problem? Does it preclude any avenues we might be interested in in the future (nativeAOT, etc).

TheLastRar commented 1 year ago

This is the standard Metafactory instance that generates the CallSite out of the box. It looks like if there are no arguments, it uses ConstantCallSite, set to a specific instance. So, yeah.

Good to know

I haven't looked too deeply down the path of whether it's a good idea to replace LambdaMetafactory in our compiler. From the POV of static compilation, I think it'll end up converting what we currently emit as plain IL, as a call to an object that will be writing and converting bytecode at runtime. It'll be slower. But how much slower? Is it a problem?

I guess the only way to find out would be benchmarks or profiling real world applications. Although it sounds like it impact startup/first run performance more than anything else.