ikvmnet / ikvm

A Java Virtual Machine and Bytecode-to-IL Converter for .NET
Other
1.22k stars 111 forks source link

Bunch of encodings are missing #329

Closed Kojoley closed 1 year ago

Kojoley commented 1 year ago
cp jdk8 ikvm
437 ok ok
850 ok missing
852 ok missing
855 ok missing
857 ok missing
860 ok missing
861 ok missing
863 ok missing
865 ok missing
866 ok missing
869 ok missing
936 ok ok
65001 missing missing

This makes applications that use log4j to output to console unusable.

public final class Main
{
    public static void main(String[] args)
    {
        java.nio.charset.Charset.forName(System.getProperty("sun.stdout.encoding"));
    }
}
C:\w>chcp 850
850

C:\w>"C:\Program Files\Java\jdk1.8.0_301\bin\javac.exe" Main.java 

C:\w>"C:\Program Files\Java\jdk1.8.0_301\bin\java.exe" Main

C:\w>"G:\ikvm\bin\java.exe" Main
Exception in thread "main" java.nio.charset.UnsupportedCharsetException: ms850
    at java.nio.charset.Charset.forName(Charset.java:531)
    at Main.main(Main.java:34)
    at java.lang.reflect.Method.invoke(Method.java:486)
    at IKVM.Java.Externs.ikvm.runtime.Launcher.run(Unknown Source)
wasabii commented 1 year ago

Hmm. Thanks for this. Thought I had a test case covering those. Will address.

wasabii commented 1 year ago

So, I added some tests to check all of these. They all worked for me except 65001. Which verison of IKVM are you using here?

Kojoley commented 1 year ago

So, I added some tests to check all of these. They all worked for me except 65001. Which verison of IKVM are you using here?

https://github.com/ikvmnet/ikvm/releases/download/8.5.0/IKVM-8.5.0-image-net461-win7-x64.zip

wasabii commented 1 year ago

Got it.

Kojoley commented 1 year ago

I've build hotfix/8.5.1 branch to test if the fix works, but I can't run programs with it altogether (while develop is fine)

C:\w>G:\ikvm\src\java\bin\Release\net461\java.exe Main
Error: Could not find or load main class Main

It doesn't seem to fix cp65001 either, I think it should return UTF-8 like OpenJDK 17 does, replicating JDK8 deficiency isn't a right choice here.

public final class Main
{
    public static void main(String[] args)
    {
        System.out.println(System.getProperty("sun.stdout.encoding"));
        java.nio.charset.Charset.forName(System.getProperty("sun.stdout.encoding"));
    }
}
C:\w>"C:\Program Files\Java\jdk-17.0.5\bin\java.exe" Main
UTF-8
wasabii commented 1 year ago

Forgot about cp65001. I have to think about this more, as I don't even see that encoding in the source anywhere.

Kojoley commented 1 year ago

Forgot about cp65001. I have to think about this more, as I don't even see that encoding in the source anywhere.

https://github.com/openjdk/jdk11u/blob/31396660db3eae3977bbbf8b978b581dfd941d2d/src/java.base/windows/native/libjava/java_props_md.c#L140-L155

wasabii commented 1 year ago

Hmm. In terms of replicating JDK8.... We're trying to walk the fine line of meeting expectations in massive numbers of weird existing java apps. Including those that hacked around JDK8 deficiencies. At least until we explicitly advertise support for a later JDK.

This way apps that do if jdk8 { //hack } continue working.

wasabii commented 1 year ago

Yeah.... That's JDK11 though.

wasabii commented 1 year ago

Is it a thing that exists in later JDK8 versions?

AliveDevil commented 1 year ago

No hint for 65001.

https://github.com/openjdk/jdk8u/blob/master/jdk/src/windows/native/java/lang/java_props_md.c#L138-L151

Kojoley commented 1 year ago

Is it a thing that exists in later JDK8 versions?

I don't see it, but I think they just petty and don't fix it there to force people to migrate to newer versions. At the point when Windows added UTF-8 beta support JDK8 was already released (I believe). log4j for example crashes with exceptions and the workaround is to do -Dsun.stdout.encoding=UTF-8, I think it is better to fix that in IKVM. As I pointed in https://github.com/ikvmnet/ikvm/commit/6a336bb65233fca8205af2c759d2bc11ed2da802#r114436430 that code in JDK probably buggy anyway and if there is software that workarounds these bugs it would replace cp65001 with UTF-8 so fixing should be safe.

Kojoley commented 1 year ago

if there is software that workarounds these bugs it would replace cp65001 with UTF-8 so fixing should be safe.

And there seems to be not a lot of such (OSS) software https://github.com/search?type=code&q=cp65001+path%3A%2F%5C.java%24%2F&p=2, but a few that exists replace cp65001 with UTF-8: https://github.com/shwenzhang/AndResGuard/blob/e4df245d82f27d9a2d0dd108260a3510cbaba849/AndResGuard-core/src/main/java/apksigner/PasswordRetriever.java#L186-L190 https://github.com/openstreetmap/mkgmap/blob/0b3858095e60a27870ed37afbdfc87dc49d4dc55/src/uk/me/parabola/mkgmap/main/TypCompiler.java#L215-L221

wasabii commented 1 year ago

Yeah. This is going to seem annoying to you, then. But I believe we should seek to exactly mirror how OpenJDK 8 works except where it comes to direct .NET integration. Working around the issue in IKVM should be exactly the same as working around it in any other JDK8 distribution. So, documentation that says do X in Java, applies to us as well.

Here's a more pragmatic reason:

We are getting close to being able to replace much of our forked C# code with the original OpenJDK C/C++ code. If we deviate from OpenJDK here, that would block us from porting the same deviation to the new native version without forking the C code in the future. And that's exactly what we want to avoid.

I don't know the background on this particular issue, so I can't really speak to it. But in general, I would say, if there is a bug in OpenJDK: then we should be fixing it in OpenJDK. OpenJDK is also an open source project. Though their rules for adopting changes are burdensome in comparison to us, it is still an open source project.

If this is a bug that the OpenJDK project doesn't want to fix, I'd imagine we'd adopt their same reasoning. If it is one they would fix, then fixing it there would cause us to eventually benefit from that.

wasabii commented 1 year ago

But also, we would of course love help in starting to work on some of the JDK9+ features. :)

Kojoley commented 1 year ago

Yeah. This is going to seem annoying to you, then. But I believe we should seek to exactly mirror how OpenJDK 8 works except where it comes to direct .NET integration. Working around the issue in IKVM should be exactly the same as working around it in any other JDK8 distribution. So, documentation that says do X in Java, applies to us as well.

But there is no other IKVM version except for Java 8.

But also, we would of course love help in starting to work on some of the JDK9+ features. :)

There is no point on working on them for me if they cannot be integrated into IKVM because it would break that exactly mirror how OpenJDK 8 works except where it comes to direct .NET integration.

wasabii commented 1 year ago

We would eventually switch that statement to mirroring JDK9 for IKVM9.

Kojoley commented 1 year ago

We would eventually switch that statement to mirroring JDK9 for IKVM9.

Until then what happens with PRs that implement JDK9 features/fixes?

wasabii commented 1 year ago

Depends. Huge amounts of current work consist of cleaning up IKVM itself such that we can do such a thing.

There are a number of things I'm working on in the background that implement JDK9 features, but are not yet enabled or useful in the code base for JDK8. For instance, reading class file format > 52. The effort to split the class file parsing out into IKVM.ByteCode is to facilitate that path (and also to generally clean up the code base). Basically, an externalized byte code parsing library that decouples that from the main body of IKVM helps us implement those later features there.

The same will apply to the stub generation. That needs to be externalized, so we can support outputting format > 52.

One of the biggest tasks in JDK9 support is going to be runtime support for Jigsaw (modules). Implementing the security isolation between the various module contexts. It is quite possible for this to exist in the existing IKVM code base, without stepping on JDK8 support. Perhaps as a single module which isn't visible to JDK8 user space.

There will be lots of tasks like that: things that can be viewed as general improvements to the IKVM runtime.

There will be lots of other tasks we can accept, that are outside the runtime: for instance, advancing the MSBuild system to facilitate a massively reorganized JDK. It's likely that IKVM.Net.SDK is going to need some radical changes to do this properly.

Or working on the nativelib support. Every step we make towards this goal translates to JDK9 support, because the ability to incorporate more OpenJDK code without rewriting it ourselves obviously makes adopting the JDK9 code easier.

I hope over time to get to a stage where the IKVM.Runtime code is largely independent of whether we're running a JKD8 or JDK9 BCL.

Once enough of this stuff is present, I expect to be able to swap out the openjdk version from 8 to 9, and switch over.

And then repeat for 9+

Kojoley commented 1 year ago

I appreciate your time writing the roadmap but it only confirms that we have orthogonal goals.

I don't use Java or .NET much, I found IKVM because I wanted to see if .Net/Rosylin VM is any better than HotSpot/Graal/Zing, and I wanted to benchmark it a few years ago but neither then nor now I could been able to run it without hitting bugs. I went through dozen of IKVM versions and hit (different) bugs on every one of them. Some bugs I have investigated, isolated and reported here, other are still preventing me even from evaluating the VM viability. I think every other person would just stop at this point already.

One of the biggest tasks in JDK9 support is going to be runtime support for Jigsaw (modules). Implementing the security isolation between the various module contexts. It is quite possible for this to exist in the existing IKVM code base, without stepping on JDK8 support. Perhaps as a single module which isn't visible to JDK8 user space.

I don't know if modules are useful, I heard not that much, and I have a grim experience with them, libraries not working because of the security features. Probably should not be on a priority list, a lot of stuff I believe will run without it just fine.

wasabii commented 1 year ago

Alright. Well, I think I've got to consider this one closed now, to be released in 8.5.1. Continuing to work on the other reports you made though.