Closed mlaggner closed 1 year ago
I made some more changes and the JVM loads now - but I cannot start the main method (could not start main method - Java exception occured. check stderr/logcat). My current approach is: https://github.com/mlaggner/jnigi/blob/master/windows.go
Looks like the call to main needs to be adapted to the Windows C.wstring_t?
Could you use https://pkg.go.dev/golang.org/x/sys@v0.11.0/windows#LoadLibrary instead of existing call? This does seem to use LoadLibraryW, so another approach would be to call that directly.
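As a rough illustration of what the wide (`W`) variant expects: Go strings are UTF-8, while LoadLibraryW takes a NUL-terminated UTF-16 string. The sketch below uses only the standard library to show the conversion; `toUTF16Z` is a hypothetical helper name and the DLL path is made up. On Windows, windows.LoadLibrary does this conversion for you.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// toUTF16Z converts a Go (UTF-8) string into a NUL-terminated UTF-16 slice,
// the layout that wide-character Windows APIs such as LoadLibraryW expect.
// Hypothetical helper for illustration; it does not reject embedded NULs.
func toUTF16Z(s string) []uint16 {
	return append(utf16.Encode([]rune(s)), 0)
}

func main() {
	path := `C:\Séries\jre\bin\server\jvm.dll` // made-up example path

	// "é" takes two bytes in UTF-8 but a single 16-bit unit in UTF-16.
	fmt.Printf("UTF-8 bytes:  % x\n", []byte(path))
	fmt.Printf("UTF-16 units: %04x\n", toUTF16Z(path))
}
```

Passing the UTF-8 bytes to the narrow (`A`) variant instead makes Windows interpret them in the active code page, which is presumably why a path containing "é" fails to load.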
looks like the call to main needs to be adapted to the Windows C.wstring_t?
not sure what you mean by that.
Thanks for the hint with windows.LoadLibrary, but this leads to problems in the following calls (the result of windows.LoadLibrary is a different type than in your code). Unfortunately I have no clue what I am doing here (I am a Java dev, not a C/C++ dev) and I do not have a Windows development environment either...
Do you have any more suggestions on how to change that?
Ok i've got a solution here:
https://github.com/timob/jnigi/tree/windows_unicode_dll_path_fix
Hope that works for you, working for me.
Thanks - now I get as far as loading the JVM, but the JVM is throwing some exception (which is not logged anywhere):
"could not start main method - Java exception occured."
I need to review the code to get a clue what is failing.
BTW: without the unicode character the JVM is starting fine with your changes! 👍
could this be a problem? https://github.com/timob/jnigi/blob/master/cinit.go#L34
the JVM args contain the classpath (-Djava.class.path) which also contains unicode characters.
I could also imagine that JVM params (and/or app arguments) contain unicode characters
The JNI functions use UTF-8 strings same as Go so the code you linked should not be a problem.
From googling, there do seem to be problems on Windows with unicode characters in class paths with OpenJDK in general.
I just tried to call javaw with the same path and parameters and this works...
jre\bin\javaw.exe -classpath C:\Séries\main.jar;C:\Séries\lib\*;C:\Séries\addons\* -Xms64m -Xmx512m -Xss512k -XX:+IgnoreUnrecognizedVMOptions -XX:+UseG1GC -XX:+UseStringDeduplication -Dsun.java2d.renderer=sun.java2d.marlin.MarlinRenderingEngine -splash:splashscreen.png -Djava.net.preferIPv4Stack=true -Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8 -Djna.nosys=true com.Main
Looking at https://stackoverflow.com/questions/20052455/jni-start-jvm-with-unicode-support, it seems the command-line utilities like javaw are doing the encoding somehow, but this is not done during JNI invocation.
The suggestion from that stackoverflow is to call System.setProperty with the class path, after you create the VM.
The suggestion from that stackoverflow is to call System.setProperty with the class path, after you create the VM.
That's wrong. I mean, you can set the property, but I don't think it affects where classes are found once the JVM has started. There are ways of setting the class path dynamically.
I had a look at the implementation of java.exe inside the OpenJDK: https://github.com/openjdk/jdk17/blob/master/src/java.base/share/native/libjli/java.c#L1518
looks like they're passing the JVM args directly to CreateJavaVM. I could not find out which data types are used there (yet)
now I just found that: https://github.com/openjdk/jdk17/blob/master/src/java.base/windows/native/libjli/cmdtoargs.c#L86
this looks like the java CLI executable is converting the JVM args to another format - am I right?
Good news and bad news. Good news is that I've got your example working. Bad news is that it looks like the JVM on Windows expects arguments to be encoded in the system code page, not UTF-8 (thanks for pointing to the code above), so you are limited to that character set. For Latin languages that is usually Windows-1252.
So if you prepare the arguments like this:
import "golang.org/x/text/encoding/charmap"
...
// Encode the argument in the Windows-1252 code page before handing it to the JVM.
winEnc := charmap.Windows1252.NewEncoder()
winStr, err := winEnc.String(arg)
if err != nil {
	panic("charmap.Encoder error: " + err.Error())
}
...
jnigi.CreateJVM(jnigi.NewJVMInitArgs(false, true, jnigi.DEFAULT_VERSION, []string{winStr}))
your example with "Séries" in it will work.
I think it's probably beyond the scope of JNIGI to detect the current code page Windows is using and then do the encoding.
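To make the byte-level difference concrete, here is a minimal standard-library sketch. It is not a replacement for charmap: the hypothetical helper `encodeWindows1252Subset` only handles ASCII and U+00A0..U+00FF, the range where Windows-1252 coincides with ISO-8859-1, and reports everything else as an error. It also shows that characters outside the code page cannot be passed this way at all.

```go
package main

import "fmt"

// encodeWindows1252Subset converts s to single-byte Windows-1252.
// Sketch only: it handles ASCII and U+00A0..U+00FF, where Windows-1252
// matches ISO-8859-1. Other runes (including ones real Windows-1252 can
// encode, such as the euro sign) are reported as errors here.
func encodeWindows1252Subset(s string) ([]byte, error) {
	out := make([]byte, 0, len(s))
	for _, r := range s {
		switch {
		case r < 0x80, r >= 0xA0 && r <= 0xFF:
			out = append(out, byte(r))
		default:
			return nil, fmt.Errorf("rune %q is not handled by this sketch", r)
		}
	}
	return out, nil
}

func main() {
	arg := `-Djava.class.path=C:\Séries\main.jar`

	enc, err := encodeWindows1252Subset(arg)
	if err != nil {
		fmt.Println("cannot encode:", err)
		return
	}

	// "é" is two bytes in UTF-8 but one byte in Windows-1252.
	fmt.Printf("UTF-8:        % x\n", []byte(arg))
	fmt.Printf("Windows-1252: % x\n", enc)
}
```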
many thanks for your hints so far. I will do some tests over the weekend
Hi, second dev here. Thanks for the pointer on codepage!
I've looked on my Windows instance with CHCP, and it returns 65001, which is UTF-8. I've set this (still beta?) feature as described here: https://superuser.com/a/1435645 I did that mainly to get all the correct chars in the Windows console. Although it is said to be sometimes problematic, I've found no issues so far (I even forgot having set that).
That being said, the current app starts w/o any problems here, with some unicode chars in the path!
If we now hard-code the encoding to one of the many code pages (https://learn.microsoft.com/en-us/windows/win32/intl/code-page-identifiers), will it produce more problems for others than it solves? E.g. will that break my UTF-8 installation now? (Will test if mlaggner has a build ready)
@timob thanks for your hints! Combining your approach with the results from @myron0815 led me to the following doc: https://github.com/MicrosoftDocs/windows-dev-docs/blob/docs/hub/apps/design/globalizing/use-utf8-code-page.md
According to the document from Microsoft we're able to set the manifest entry:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1">
<assemblyIdentity type="win32" name="..." version="6.0.0.0"/>
<application>
<windowsSettings>
<activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
</windowsSettings>
</application>
</assembly>
which will force UTF-8 mode for the whole application using Windows Version 1903 (May 2019 Update) or newer. We've tested this on a Windows machine using Windows-1252 and on a machine with UTF-8 and both worked as expected.
So from our point of view, we do not need to encode the classpath (and arguments). Instead we set the manifest values and require our users to either run Windows version 1903 or newer, or avoid unicode characters in their classpath.
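A launcher could warn users on older Windows versions when their classpath falls under this restriction. The sketch below is only an assumption about how such a check might look; `firstNonASCII` is a hypothetical helper and the classpath is the example from this thread.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstNonASCII returns the byte offset and value of the first non-ASCII
// rune in s. ok is false when s is pure ASCII. Invalid UTF-8 bytes are
// decoded as U+FFFD by range and are therefore flagged as well.
func firstNonASCII(s string) (offset int, r rune, ok bool) {
	for i, c := range s {
		if c >= utf8.RuneSelf {
			return i, c, true
		}
	}
	return 0, 0, false
}

func main() {
	classpath := `C:\Séries\main.jar;C:\Séries\lib\*`

	if i, r, ok := firstNonASCII(classpath); ok {
		fmt.Printf("warning: classpath contains %q at byte %d; "+
			"this needs Windows 1903 or newer with the UTF-8 manifest\n", r, i)
	}
}
```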
Your help is really appreciated (you are the hero of the day for us :D)
Hey nice! That's interesting, how UTF-8 works on Windows. I've recently gone back to developing on Windows, so I'm learning along the way.
I saw that loading the JVM from a path containing a unicode character (e.g. Séries) fails. If I change the path name to use only ASCII characters, loading succeeds.
As far as I found out, the Windows API LoadLibrary has two different flavours which are selected at compile time: LoadLibraryA (ANSI variant - default) and LoadLibraryW (unicode variant).
After researching for a few more hours I saw that I need to set a compiler variable to force the usage of the unicode variant, which I did via (windows.go):
which leads to some compiler errors. After fiddling a bit more with the parameters for LoadLibrary (which does not accept a C.char in the unicode variant) I got it compiling again, but the result did not work either.
Did I miss anything else or am I simply wrong?