Dushistov / flapigen-rs

Tool for connecting programs or libraries written in Rust with other languages
BSD 3-Clause "New" or "Revised" License
775 stars 59 forks source link

java/jni: Marshal strings as utf16 #443

Closed ijc closed 1 year ago

ijc commented 1 year ago

The StringUTF functions in the JNI API accept or produce a modified UTF-8 encoding which differs from proper UTF-8.

When converting jstring to String I believe this in practice moves an allocation from GetStringUTFChars to String::from_utf16 (assuming GetStringUTFChars always allocates since it must convert UTF-16 to modified-UTF-8 and GetStringChars rarely or never needs to allocate since it can expose an existing internal UTF-16 buffer).

When converting jstring to &str we must unfortunately add an allocation of a String.

When converting &str or String to jstring we must unfortunately add a second allocation of a Vec<c_ushort> of UTF-16 chars in addition to the one which (I assume) both NewStringUTF and NewString will be making.

I am not super familiar with JNI and in particular memory allocation/management.

Fixes #441.

ijc commented 1 year ago

@Dushistov I see the CI failures but they look like infra issues not problems with the changes here. Is there anything I need to do? I'll keep an eye on master and rebase if I see a CI related change.

FWIW I've been running JAVA_HOME=/usr/lib/jvm/default-java python3 ci_build_and_test.py --java-only-tests on my Linux dev machine and it is passing.

Dushistov commented 1 year ago

I see the CI failures they look like infra issues

@ijc , yes some network problems with network access to build cache.

ijc commented 1 year ago

@ijc , yes some network problems with network access to build cache.

:+1: . Is there anything I can do to help?

On an unrelated note, I can also now confirm that this fixes the crash our QA team were seeing.

ijc commented 1 year ago

@Dushistov it looks from https://github.com/Dushistov/flapigen-rs/actions/runs/4034498006 like CI might be fixed?

ijc commented 1 year ago

@Dushistov I've rebased and applied some updates to the test expectations which I had somehow missed before.

ijc commented 1 year ago

I spotted in the CI logs:

warning: function `from_jstring_std_string` is never used
 --> D:\a\flapigen-rs\flapigen-rs\target\x86_64-pc-windows-msvc\debug\build\flapigen-6718934b1708eadc\out/jni-include.rs:1:17534
  |
1 | ...(self . env , self . string , self . chars) ; } } } fn from_jstring_std_string (js : jstring , env : * mut JNIEnv) -> String { if ! js...
  |                                                           ^^^^^^^^^^^^^^^^^^^^^^^
  |

I pushed the addition of the missing #[allow(dead_code)].

ijc commented 1 year ago

@Dushistov this is now green in CI. Do you need anything else from me before you'd be happy to merge?

ijc commented 1 year ago

@Dushistov sorry to bug you, can we merge this one?

Dushistov commented 1 year ago

@ijc

Sorry for the delay, been busy at work. In the next couple of days I will review the code.

ijc commented 1 year ago

I suppose in this case our utf-8 always correct,

Rust always provides valid UTF-8 indeed, but the argument to Java's NewStringUTF is "Modified UTF-8" and not proper UTF-8, I tried it and it barfs on proper UTF-8 for the cases where "modified" applies.

Anyhow, thanks for merging! Do you plan to cut a new release in the near future?

Dushistov commented 1 year ago

@ijc

Anyhow, thanks for merging!

Thanks for PR!

Do you plan to cut a new release in the near future?

Yes, there is plan for release in near future. I am going to update all dependencies for new major release if it exists, and then make new release.