bytedeco / javacpp

The missing bridge between Java and native C++
Other
4.43k stars 576 forks source link

Fix use of deprecated std::char_traits<unsigned short>::length #753

Open codeinred opened 3 months ago

codeinred commented 3 months ago

Some versions of clang - particularly the one on MacOS - emitted the following warning when compiling javacpp:

`jniNativeLibrary.cpp:314:57: warning: 'char_traits' is deprecated: char_traits for T not equal to char, wchar_t, char8_t, char16_t or char32_t is non-standard and is provided for a temporary period. It will be removed in LLVM 18, so please migrate off of it. [-Wdeprecated-declarations] return JavaCPP_createStringFromUTF16(env, ptr, std::char_traits::length(ptr));

This commit fixes that warning by adding a function JavaCPP_stringLength for strings of unsigned short.

Related Pull Requests & Issues

748 was added with the same goal of fixing issue, however the concern raised there was that char16_t is not available pre-c++11. This PR aims to fix that issue by instead providing a clean implementation in standard C++.

If merged, this PR should resolve Issue #710.

Performance Implications

There should be no performance difference for this change versus using std::chair_traits<unsigned short>::length, because it appears that compilers do not handle std::char_traits<unsigned short>::length as a special case in the way they do strlen.

See this godbolt link,l:'5',n:'0',o:'C%2B%2B+source+%231',t:'0')),k:50,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((h:compiler,i:(compiler:g132,filters:(b:'0',binary:'1',binaryObject:'1',commentOnly:'0',debugCalls:'1',demangle:'0',directives:'0',execute:'1',intel:'0',libraryCode:'0',trim:'1',verboseDemangling:'0'),flagsViewOpen:'1',fontScale:14,fontUsePx:'0',j:2,lang:c%2B%2B,libs:!(),options:'-O3+-Wall+-std%3Dc%2B%2B17',overrides:!(),selection:(endColumn:12,endLineNumber:21,positionColumn:1,positionLineNumber:1,selectionStartColumn:12,selectionStartLineNumber:21,startColumn:1,startLineNumber:1),source:1),l:'5',n:'0',o:'+x86-64+gcc+13.2+(Editor+%231)',t:'0'),(h:output,i:(compilerName:'x64+msvc+v19.latest',editorid:1,fontScale:14,fontUsePx:'0',j:2,wrap:'0'),l:'5',n:'0',o:'Output+of+x86-64+gcc+13.2+(Compiler+%232)',t:'0')),header:(),k:50,l:'4',n:'0',o:'',s:0,t:'0')),l:'2',n:'0',o:'',t:'0')),version:4) for reference.

Assembly output is identical between the standard library implementation, and the one provided in this commit.

Implementation:

#include <memory>
#include <string>

/// Old implementation
size_t char_traits_length(unsigned short const* s)
{
    return std::char_traits<unsigned short>::length(s);
}

/// New implementation
size_t JavaCPP_stringLength(unsigned short const* s)
{
    for(size_t i = 0; ; ++i) {
        if (s[i] == 0) return i;
    }
}

Assembly:

char_traits_length(unsigned short const*):
        xor     eax, eax
        cmp     WORD PTR [rdi], 0
        je      .L4
.L3:
        add     rax, 1
        cmp     WORD PTR [rdi+rax*2], 0
        jne     .L3
        ret
.L4:
        ret
JavaCPP_stringLength(unsigned short const*):
        xor     eax, eax
        cmp     WORD PTR [rdi], 0
        je      .L7
.L8:
        add     rax, 1
        cmp     WORD PTR [rdi+rax*2], 0
        jne     .L8
.L7:
        ret
saudet commented 3 months ago

Maybe we could just put an ifdef else endif in there that uses char16_t for C++11 compilers and keep using unsigned short if not?

HGuillemet commented 3 months ago

Either using a custom null-seeking function or an ifdef calling either std::chair_traits<unsigned short>::length or `std::chair_traits::length seem ok for me, since @codeinred established that there is no performance loss and the generated assembly is the same.

BTW, I see usage of wcslen in StringAdapter when passed a short * or int *. That looks wrong.

saudet commented 3 months ago

BTW, I see usage of wcslen in StringAdapter when passed a short or int . That looks wrong.

That's just for platform dependent support of wchar, not UTF-16 or UTF-32