rmraya / OpenXLIFF

An open source set of Java filters for creating, merging and validating XLIFF 1.2, 2.0 and 2.1 files.
https://www.maxprograms.com/products/openxliff.html
Eclipse Public License 1.0

compilation: unmappable character for encoding US-ASCII #14

Status: Closed (ghost closed 2 years ago)

ghost commented 2 years ago

Hello,

I had an issue compiling OpenXLIFF from master with ant, but I managed to fix it by setting the javac compiler option below in build.xml. Would you like me to submit a pull request?


Problem

# ant
  ...
compile:
    [javac] Compiling 112 source files to /root/OpenXLIFF-master/bin
    [javac] /root/OpenXLIFF-master/src/com/maxprograms/converters/xml/Xml2Xliff.java:842: error: unmappable character (0xC2) for encoding US-ASCII
    [javac]             if (" \u00A0\r\n\f\t\u2028\u2029,.;\":<>?????!()[]{}=+/*\u00AB\u00BB\u201C\u201D\u201E\uFF00"
    [javac]                                                     ^
    [javac] /root/OpenXLIFF-master/src/com/maxprograms/converters/xml/Xml2Xliff.java:842: error: unmappable character (0xBF) for encoding US-ASCII
    [javac]             if (" \u00A0\r\n\f\t\u2028\u2029,.;\":<>?????!()[]{}=+/*\u00AB\u00BB\u201C\u201D\u201E\uFF00"
    [javac]                                                      ^
    [javac] /root/OpenXLIFF-master/src/com/maxprograms/converters/xml/Xml2Xliff.java:842: error: unmappable character (0xC2) for encoding US-ASCII
    [javac]             if (" \u00A0\r\n\f\t\u2028\u2029,.;\":<>?????!()[]{}=+/*\u00AB\u00BB\u201C\u201D\u201E\uFF00"
    [javac]                                                        ^
    [javac] /root/OpenXLIFF-master/src/com/maxprograms/converters/xml/Xml2Xliff.java:842: error: unmappable character (0xA1) for encoding US-ASCII
    [javac]             if (" \u00A0\r\n\f\t\u2028\u2029,.;\":<>?????!()[]{}=+/*\u00AB\u00BB\u201C\u201D\u201E\uFF00"
    [javac]                                                         ^
    [javac] 4 errors
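
The four bytes reported above (0xC2, 0xBF, 0xC2, 0xA1) are consistent with the UTF-8 encodings of U+00BF and U+00A1 written literally in the source file. As a rough illustration of why they fail under US-ASCII, here is a minimal sketch that decodes those bytes strictly with each charset. The class name DecodeDemo and the helper strictDecode are hypothetical, and this is not how javac itself is implemented.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Hypothetical helper: decodes strictly and returns null when the bytes
    // are not valid in the given charset (instead of substituting '?').
    public static String strictDecode(byte[] bytes, Charset charset) {
        try {
            return charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // The bytes javac complained about, in the order reported.
        byte[] bytes = {(byte) 0xC2, (byte) 0xBF, (byte) 0xC2, (byte) 0xA1};
        String utf8 = strictDecode(bytes, StandardCharsets.UTF_8);
        String ascii = strictDecode(bytes, StandardCharsets.US_ASCII);
        System.out.println("UTF-8 decodable:    " + (utf8 != null));
        System.out.println("US-ASCII decodable: " + (ascii != null));
    }
}
```

US-ASCII only covers bytes 0x00 to 0x7F, so any byte with the high bit set is rejected when the decoder is asked to report errors rather than replace them.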

Fix

In build.xml:

<javac srcdir="src" destdir="bin" classpathref="OpenXLIFF.classpath" modulepathref="OpenXLIFF.classpath" includeAntRuntime="false">
        <compilerarg line="-encoding utf-8" />
</javac>
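
For anyone who wants to check the effect of the flag outside Ant, the same -encoding option can be passed to the compiler through the standard javax.tools API. This is only a sketch under assumptions: the class EncodingFlagDemo, the helper compileWithEncoding, and the generated Sample.java are hypothetical and are not part of OpenXLIFF. It also requires a JDK rather than a plain JRE.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class EncodingFlagDemo {
    // Hypothetical helper: writes a tiny UTF-8 source file containing
    // non-ASCII characters and compiles it with an explicit -encoding value.
    // Returns the compiler's exit status (0 means success).
    public static int compileWithEncoding(String encoding) {
        try {
            Path dir = Files.createTempDirectory("encoding-demo");
            Path source = dir.resolve("Sample.java");
            String code = "public class Sample { String s = \"\u00BF\u00A1\"; }\n";
            Files.write(source, code.getBytes(StandardCharsets.UTF_8));

            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            if (compiler == null) {
                throw new IllegalStateException("No system Java compiler; a JDK is required");
            }
            // Null streams mean System.in, System.out and System.err are used.
            return compiler.run(null, null, null,
                    "-encoding", encoding,
                    "-d", dir.toString(),
                    source.toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("UTF-8 exit status:    " + compileWithEncoding("UTF-8"));
        System.out.println("US-ASCII exit status: " + compileWithEncoding("US-ASCII"));
    }
}
```

Passing the encoding explicitly makes the result independent of the platform default, which is the point of the build.xml change.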

Environment

# uname -r
5.10.61
# cat /etc/debian_version
11.0
# java --version
openjdk 11.0.12 2021-07-20
OpenJDK Runtime Environment (build 11.0.12+7-post-Debian-2)
OpenJDK 64-Bit Server VM (build 11.0.12+7-post-Debian-2, mixed mode, sharing)
# ant -version
Apache Ant(TM) version 1.10.11 compiled on July 10 2021
# locale
LANG=
LANGUAGE=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

rmraya commented 2 years ago

Your platform encoding is set to US-ASCII. The default encoding on Linux is expected to be UTF-8.

I see you are using Debian's built-in tools. Can you try Java 11 from https://adoptopenjdk.net/ with Ant from https://ant.apache.org/?

ghost commented 2 years ago

My initial install of Ant was from the Apache website. I followed the Debian install instructions on the adoptopenjdk website, and confirmed that the same error was present.

I used the setup from this Stack Overflow answer to check the file encoding in Java:

# javac PrintCharSets.java && java PrintCharSets
file.encoding=ANSI_X3.4-1968
Charset.defaultCharset=US-ASCII
InputStreamReader.getEncoding=ASCII
# javac PrintCharSets.java && LC_ALL=en_US.UTF-8 java PrintCharSets
file.encoding=UTF-8
Charset.defaultCharset=UTF-8
InputStreamReader.getEncoding=UTF8
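
The PrintCharSets source is not included in this thread. A minimal sketch that would print the same three values might look like the following. The class layout and the report() helper are assumptions, not the code from the referenced answer.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class PrintCharSets {
    // Hypothetical helper: collects the three encoding-related values
    // shown in the output above, one per line.
    public static String report() {
        // An InputStreamReader created without a charset uses the platform default.
        InputStreamReader reader =
                new InputStreamReader(new ByteArrayInputStream(new byte[0]));
        return "file.encoding=" + System.getProperty("file.encoding") + "\n"
                + "Charset.defaultCharset=" + Charset.defaultCharset() + "\n"
                + "InputStreamReader.getEncoding=" + reader.getEncoding();
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Running it with and without LC_ALL set, as in the commands above, shows how the reported values follow the locale on this JDK.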

Java's default encoding appears to depend on the system locale on Linux, and I don't think the system locale can be assumed to support Unicode. In my case, for example, I was building in a Docker container, which is why no locale was configured. The compiler option in my previous comment overrides that default, however.

Would it be possible to specify UTF-8 encoding explicitly, either via the compiler options (as in my previous comment) or in the README?

rmraya commented 2 years ago

Updated build.xml with your proposed change.