Open sotnikov-s opened 4 years ago
I think it just means that the file encoding is wrong. On macOS:
file --mime-encoding BraceCode.java
BraceCode.java: iso-8859-1
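To see which other sources are affected, a stricter check than the guess made by file is to test whether each file decodes cleanly as UTF-8. The following is only a sketch: the helper names is_valid_utf8 and list_non_utf8_java are made up for illustration, and it assumes iconv and find are available.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Succeeds only if the file decodes cleanly as UTF-8.
# (is_valid_utf8 is a hypothetical helper name.)
is_valid_utf8() {
  iconv -f utf-8 -t utf-8 < "$1" > /dev/null 2>&1
}

# Print every .java file under the given directory that is NOT valid UTF-8.
# (list_non_utf8_java is a hypothetical helper name.)
list_non_utf8_java() {
  local dir="$1" f
  # -print0 / read -d '' keeps paths with spaces or newlines intact.
  while IFS= read -r -d '' f; do
    is_valid_utf8 "$f" || printf '%s\n' "$f"
  done < <(find "$dir" -type f -name '*.java' -print0)
}
```

Calling list_non_utf8_java PatentDocument/src/main/java would then list the candidates for conversion. Pure ASCII files pass the check, which is fine because ASCII is already valid UTF-8.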
Fix with:
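iconv -f iso-8859-1 -t utf-8 < BraceCode.java > BraceCode.java.tmp && mv -f BraceCode.java.tmp BraceCode.java
Note that the shell truncates the output file before iconv even starts, so redirecting the output to the same path as the input results in an empty file; that is why the conversion goes through a temporary file. I'm currently using this script to fix the source code: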
#!/usr/bin/env bash
set -eo pipefail
shopt -s failglob

# Source files that are stored as ISO-8859-1 instead of UTF-8
FILES=(
  PatentDocument/src/main/java/gov/uspto/patent/doc/xml/BraceCode.java
  PatentDocument/src/main/java/gov/uspto/patent/model/NplCitation.java
  PatentDocument/src/main/java/gov/uspto/patent/model/DocumentId.java
  PatentDocument/src/main/java/gov/uspto/patent/doc/greenbook/DotCodes.java
  PatentDocument/src/main/java/gov/uspto/patent/doc/pap/FormattedText.java
  PatentDocument/src/main/java/gov/uspto/patent/doc/sgml/FormattedText.java
)

TMPFILE=$(mktemp)
for FILE in "${FILES[@]}"
do
  # Convert into a temp file first; redirecting onto the input would truncate it
  iconv -f iso-8859-1 -t utf-8 < "${FILE}" > "${TMPFILE}" &&
    mv -f "${TMPFILE}" "${FILE}"
done

# Fix incorrect type signature for constructor
# (sed -i '' is the BSD/macOS form; GNU sed takes -i without the empty argument)
sed -i '' 's/, "<XX>"//g' PatentDocument/src/main/java/gov/uspto/tm/doc/brs/TmBrs.java
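Two adjustments might make a script like this safer to re-run and usable outside macOS. This is a sketch rather than a tested replacement: the helper names to_utf8_once and sed_in_place are made up for illustration, and it assumes that any file which is not valid UTF-8 is ISO-8859-1.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Convert a file to UTF-8 only if it is not already valid UTF-8, so that
# running the conversion twice does not double-encode the text.
# (to_utf8_once is a hypothetical helper name.)
to_utf8_once() {
  local file="$1" tmp
  if iconv -f utf-8 -t utf-8 < "$file" > /dev/null 2>&1; then
    return 0  # already decodes as UTF-8; leave it alone
  fi
  tmp=$(mktemp)
  # Assumption: anything that is not UTF-8 is ISO-8859-1.
  iconv -f iso-8859-1 -t utf-8 < "$file" > "$tmp"
  mv -f "$tmp" "$file"
}

# In-place sed that works with both BSD (macOS) and GNU sed: a backup
# suffix attached directly to -i is accepted by both, and the backup
# file is removed afterwards.
# (sed_in_place is a hypothetical helper name.)
sed_in_place() {
  local expr="$1" file="$2"
  sed -i.bak "$expr" "$file"
  rm -f "$file.bak"
}
```

With these, each listed file would be passed to to_utf8_once, and the constructor fix would become sed_in_place 's/, "<XX>"//g' PatentDocument/src/main/java/gov/uspto/tm/doc/brs/TmBrs.java.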
That helped: the encoding error is resolved, but another one occurred:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.1:compile (default-compile) on project PatentDocument: Compilation failure: Compilation failure:
[ERROR] PatentPublicData/PatentDocument/src/main/java/gov/uspto/tm/doc/brs/TmBrs.java:[32,29] cannot find symbol
[ERROR] symbol: class DateUtil
[ERROR] location: package gov.uspto.common.text
[ERROR] PatentPublicData/PatentDocument/src/main/java/gov/uspto/tm/doc/brs/TmBrs.java:[228,32] cannot find symbol
[ERROR] symbol: variable DateUtil
[ERROR] location: class gov.uspto.tm.doc.brs.TmBrs
It seems to import a nonexistent class, because neither DateUtil nor the toDateTimeISO method called on it is mentioned anywhere in the project.
Sorry about the missing DateUtil Java class; it's now checked in. My IDE doesn't seem to be bothered by any character encoding issues, but I will continue to look into it.
Thanks, with the above-mentioned script and the newest version the build succeeds.
Thanks for the quick fix of https://github.com/USPTO/PatentPublicData/issues/99. I pulled the new version from master and tried to rebuild the project by running
mvn clean package -DskipTests=true
but got a number of errors. The full text of the build is here: output.pdf. Are you encountering the same problem?