The-Acronym-Coders / BASE

The Central Repo For The B.A.S.E Project

B.A.S.E.'s BaseFileUtils writes files using the system default charset but reads them as UTF-8 #108

Closed 3TUSK closed 5 years ago

3TUSK commented 5 years ago

Synopsis

BaseFileUtils can end up using different charsets for reading and writing files: it reads as UTF-8 but writes with the system default charset (specifically the FileUtils.writeStringToFile(file, string, Charset.defaultCharset()); call). This may corrupt files that contain non-ASCII text, for example when dealing with multiple languages. https://github.com/The-Acronym-Coders/BASE/blob/2ef4890b60c5658a2c16ae36bb09bc7b2ebb0321/src/main/java/com/teamacronymcoders/base/util/files/BaseFileUtils.java#L133-L152

Reproduction

@wormzjl has the following setup:

  1. He is using Windows, most likely with a European locale. The exact locale is still unknown.
  2. He was editing zh_cn.lang for his items created via ContentTweaker, using the folder-based resource pack at ${game_dir}/resources that B.A.S.E. provides.
  3. He saves the file using UTF-8 encoding.
  4. After the game launches, the file turns into gibberish. The suspected culprit is the fixLang functionality: https://github.com/The-Acronym-Coders/BASE/blob/2ef4890b60c5658a2c16ae36bb09bc7b2ebb0321/src/main/java/com/teamacronymcoders/base/util/files/ResourceLoader.java#L75-L79 which in turn invokes BaseFileUtils.writeStringToFile as shown above.

Analysis

See above. TL;DR: file I/O is asymmetric with respect to charset. Files are read as UTF-8 but written back using the system default charset.
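
To illustrate why the mismatch is destructive, here is a minimal sketch that encodes a string with one charset and decodes it with another. It is not B.A.S.E. code: the class name, the roundTrip helper and the sample lang entry are made up for illustration, and ISO-8859-1 is only a stand-in for whatever the default charset is on the affected machine (the actual locale is unknown).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of B.A.S.E.
final class CharsetMismatchDemo {

    // Encodes the text with writeCharset, then decodes the resulting bytes with
    // readCharset, mimicking a write/read pair that disagrees on the charset.
    static String roundTrip(String text, Charset writeCharset, Charset readCharset) {
        byte[] bytes = text.getBytes(writeCharset);
        return new String(bytes, readCharset);
    }

    public static void main(String[] args) {
        // Made-up zh_cn.lang entry containing two Chinese characters.
        String line = "item.example.name=\u94c1\u952d";

        // Same charset on both sides: the text survives.
        String symmetric = roundTrip(line, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(symmetric.equals(line));

        // ISO-8859-1 (assumed stand-in for a European default charset) cannot
        // represent the Chinese characters, so getBytes substitutes a replacement
        // byte and the original text cannot be recovered when reading as UTF-8.
        String asymmetric = roundTrip(line, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(asymmetric.equals(line));
    }
}
```

Because the write side drops the unmappable characters, the damage is permanent once the file has been rewritten; re-reading it with a different charset later does not bring the original text back.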

Proposed fix

FileUtils.writeStringToFile(file, string, java.nio.charset.StandardCharsets.UTF_8);
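
As a rough sketch of the same idea in isolation, the read and write paths can share a single charset constant so they cannot drift apart again. The class and method names below are hypothetical, and the sketch uses the JDK's java.nio.file.Files rather than Commons IO's FileUtils, so it is not a drop-in patch for BaseFileUtils.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not the actual BaseFileUtils implementation.
final class Utf8FileSketch {

    // One constant shared by both directions, so reads and writes stay symmetric.
    static final Charset FILE_CHARSET = StandardCharsets.UTF_8;

    // Reads the whole file and decodes it with the shared charset.
    static String readFile(Path path) throws IOException {
        return new String(Files.readAllBytes(path), FILE_CHARSET);
    }

    // Encodes the content with the shared charset and writes it out,
    // creating the file or replacing its previous contents.
    static void writeFile(Path path, String content) throws IOException {
        Files.write(path, content.getBytes(FILE_CHARSET));
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("zh_cn", ".lang");
        try {
            // Made-up lang entry with non-ASCII text.
            String content = "item.example.name=\u94c1\u952d\n";
            writeFile(tmp, content);
            // Reports whether the text survived the write/read cycle.
            System.out.println(content.equals(readFile(tmp)));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

With this shape, the result no longer depends on the locale of the machine running the game, which is the property the one-line fix above is after.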

Final words

N/A

SkySom commented 5 years ago

I thought I fixed this. Guess I only did the reading as UTF-8, not writing...