d0k3 / GodMode9

GodMode9 Explorer - A full access file browser for the Nintendo 3DS console
GNU General Public License v3.0

UTF-8 filenames break GodMode9 #368

NWPlayer123 closed this issue 6 years ago

NWPlayer123 commented 6 years ago

I was trying to extract an NDS ROM (Nintendogs Dachshund and Friends), which has a file called USAまだっす.txt in the root. GodMode9 gets confused by it and refuses to copy the rest to my SD card.

d0k3 commented 6 years ago

I'll look into it. I guess that's a DS-specific problem; it was already solved for 3DS files.

urherenow commented 6 years ago

Not that I know what I'm doing, but does this commit on fastboot3ds help at all? https://github.com/derrekr/fastboot3DS/commit/c23b0e1cc71f1d28e3fb7cbf01197c4cdcb05e7d

d0k3 commented 6 years ago

@urherenow - that's only about the FatFS lib, and GM9 is already up to date with that. That can't be the problem here.

@NWPlayer123 - I tried it, and, yup, it's weird. GM9 should be perfectly capable of handling UTF-8 in copy operations. Looking into it further...

d0k3 commented 6 years ago

Okay, after looking into it some more... GM9 is perfectly capable of handling standard UTF-8 filenames and even UTF-16. Try the 3DS Fire Emblem Fates games, for example (actually, try one of them, just to make sure there's no problem on your end).

Now, as for that one... what's that at the beginning? Emoji letters? Unsure if FAT can even handle that. Can you name a tool that would actually extract that file from the DS rom properly?

NWPlayer123 commented 6 years ago

Like I said, it's called USAまだっす.txt, with a Unicode "USA". Tinke happily extracts it. @d0k3

d0k3 commented 6 years ago

Okay, keeping you up to date on my progress. This is not UTF-8, it's Shift-JIS, which is bad. A proper copy to FAT would mean proper UTF-8 <-> Shift-JIS conversion, and that's not exactly trivial. Maybe I'll look into just a workaround, seeing how this is not exactly common.
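To illustrate why the Shift-JIS name trips up UTF-8 handling, here is a minimal Python sketch. It is not GM9 code (GM9 is written in C and runs without a host OS); the helper `is_valid_utf8` is hypothetical, and the conversion at the end only works because Python ships a full Shift-JIS mapping table, which is exactly the part that is non-trivial on the 3DS.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The filename from this thread, encoded both ways.
name = "USAまだっす.txt"
sjis_bytes = name.encode("shift_jis")
utf8_bytes = name.encode("utf-8")

# The same name is a different byte sequence in each encoding, and the
# Shift-JIS bytes are not a valid UTF-8 sequence at all.
utf8_ok = is_valid_utf8(utf8_bytes)
sjis_ok = is_valid_utf8(sjis_bytes)

# A "proper" copy needs a table-driven conversion like this one.
converted = sjis_bytes.decode("shift_jis").encode("utf-8")
```

The check itself is cheap; the expensive part is the lookup table behind `decode("shift_jis")`.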

ghost commented 6 years ago

I think that a better fallback is to screw all Unicode and display raw bytes with CP437.
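A rough Python sketch of what that CP437 fallback could look like, assuming the idea is to map every raw byte to exactly one displayable character. The function name `show_raw_cp437` is made up for illustration. Python's `cp437` codec defines all 256 byte values, so the mapping never fails and can be reversed.

```python
def show_raw_cp437(raw: bytes) -> str:
    """Render raw filename bytes one-to-one as CP437 characters (hypothetical helper)."""
    return raw.decode("cp437")


raw = "まだっす.txt".encode("shift_jis")
shown = show_raw_cp437(raw)

# One character per byte, and the original bytes can be recovered.
same_length = len(shown) == len(raw)
round_trip = shown.encode("cp437") == raw
```

The result is unreadable for Japanese names, but no information is lost and no conversion table beyond 256 entries is needed.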

ParzivalWolfram commented 6 years ago

Yeah, Shift-JIS is a pain. I'd recommend just "converting" by either filename truncation or just allowing it to become mojibake.
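One pitfall with byte-level truncation, sketched in Python under the assumption that "truncation" means dropping everything that is not ASCII: Shift-JIS trail bytes can fall in the ASCII range, so a naive filter may leave stray characters such as a backslash in the name. The helper `ascii_only_name` below is hypothetical and skips double-byte characters as pairs instead.

```python
def ascii_only_name(raw: bytes) -> str:
    """Drop Shift-JIS characters, keeping only single-byte ASCII (hypothetical helper)."""
    out = bytearray()
    i = 0
    while i < len(raw):
        b = raw[i]
        if 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC:
            i += 2  # lead byte of a double-byte character: skip it and its trail byte
        elif b < 0x80:
            out.append(b)  # plain ASCII, keep
            i += 1
        else:
            i += 1  # single-byte half-width katakana or undefined byte: drop
    return out.decode("ascii")


tricky = "ソ.txt".encode("shift_jis")  # the trail byte of "ソ" is 0x5C, a backslash

naive = bytes(b for b in tricky if b < 0x80).decode("ascii")
careful = ascii_only_name(tricky)
```

Even the careful version can map two different names to the same result, so truncation alone does not guarantee unique filenames.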

ghost commented 6 years ago

Seriously, why are they still using an encoding other than UTF-8? In China, there's the same problem with GB2312 and GBK. NIH syndrome at its finest.

knight-ryu12 commented 6 years ago

JP here, because Windows.

ParzivalWolfram commented 6 years ago

Windows usually has problems with other languages due to avoiding Unicode. Locales are a pain in the ass...

d0k3 commented 6 years ago

Explanation: Yup, this is not perfect. What it does is replace any filename in Shift-JIS with the UTF-8 name "日本語[position_in_hex].sjis". That way, at least all filenames are unique and can be copied to the FAT filesystem.

The Japanese characters mean "Japanese", in case you wonder.

And, no, a Shift-JIS converter is 100% out of the question.
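The workaround described above can be sketched in Python roughly as follows. This is not the actual GM9 implementation: the function name `fallback_name`, the use of a failed UTF-8 decode as the detection step, and the 8-digit uppercase hex format are all assumptions. The thread only specifies the general shape "日本語[position_in_hex].sjis", and the brackets are treated here as a placeholder.

```python
def fallback_name(raw: bytes, position: int) -> str:
    """Return the name unchanged if it is valid UTF-8, else a unique placeholder.

    `position` stands for wherever the entry sits in the ROM (assumed to be an
    integer offset); distinct positions give distinct placeholder names.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Hex width and case are illustrative choices, not taken from GM9.
        return f"日本語{position:08X}.sjis"


plain = fallback_name(b"readme.txt", 0x10)
replaced = fallback_name("まだっす.txt".encode("shift_jis"), 0x1A2B)
```

Because the placeholder depends only on the position, no Shift-JIS table is needed, at the cost of losing the original name.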