bbbradsmith / nsfplay

Nintendo NES sound file NSF music player
https://bbbradsmith.github.io/nsfplay/

nsfplay unable to read Japanese path/file names? #76

Closed etoubleh closed 1 year ago

etoubleh commented 1 year ago

Title. I get an error if I try to load any .nsf files that have a Japanese name or are in a folder with a Japanese name.

bbbradsmith commented 1 year ago

I can duplicate this problem. I'm surprised because I thought I had used it with Japanese filenames in the past...

However, it seems none of the filename handling was ever updated to Unicode, so this could be a little tricky to solve. I haven't tested this yet, but I suspect that if the Windows locale is Japanese, it uses a Shift-JIS code page by default for non-Unicode applications, which I think would permit it to load these filenames. (Not sure, though, and switching locales is inconvenient, so I won't test it right now.)

Anyway, this is an important problem to fix. I'll try to get to it soon.

bbbradsmith commented 1 year ago

As an attempt at a quick stopgap solution, I tried setlocale with both ".932" (Shift-JIS) and ".UTF-8", but the former isn't sufficient for this, and the latter, though I'm not sure, would probably require dropping Windows 7 support. :S So anyway... it probably needs the bigger/full solution.

bbbradsmith commented 1 year ago

Okay, I think I have this fixed in bbefb0a, which should now use Windows native Unicode filenames (wchar_t) at all endpoints (open file, save file, etc.) and internally stores them as UTF-8.

Can you test the latest artifact and confirm whether this fixes it for you?

https://ci.appveyor.com/project/bbbradsmith/nsfplay/branch/master/artifacts

c-yan commented 1 year ago

In general, the presence or absence of a backslash character in the second byte determines whether or not a problem occurs in Japanese, so the fact that a particular filename works cannot be used to determine whether or not there is a problem.

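@etoubleh If every file fails, I suspect the directory name contains a problem character ("dame-moji"). Could you share a specific path that failed? I would like to reproduce it.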

bbbradsmith commented 1 year ago

> In general, the presence or absence of a backslash character in the second byte determines whether or not a problem occurs in Japanese, so the fact that a particular filename works cannot be used to determine whether or not there is a problem.

I don't understand this response. Have you tested the latest artifact and found a flaw with it, or is this a general statement about something else? (If you haven't tested it, could you try it and report back?)

I don't really understand what you mean by a "backslash character in the second byte". The problem as I understood it is simply that the program had never been updated to use modern Unicode filenames, so the names it was getting from the old 8-bit character interfaces were being mangled (in general, any character not on the local 8-bit code page would come through the API as a ? byte).
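
As a rough illustration of the mangling described here, the following Python sketch pushes a filename through an 8-bit code page and back. It is not taken from nsfplay; `through_legacy_api` is a hypothetical helper, `cp1252` is assumed to stand in for a Western (for example Canada/English) ANSI code page, and the real Windows conversion may apply best-fit mappings that this does not model.

```python
def through_legacy_api(name: str, codepage: str) -> str:
    """Simulate a filename passing through an 8-bit code page interface.

    Characters the code page cannot represent are replaced with '?',
    which approximates the mangled names described in the thread.
    (Hypothetical helper, for illustration only.)
    """
    return name.encode(codepage, errors="replace").decode(codepage)


# Assumption: cp1252 approximates a Western ANSI code page.
mangled = through_legacy_api("日本語.nsf", "cp1252")   # "???.nsf"

# cp932 is the Windows Shift-JIS code page; the same name survives intact.
intact = through_legacy_api("日本語.nsf", "cp932")     # "日本語.nsf"
```

Once the name has become `???.nsf`, there is no way to recover the original path, so the open call fails.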

My solution was to use the native Unicode file operations when interacting with Windows; otherwise, filenames are now stored internally as UTF-8 so I can keep the old data structures. The UTF-8 is expanded back to Windows Unicode when interacting with files again.
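
The storage scheme in the previous paragraph can be sketched in Python as a pair of conversions at the boundary. This is only an illustration under stated assumptions: the function names are hypothetical, and UTF-16-LE bytes stand in for the wchar_t strings that the Windows wide-character file APIs exchange.

```python
def to_internal(native_utf16: bytes) -> bytes:
    """Convert a path received from the OS (UTF-16-LE here) to UTF-8 for storage."""
    return native_utf16.decode("utf-16-le").encode("utf-8")


def to_native(internal_utf8: bytes) -> bytes:
    """Expand a stored UTF-8 path back to UTF-16-LE just before a file call."""
    return internal_utf8.decode("utf-8").encode("utf-16-le")


# A path as the OS might hand it over (hypothetical example path).
from_os = "日本語\\☂.nsf".encode("utf-16-le")

stored = to_internal(from_os)      # plain bytes, fits existing char-based structures
round_trip = to_native(stored)     # identical to what the OS originally provided
```

One reason UTF-8 suits the old byte-oriented data structures is that every byte of a multi-byte UTF-8 sequence is 0x80 or higher, so ASCII separators such as `\` and `.` never appear inside an encoded character.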

I'm not sure I understand any situation where there is a second byte with a backslash. What does this mean?

While revising this, I also added support for Unicode display of NSF metadata, titles, etc., with a fallback to Shift-JIS if the data isn't valid UTF-8. This allows many older NSFs with Japanese titles to display correctly now, without having to use a Japanese locale setting to get the Shift-JIS code page.
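
A minimal Python sketch of the "UTF-8 first, Shift-JIS fallback" idea for metadata follows. It is not nsfplay's implementation; `decode_metadata` is a hypothetical name, and the choice to replace undecodable bytes in the fallback is an assumption made for this example.

```python
def decode_metadata(raw: bytes) -> str:
    """Decode a metadata field, preferring UTF-8 and falling back to Shift-JIS.

    Hypothetical helper: strict UTF-8 decoding is tried first; if the bytes
    are not valid UTF-8, they are treated as Shift-JIS instead.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: undecodable bytes become U+FFFD rather than raising.
        return raw.decode("shift_jis", errors="replace")


title = "ドラゴン"
from_new_file = decode_metadata(title.encode("utf-8"))      # decoded as UTF-8
from_old_file = decode_metadata(title.encode("shift_jis"))  # falls back to Shift-JIS
```

The approach works because Shift-JIS text is rarely valid UTF-8 by accident, although a short field could in principle be ambiguous.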

c-yan commented 1 year ago

> I don't understand this response. Have you tested the latest artifact and found a flaw with it, or is this a general statement about something else? (If you haven't tested it, could you try it and report back?)

I commented on the following sentence to explain that I'm not surprised because even if it had worked, it would not have worked if the second byte of the Japanese filename did not have a backslash in it.

> I'm surprised because I thought I had used it with Japanese filenames in the past...

In addition, I have not tested it because I have not yet received a response to my request to give me the actual path to test it in Japanese.

bbbradsmith commented 1 year ago

> In addition, I have not tested it because I have not yet received a response to my request to give me the actual path to test it in Japanese.

The bug affected any filename or path containing a character outside the local code page for legacy 8-bit character paths. I don't think the bug was specific to Japanese filenames, but more generally affected any Unicode filename characters that aren't available for that locale. etoubleh did not say what locale they are using; mine is currently set to Canada/English.

If you want a specific example, trying to open "日本語.nsf" or "日本語\file.nsf" both failed on my locale. If those weren't failing on a Japanese locale, something like "☂.nsf" should instead fail. Any valid unicode filename that wasn't possible in the older pre-unicode file system for that locale should fail. The fix was to implement modern unicode path support.
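
To make the examples above concrete, this Python sketch checks whether a filename can be represented in a given 8-bit code page, which roughly predicts whether the old version could have opened it on that locale. `representable` is a hypothetical helper, and `cp1252` is assumed to stand in for a Canada/English ANSI code page.

```python
def representable(name: str, codepage: str) -> bool:
    """Return True if every character of `name` exists in `codepage`."""
    try:
        name.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


# Assumption: cp1252 ~ Western locale, cp932 ~ Japanese locale (Shift-JIS).
checks = {
    ("日本語.nsf", "cp1252"): representable("日本語.nsf", "cp1252"),  # False
    ("日本語.nsf", "cp932"): representable("日本語.nsf", "cp932"),    # True
    ("☂.nsf", "cp932"): representable("☂.nsf", "cp932"),              # False
}
```

Names that come back `False` for the active code page are useful test cases for confirming the fix on that locale.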

When I said that I believed I had seen it work in the past, I think I had tested it using a Japanese locale, which I believe supported shift-JIS filenames without unicode. However, it is not something I have verified, and really the switch to unicode should solve the problem for all locales. It is really only an academic question whether it may have worked in the past. I think "日本語.nsf" did work with a Japanese locale in the old version, because it could be represented in shift-JIS. However, if I remember this incorrectly, it's not really important because proper unicode support is now available in the new version.

So, if you would like to test something, just rename some folders and/or NSF files with various unicode characters in them and see if they will open or not. If you can find something that still fails, that would be helpful to fix, and even if you don't, it would be helpful to have someone else's confirmation that the fix works.

> ...even if it had worked, it would not have worked if the second byte of the Japanese filename did not have a backslash in it.

Can you clarify what you mean by this? Why would a backslash matter, and why would it not have been in the second byte? I don't understand what either of those things would have to do with the bug.

c-yan commented 1 year ago

The above is a general statement, but I just tested a path containing "ソ", whose second byte in Shift-JIS is a backslash (0x5C), and the problem did not occur even in 2.5. I guess I can't follow up on the test without knowing the actual path.
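
For readers unfamiliar with the "backslash in the second byte" issue, this Python sketch shows why "ソ" is the classic test character: in Shift-JIS it encodes as the bytes 0x83 0x5C, and 0x5C is also the ASCII backslash. The two split functions are hypothetical and only illustrate how byte-oriented path handling can go wrong; as noted in this thread, this turned out not to be the cause of the nsfplay bug.

```python
def naive_split_bytes(raw_path: bytes) -> list[bytes]:
    """Split on the 0x5C byte without regard to multi-byte characters."""
    return raw_path.split(b"\\")


def split_after_decoding(raw_path: bytes) -> list[str]:
    """Decode the Shift-JIS bytes first, then split on the real separator."""
    return raw_path.decode("cp932").split("\\")


so_bytes = "ソ".encode("cp932")                    # b"\x83\x5c"
raw = "music\\ソ\\song.nsf".encode("cp932")

broken = naive_split_bytes(raw)       # the trail byte of ソ is mistaken for a separator
correct = split_after_decoding(raw)   # ["music", "ソ", "song.nsf"]
```

Characters whose trail byte is not 0x5C pass through the naive version unharmed, which is why one working Japanese filename says little about whether such code is safe.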

c-yan commented 1 year ago

From your explanation, it sounds like this had nothing to do with the general issue I had in mind. I'm sorry, please disregard it.

bbbradsmith commented 1 year ago

Thank you for testing with 2.5.

bbbradsmith commented 1 year ago

This should be fixed in the 2.6 release.