stella-emu / stella

A multi-platform Atari 2600 Emulator
https://stella-emu.github.io
GNU General Public License v2.0
625 stars 111 forks source link

Stella doesn't load ROM with special characters in name #751

Open thrust26 opened 3 years ago

thrust26 commented 3 years ago

Stella seems to have problems with certain special chars (non-UTF8) in ROM names (File not found/readable).

E.g. the 'ż' in the attached file is not liked. Gruniożerca.zip

sa666666 commented 3 years ago

Works for me in Linux at least. The filename is not fully displayed in the launcher, since the character set doesn't contain that character. But Stella is still able to load it. So if this a Windows-only thing, I suspect it's somewhere in src/windows/FSNodeWindows.

thrust26 commented 3 years ago

Thanks, I was already wondering why James hadn't noticed that.

sa666666 commented 3 years ago

I will try to test on some other systems later today.

thrust26 commented 3 years ago

In the project settings "Multi-Byte Character Set" (MBCS) is enabled, not Unicode (which doesn't compile).

Could that be the problem?

sa666666 commented 3 years ago

It's possible, but I've never really played with it. What happens when you force the Unicode #ifdef to be compiled? Is this the part that won't compile?

thrust26 commented 3 years ago

No, lots of stuff in FilesystemNodeWINDOWS is breaking, especially string.c_str() isn't accepted anymore (LPCWSTR is expected). I will try to fix that...

SerialPortWINDOWS has the same problems.

sa666666 commented 3 years ago

Yeah, all that code is very ASCII-centric. I don't know that it's ever been tested with other character sets.

sa666666 commented 3 years ago

Incidentally, this is why I'd hoped to move to the std::filesystem stuff in C++17, to eliminate a big chunk of this platform-specific code. But it seems not all compilers fully support it yet.

thrust26 commented 3 years ago

Grrr, got the code compiled with Unicode, but it still doesn't load the ROM. The 'ż' is converted to a simple 'z'.

sa666666 commented 3 years ago

I think we need to drill down to see exactly what is failing here. Is it on our side, or is it happening somewhere in one of the functions that Windows is using?

Sorry I can't help more with this; currently recording lectures for next semester. Maybe we have to bump this one to post 6.5.

thrust26 commented 3 years ago

No hurry, this can wait. I think Windows does it correct, but Stella partially converts special chars. Either we keep the special chars completely or we ignore them completely. Currently it seems like a mix of it.

thrust26 commented 3 years ago

I wonder what Linux is doing. Since we are using strings, it cannot store the special chars. Which means it must match 'ż' with 'z' somehow.

thrust26 commented 3 years ago

Meanwhile I found a workaround. In addFile() we can use the 8.3 filename for _path and the long filename for _displayName.

@sa666666 What do you think?

thrust26 commented 3 years ago

@sa666666 This fix would allow loading files with specials chars.

BUT when saving a file, the _path name is used if no entry in our DB exists for the ROM. This causes ugly names like 'Grunio~01.pro'. We would

Actually I don't understand why the code sometimes uses the _path and sometimes the _displayName variables. Sometimes _displayName is even overwritten by the name derived from _path.

sa666666 commented 3 years ago

I have some ideas, but no time right now. I will come back to this for the next release.

sa666666 commented 2 years ago

I plan to work on this one next. Ideally, there's a way to turn all special characters into their ASCII equivalent. That's the best we can do, since Stella is essentially ASCII-only; the fonts, font renderer, etc all depend on this.

thrust26 commented 2 years ago

Might become complicated. How about replacing them all e.g. with '_'?

But I suppose both ideas do not help when it comes to file access.

sa666666 commented 2 years ago

I'm not looking to hand-code a parser for this. I'm hoping to find something in C++11 or above locale stuff to handle it. Basically, any umlauts, graves, etc would be removed and converted to the 'base' ASCII character. I don't know if this is even possible. Barring that, converting to '_' might work too.

thrust26 commented 2 years ago

Remember that it is not so much of a display problem, but a file access problem.

sa666666 commented 2 years ago

Right. For me in Linux, it is strictly a display issue. The file load part is fine. So this could get complicated.

thrust26 commented 2 years ago

Using 8.3 filenames might help, but a user can disable it. And probably Microsoft will eventually do so too.