Open trap15 opened 8 years ago
Some proposals:
This works, but it is terribly unreadable.
GAME( 1992, wwmarine, 0, segac2, wwmarine, segac2_state, bloxeedc, ROT0, "Sega", "Waku Waku Marine \u308f\u304f\u308f\u304f\u30de\u30ea\u30f3", 0 )
GAME( 1991, soniccar, 0, segac2, soniccar, segac2_state, bloxeedc, ROT0, "Sega", "Waku Waku Sonic Patrol Car \u308f\u304f\u308f\u304f\u30bd\u30cb\u30c3\u30af\u30d1\u30c8\u30ab\u30fc", 0 )
This seems much better!
GAME( 1992, wwmarine, 0, segac2, wwmarine, segac2_state, bloxeedc, ROT0, "Sega", "Waku Waku Marine わくわくマリン", 0 )
GAME( 1991, soniccar, 0, segac2, soniccar, segac2_state, bloxeedc, ROT0, "Sega", "Waku Waku Sonic Patrol Car わくわくソニックパトカー", 0 )
I'd like to put it in two different fields, so something like:
GAME( 1992, wwmarine, 0, segac2, wwmarine, segac2_state, bloxeedc, ROT0, "Sega", "Waku Waku Marine", "わくわくマリン", 0 )
Or for fields where they're the same, maybe something like
GAME( 1992, wwmarine, 0, segac2, wwmarine, segac2_state, bloxeedc, ROT0, "Sega", "Waku Waku Marine", "", 0 )
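As a rough illustration of the two-field idea, here is a minimal C++ sketch in which an empty second string means "same as the first". The `TitleInfo` struct and the `display_title` helper are hypothetical names made up for this example; they are not part of MAME's actual driver structures.

```cpp
#include <string>

// Hypothetical record for illustration only; not MAME's real driver layout.
struct TitleInfo {
    std::string title;         // romanized / Latin title, always present
    std::string native_title;  // original-script title, "" when identical
};

// Picks the string to show. An empty native_title is treated as
// "same as title", matching the "" convention proposed above.
inline const std::string& display_title(const TitleInfo& info, bool prefer_native) {
    if (prefer_native && !info.native_title.empty())
        return info.native_title;
    return info.title;
}
```

With this convention a front end can always fall back to the romanized title, so entries that never had a non-Latin name need no special handling.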
What about the case of accented Latin text?
Could it be like this?
COMP( 1972, patinho, 0, 0, patinho_feio, patinho_feio, patinho_feio_state, patinho_feio, "Escola Politécnica - Universidade de São Paulo", "", "Patinho Feio", "", MACHINE_NO_SOUND_HW | MACHINE_NOT_WORKING)
Bear in mind that both the "company" and the "full name" fields may have Unicode characters... We may have to use some macros to make it nicer and cleaner. I can understand the idea of using null strings, but I'd prefer not to have them visible in the source at all.
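One way the "macros to hide the null strings" idea could look is sketched below. Everything here is an assumption made for illustration: `LATIN`, `NATIVE`, `MACHINE_TEXT`, `make_text` and `MachineText` are invented names, and the real GAME/COMP macros take many more arguments than this stand-in.

```cpp
#include <string>

// Hypothetical record: each text field carries a Latin form and an
// original-script form ("" meaning "same as the Latin form").
struct MachineText {
    std::string company, company_native;
    std::string fullname, fullname_native;
};

// Each helper macro expands to the PAIR of strings that one field needs,
// so the empty string never has to be typed in a driver entry.
#define LATIN(s)            s, ""
#define NATIVE(latin, orig) latin, orig

inline MachineText make_text(const char* company, const char* company_native,
                             const char* fullname, const char* fullname_native) {
    return MachineText{company, company_native, fullname, fullname_native};
}

// Stand-in for a GAME/COMP-style macro. Its two arguments are expanded
// before substitution, so make_text receives four strings.
#define MACHINE_TEXT(company, fullname) make_text(company, fullname)
```

An entry would then read `MACHINE_TEXT(LATIN("Sega"), NATIVE("Waku Waku Marine", "わくわくマリン"))`, with the empty native company string supplied by `LATIN` rather than written out.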
As things stand, I'm inclined to think that the current MAME categorization system isn't in any way adequate by 2016 standards. Random examples:
Bottom line: I don't like the software list / XML system either, as a matter of personal taste (namely, it is unreadable to the human eye in raw format), but it certainly handles these "optional" things like non-romaji alphabets just fine.
@angelosa Could you please open a new issue on GitHub specifically about the broader topic of improving the way MAME stores metadata in general? I think you've got valid points and I would add some more comments on that, but I'd prefer to keep this issue focused on the Unicode strings and have all other discussion take place in a separate issue, for the sake of clarity and better organization of the current issues at hand.
Could use compound literals?
GAME_ADD((GameInfo){
.name="わくわくマリン",
.name_romanized="Waku Waku Marine",
.year=1992,
[etc...]
})
In which case defaults end up being 0/NULL. More readable than XML, and keeps it in the driver.
The file src/mame/drivers/cps2.cpp has got almost 300 GAME entries... It would be good to keep all metadata in a single line if possible. But sometimes, indeed the lines get really long!
A tool called pyftsubset in the fonttools project (https://github.com/behdad/fonttools/) may be able to generate the needed Noto font subsetting that I suggested on IRC earlier today for unicode metadata strings in MAME.
Noto is a libre font family being developed by Google to have a very wide glyph coverage. So it is essentially a font designed to fulfill needs of ambitious multi-language projects like this.
But the problem is that such a font family has very large file sizes. So the idea is that we should generate a minimal font subset that contains only the glyphs needed. Before packaging a new MAME release, we would have to run an automatic subsetting script that would list all Unicode codepoints used in metadata strings declared in MAME's codebase. The generated minimal font file would then be added as a program resource and loaded by default in the MAME UI.
This would guarantee that all metadata would be properly rendered in our user interface.
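The codepoint-listing step of such a script could be sketched as follows. This is only an illustration under stated assumptions: the function names are invented, the metadata strings are assumed to be UTF-8 and already extracted from the drivers, and the decoder is deliberately minimal (it skips malformed bytes and does not reject overlong encodings). The output is a comma-separated `U+XXXX` list, the kind of list that pyftsubset's `--unicodes` option is documented to accept; check the fonttools documentation before relying on that.

```cpp
#include <cstddef>
#include <cstdio>
#include <set>
#include <string>
#include <vector>

// Collects the distinct code points used in a list of UTF-8 strings.
// Minimal decoder: invalid lead or continuation bytes are skipped.
inline std::set<char32_t> collect_codepoints(const std::vector<std::string>& strings) {
    std::set<char32_t> out;
    for (const std::string& s : strings) {
        std::size_t i = 0;
        while (i < s.size()) {
            const unsigned char b = static_cast<unsigned char>(s[i]);
            std::size_t extra = 0;
            char32_t cp = 0;
            if (b < 0x80)                { cp = b;        extra = 0; }
            else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
            else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
            else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
            else { ++i; continue; }            // stray continuation byte
            if (i + extra >= s.size()) break;  // truncated sequence at end
            bool ok = true;
            for (std::size_t k = 1; k <= extra; ++k) {
                const unsigned char c = static_cast<unsigned char>(s[i + k]);
                if ((c & 0xC0) != 0x80) { ok = false; break; }
                cp = (cp << 6) | (c & 0x3F);
            }
            if (!ok) { ++i; continue; }        // resync on the next byte
            out.insert(cp);
            i += extra + 1;
        }
    }
    return out;
}

// Formats the set as "U+0041,U+308F,...", in ascending code point order.
inline std::string format_unicodes(const std::set<char32_t>& cps) {
    std::string out;
    for (char32_t cp : cps) {
        char buf[16];
        std::snprintf(buf, sizeof buf, "U+%04X", static_cast<unsigned>(cp));
        if (!out.empty()) out += ',';
        out += buf;
    }
    return out;
}
```

Because the result is a set, each glyph is requested once no matter how many titles use it, which is what keeps the subset small.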
Oh! And by the way... here's the Noto libre font project website: https://www.google.com/get/noto/
Using romanizations for game metadata is fairly lossy and at the very least it's poor documentation. I propose there should be a secondary title string for the original language, keeping the romanized version.
Ideally, the developer could also use UTF-8 to write it in, instead of using escaped unicode. For this, srcclean needs to stop horribly destroying unicode for no reason.