Scirra / Construct-bugs

Public bug report submissions for Construct 3 and Construct Animate. Please read the guidelines then click the 'Issues' tab to get started.
https://www.construct.net

Hindi Sprite Font issue #5622

Closed elmanuelv closed 2 years ago

elmanuelv commented 2 years ago

Problem description

Hi! I'm developing a game about a Buddhist monk, and since most Buddhists live in India, I'm having the game translated into Hindi. I already have all the translations and I'm currently working on placing them in the game.

The game is in pixel art, so for the text I use a Sprite Font. Since Hindi has completely different characters, I made a separate Sprite Font for it.

These are the characters (I haven't completed it since I assume the rest will have issues too) 01234567 89:?!.,_ अआएईऍऎऐइ ओऑऒऊऔउबभ चछडढफफ़गघ ग़हजझकखख़ल ळऌऴॡमनङञ णऩॐपक़रऋॠ ऱसशषटतठद थधड़ढ़वयय़ज़ संसहेजें

This is the sprite: [image]

And this is what the Sprite Font object gives me when I put the same characters in the same order: [image]

As you can see, some of the characters are misplaced and some don't even show.

I hope it's not a complicated issue, since it's probably just a font/language problem. I also hope you can fix it soon, because we'll be porting and publishing the game on consoles this year :)

Attach a .c3p

This is a project that only has the Sprite Font object and a Sprite on its side, showing what it should look like. https://www.dropbox.com/s/3kcp0oou1v2a8ha/Hindi%20font.c3p?dl=0

Steps to reproduce

Open the attached project. The problem is visible in the editor.

Observed result

The font is not placing some characters correctly.

Expected result

The Sprite Font on the right should show the same characters as the sprite on the left.

AshleyScirra commented 2 years ago

This turns out to be a complicated problem involving Unicode representation of some characters.

The character "जें" demonstrates the problem - it is actually stored as three separate Unicode characters (code points): 'ज', 'े', 'ं'. I am not sure how that will appear on all systems, as it's an unusual sequence of Unicode characters. I'm afraid I don't know any Hindi whatsoever, but it looks like a base character followed by two additional characters that combine with the base character. While most of the character set does correctly split into individual characters, there are a few on the last row that use these kinds of combinations and don't appear to have a single character defined by Unicode. That means Sprite Font counts each of them as three separate characters that each map to their own image, rather than treating it as one character.
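
The split described above can be checked outside Construct. The following is a minimal sketch in Python (not Construct's code), using only the standard `unicodedata` module; the helper name `describe` is made up for illustration. It lists the code points that make up that one visible character:

```python
import unicodedata

# The visible character from the comment above, written as escapes so the
# three code points are unambiguous regardless of how the editor renders them.
text = "\u091c\u0947\u0902"

def describe(s):
    """Return (code point, general category, Unicode name) for each code point."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
        for ch in s
    ]

for row in describe(text):
    print(row)
```

The first entry is a letter (category `Lo`) and the other two are combining marks (category `Mn`), which matches the "base character followed by two additional characters" description.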

I did a quick bit of research and it sounds like what we really want is to split according to the idea of "grapheme clusters". I found a library that can do this, and implemented that for processing SpriteFont (and a few other things like word wrap, which has a similar problem it turns out). It looks like it fixes it, but I'm not sure of the compatibility implications of this, so I'll save the fix for the next release cycle. So the first beta release after the next stable release should have the fix, which will be a few weeks away.
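
To illustrate the grapheme cluster idea, here is a rough Python sketch. It is neither the library used for the fix nor the full Unicode text segmentation algorithm (UAX #29). As a simplifying assumption, it only attaches combining marks (general category starting with `M`) to the preceding base character, which is enough for the example in this issue. The function name `rough_clusters` is hypothetical.

```python
import unicodedata

def rough_clusters(s):
    """Group each base character with the combining marks that follow it.

    Simplified approximation of grapheme clusters: a code point whose
    general category starts with "M" (Mn, Mc, Me) joins the previous cluster.
    """
    clusters = []
    for ch in s:
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch      # combining mark: extend the current cluster
        else:
            clusters.append(ch)     # base character: start a new cluster
    return clusters

# Last group of the character set in the report, written as escapes:
# SA, HA + vowel sign E, JA + vowel sign E + anusvara.
sample = "\u0938\u0939\u0947\u091c\u0947\u0902"

print(len(sample))                  # number of code points
print(len(rough_clusters(sample)))  # number of approximate clusters
```

Splitting by code point gives six entries here, while the approximate clustering gives three. A sprite font that looks up one image per cluster rather than one per code point would then treat each combined character as a single glyph.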

elmanuelv commented 2 years ago

Nice! Thank you so much :)

Yes hahah sounds complicated, I don't know any Hindi either so it's been a real challenge to do this.

Looking forward to the update! I'll let you know if it works ;)

Best regards,

Manuel

On Wed, 13 Apr 2022 at 9:00, Ashley (Scirra) wrote:

Closed #5622 https://github.com/Scirra/Construct-3-bugs/issues/5622.
