Closed Kaffeine closed 10 months ago
Amazing thanks for working on this.
I was delaying the addition of the skin files as long as possible. Even considering adding it to a follow up pr in ddnet. Just to keep this way too big pr smaller. There are so many code changes I did not want to spam in a bunch of json changes. But whatever now that you did it I guess let's go.
Your changes look good to me. I would merge it as is. And then squash it into one commit. Then I will keep rebasing those two commits (mine and yours) in my ddnet pr. I will probably add a third commit for my new fixes on top. It will be all a bit more messy but should work.
Are you cool with me merging this? Or are you still working on it?
I fixed what I could at the given time :-) If I'll have time to do more work, I'll open a new PR. Feel free to squash/rebase/whatever you think is appropriate :)
I fixed what I could at the given time :-)
If I'll have time to do more work, I'll open a new PR. Feel free to squash/rebase/whatever you think is appropriate :)
We could also go for two commits. One for code and one with just the skin files. Do you wanna squash it? Otherwise I have to make up a commit message in your name which feels wrong hehe
I can't access my pc today to change the commits. I don't see why you squash everything instead of keeping individual reviewable commits (which also can be merged into DDNet individually, "on ready"). E.g. those commits are 100% atomic and can be merged at any time (I see @heinrich5991 won't want to accept unused code but if we are certain that we want 0.7 skins, those commits would be necessary sooner or later and it is pragmatic to merge them sooner to reduce the time costs for both developer and reviewer):
I think I need just a few more hours to get 0.7 skins rendered properly (right now the main issue is that I don't know how to apply DDNet refactoring to 0.7 RenderHand (https://github.com/teeworlds/teeworlds/commit/ecea6edcf70f34a6a8afb7f1652a14f457f3bc46). I'm not familiar with graphics in general and DDNet QuadContainers particularly.
Anyway, do whatever you want, e.g. squash everything into the only 0.7 support commit and drop the commit with data if you see it redundant just yet.
Okay then let's go without squash.
Thanks again c:
str_utf8_copy_num
looks suspect. Where is it used? You should usually limit strings by storage size, not some other metric.
str_utf8_copy_num
looks suspect. Where is it used? You should usually limit strings by storage size, not some other metric.
@heinrich5991, upstream commit https://github.com/teeworlds/teeworlds/commit/ec59e654c357c683e91a3f31bdc817591bb187d9
In the upstream it is used here: https://github.com/teeworlds/teeworlds/blob/master/src/game/server/gamecontext.cpp#L1084 And in this PR (now included into https://github.com/ddnet/ddnet/pull/5949) it is used in commit https://github.com/ChillerDragon/ddnet/pull/3/commits/caa82ac75d11a3a6174f560cf515001e83bef4d1
I understand the ustream idea to copy characters instead of bytes but I don't know if it is the correct way on the protocol level.
I see that probably we want to be consistent and use this also for 0.6 skin name or don't use it at all.
@heinrich5991 please let me (us?) know if we should rework the code to drop UTF-8 skin part names support. Personally I'd don't see the need in special characters, so I'd just document that the skin names must be latin1 / ASCII. On the other hand, it can lead to compatibility issues if someone went ahead and used non-ascii in (their) 0.7 skins. Not a big deal though.
Note: the current "0.7 support for DDNet" code uses hardcoded arrays suitable only for ASCII names; @ChillerDragon, maybe you can comment on this (maybe you already planned to drop UTF-8 support)?
DDNet and this 07_client
use hardcoded 24
size limit`: https://github.com/ChillerDragon/ddnet/blob/pr_07_client/src/game/client/components/skins7.h#L31
Upstream Teeworlds uses MAX_SKIN_ARRAY_SIZE
which is MAX_SKIN_LENGTH * UTF8_BYTE_LENGTH + 1 = 24 * 4 + 1 = 97
, https://github.com/teeworlds/teeworlds/blob/master/src/game/client/components/skins.h#L29.
I don't know the right answer. Limiting strings to code units on a protocol level is extremely suspect to me. However, doing it a different way would break strict compatibility with Teeworlds. Maybe we can get away with it?
I'd say only supporting ASCII is a bit harsh though. If we decide to deviate from Teeworlds, I'd treat it the same as we do things for 0.6: Keep the array sizes sane for ASCII characters but still support UTF-8 in them.
I'm currently tending towards smaller arrays but still supporting UTF-8. What do you think?
I'm currently tending towards smaller arrays but still supporting UTF-8. What do you think?
"UTF-8 support" means that we should correctly truncate the strings, and it makes sense to use the new str_utf8_copy_num()
for that.
We can use the 0.7 code almost as is, and adjust the named constants to keep 0.6 limits/buffer sizes:
MAX_SKIN_LENGTH = 23,
-MAX_SKIN_ARRAY_SIZE = MAX_SKIN_LENGTH * UTF8_BYTE_LENGTH + 1,
+MAX_SKIN_ARRAY_SIZE = MAX_SKIN_LENGTH + 1,
@Kaffeine: On the other hand, it can lead to compatibility issues if someone went ahead and used non-ascii in (their) 0.7 skins.
I just realized that 0.7 allows one more character than 0.6: it moves the null termination behind the 24
code units limit. E.g. PartNameCount > MAX_SKIN_LENGTH
leads to error., while 24 code units is fine.
https://github.com/teeworlds/teeworlds/blob/master/src/game/client/components/skins.cpp#L34
Another but similar thing here is that 0.7 also adjusted client displayed name and clan limitations, and those changes make 100% sense for me: https://github.com/teeworlds/teeworlds/blob/master/src/engine/shared/protocol.h#L93-L98
There are mods which use Clan string for a different purpose (e.g. Infclass uses it to display player class name but translated class name never fits in 12 bytes so we have to always use English name here).
Our str_copy
handles UTF-8 truncation correctly. The only value-add of str_utf8_copy_num
is the weird codepoint counting.
Can you give an example for a translation that did not fit?
Our
str_copy
handles UTF-8 truncation correctly. The only value-add ofstr_utf8_copy_num
is the weird codepoint counting.
0.7 increases the array sizes to allow the same count of non-ASCII characters, e.g. we had MAX_CLAN_LENGTH=12
in 0.6 which means 11 ASCII characters + null termination. In 0.7 the buffer is 12 * 4 + 1
bytes, and the game is trying to limit the clan name by 12 characters, no matter if it would be 12 ASCII bytes or 12 of UTF-8 4 bytes.
The issue is that str_copy()
would allow more ASCII characters than UTF-8 multibyte units; we'd still need to count code units to maintain identical displayed string length limits.
Can you give an example for a translation that did not fit?
@heinrich5991, Current 0.6 / DDNet clan name limit is 12 including \0
. 11 bytes for UTF means max 5 UTF-8 two-byte characters, and virtually all languages with non-latin alphabet need more than 5 characters. There are classes such as Mercenary, Biologist, Scientist, Engineer but even shorter translated names do not fit:
and tens more. It is easier to name what fits.
@heinrich5991: You should usually limit strings by storage size, not some other metric.
That 's the right topic: maybe we want to limit displayed strings (names, clan, etc) by characters number, even if the storage has space to fit more ASCII bytes.
E.g. teeworlds 0.7 limits the player name by 16 characters of any type/alphabet — ASCII or, say, Cyrillic or CJK. On the other hand, teeworlds 0.6 and DDNet the limit is 15 ASCII or 7 Cyrillic chars, or 5 Chinese, Japanese and Korean characters.
0.7 increases the array sizes to allow the same count of non-ASCII characters
I'm arguing that it's not good to limit by code points on the network level. Especially since code points aren't the same as user-perceived characters.
The issue is that
str_copy()
would allow more ASCII characters than UTF-8 multibyte units; we'd still need to count code units to maintain identical displayed string length limits.
I don't think code points are a good measure of display width. As an extreme example, emojis can be many code points long but still render as one character.
@heinrich5991, Current 0.6 / DDNet clan name limit is 12 including
\0
. 11 bytes for UTF means max 5 UTF-8 two-byte characters, and virtually all languages with non-latin alphabet need more than 5 characters. There are classes such as Mercenary, Biologist, Scientist, Engineer but even shorter translated names do not fit:
Thanks! For whatever reason I only thought about chinese, but languages with a non-latin alphabets (cyrillic, arabic, greek) get hit very hard by this.
E.g. teeworlds 0.7 limits the player name by 16 characters of any type/alphabet — ASCII or, say, Cyrillic or CJK. On the other hand, teeworlds 0.6 and DDNet the limit is 15 ASCII or 7 Cyrillic chars, or 5 Chinese, Japanese and Korean characters.
It doesn't limit by user-perceived character, but by code points. I guess you could consider code points an approximation to that.
I'm arguing that it's not good to limit by code points on the network level. Especially since code points aren't the same as user-perceived characters.
I'd say that the approach is good enough. On the network level we still work with bytes / 1-byte chars. The subject (skin-related) code lives in game/server/gamecontext.cpp and game/client/skin.cpp which are fairly close to high level. The client name and clan limitation is in CServer setters (server.cpp#L276-L292) and CGameClient (on a player info received in gameclient.cpp#L856-L864).
It doesn't limit by user-perceived character, but by code points. I guess you could consider code points an approximation to that.
It is clear for me that this approach is a trade-off between correctness, usefulness, and complexity. Maybe it is incorrect to apply display string limitation there but it is easier to do this check once on the network message received than on the string used/displayed. Strictly speaking the approach won't allow 16 characters of any type (e.g. composed emoji) but on practice "4 bytes per character" means sufficiently enough and sufficiently similar limitation.
It would be much better (for the players / end-users) if we'd use this approach than do nothing and keep status quo.
Hi! I'm experimenting with 0.7 skins system piped throught ddnet (0.6) protocol (commit). I tried to address some of the issues though of course (you know) there are more (missing RenderTeeHand7, quite a lot of issues with 'marking' body part, the settings Tee7 initialization, etc).
Checklist
Current
pr_07_client
(the number of skins is taken from 0.6 skins, and the data is missing)With files but before color fixed
After