Closed darksylinc closed 1 month ago
The existing language for structure members says they are char arrays of this size and contain a null-terminated string. If that's not clear, perhaps a NOTE somewhere in the fundamentals chapter? There are a couple of different max lengths of this nature. N.b. the null terminator goes at the end of the string which typically will be sooner than the end of the array since there are no extension names anywhere near that long (and ideally, never will be).
I too thought "it was obvious", but I think it's worth being more explicit (and giving a few examples) in this particular case, considering that bugs surrounding null terminators are the source of lots of CVEs.
@darksylinc #2365 should address this - the Vulkan WG has signed off but it would be good for you to review as well.
Hi!
Looks good to me!
The only remark I can make is that it mentions about UTF8 strings but makes no mention if it is normalized or not. It's probably unnecessary and should not be clarified until someone actually raises this point with an actual problem being faced.
Besides, most strings are in the ASCII range in practice (notably exceptions are the marketing names in device name).
OK I just read something that left me thinking.
Perhaps the spec should recommend, but not require, that utf8 strings should be canonized aka normalized, to be in harmony with other public specs which do the same.
Is there any actual drawback in practice other than not being able to do a byte-for-byte comparison of normalized and nonnormalized strings that are equivalent?
Usually that's the reason: byte-for-byte comparisons. This is a weak argument though, because nothing stops a vendor from writing 【My GPU Name】
instead [My GPU Name]
. Thus if the app is looking for the latter, string comparison will fail even when both strings are normalized.
However beyond that, UTF8 parsing can be difficult and there have been security vulnerabilities and crashes in the past due to bugs when parsing invalid or non-canonical utf8.
Certain routines are optimized for iteration if you can assume they are in their canonical form and is UB if the input string is not in that form (however that would mean the spec must require canonical forms, and include validation tests for every string field to ensure they are all normalized). However it's bad practice to use such routines on strings you don't control (like an external API such as Vulkan).
I'm only saying this to point out potential edge cases. Personally it's probably best not to worry too much about it unless someone encounters a specific problem.
I'm only saying this to point out potential edge cases. Personally it's probably best not to worry too much about it unless someone encounters a specific problem.
Agreed. Going ahead with #2365 as it stands.
I've seen the following code from Google:
I was under the impression that the null terminator goes in
ext_name[VK_MAX_EXTENSION_NAME_SIZE - 1]
rather thanext_name[VK_MAX_EXTENSION_NAME_SIZE]
.I tried looking into the spec but I couldn't find how C strings are handled. Obviously Google's approach may be prudent and doesn't hurt, even if the null terminator is in
ext_name[VK_MAX_EXTENSION_NAME_SIZE - 1]
.I think it would be wise to clarify whether the null terminator goes in
ext_name[VK_MAX_EXTENSION_NAME_SIZE - 1]
orext_name[VK_MAX_EXTENSION_NAME_SIZE]
; specially to avoid surprises in the future.