Closed DarkArc closed 11 months ago
MSVC produced IFCs currently can contain interior null characters in represented strings.
That looks like an MSVC bug. Do you have a testcase for this?
Do you have a testcase for this?
@DarkArc - ping.
Is the suggestion that the IFC spec should ban a string literal from an input C++ source code like "foo\0bar"
?
I'm going to close this for now. I think this is actually an EDG bug (improper conversion of IFC wide character literals).
I remember realizing (likely in the time that's passed since this was reported) that I'd misunderstood 3.1 Index type:
The representation of identifiers referenced by TextOffset indices uses UTF-8 encoding and are NUL-terminated. For string literals, see §10.1.
as specifying that all const.str
entries were normalized to a UTF-8 encoding. As has become obvious as the implementation has matured, that's not correct.
I've opened https://github.com/microsoft/ifc-spec/issues/141 for further discussion.
MSVC produced IFCs currently can contain interior null characters in represented strings.
These null characters can be a source of bugs as implementations may not recognize the need to remove them when reading text from the IFC.
Additionally, their presence cause a repeated performance burden on the consumer as the consumer must always traverse the entire string, and discard null characters. If an implementation requires a copy of the text for its purposes, optimized copies (via
memcpy
) are made inaccessible via the presence of null characters (assuming the implementation does not handle interior null charters itself). Similarly, interior null characters unnecessarily inflate the size of the IFC file.It would be ideal if the IFC documented the presence of a terminating null character (i.e., #79) and (both for easy of implementation in consumers, and performance) forbids interior null character.