Closed wuweiran closed 5 months ago
The NativeArrayInfo
attribute binds the two parameters together. Perhaps the SAL annotation is incorrect, or perhaps its interpretation is incorrect.
I don't know anything about this function so I can't be sure, but I can transfer to the Win32 metadata repo for consideration.
This looks like a genuine code-gen bug, where the [Optional]
attribute of a parameter is retroactively applied to a non-optional parameter referenced in SAL annotations.
This is the (abridged) C definition:
HRESULT WINAPI ScriptStringAnalyse(
// ...
int cString, //In Length in characters (Must be at least 1)
// ...
_In_reads_opt_(cString) const int *piDx, //In Requested logical dx array
// ...
);
The corresponding metadata accurately captures the signature information:
HRESULT ScriptStringAnalyse(
// ...
[In] int cString,
// ...
[Optional][In][Const][NativeArrayInfo(CountParamIndex = 2)] int* piDx,
// ...
);
The generator uses this data in preparation for the "ptr + length" -> "slice" transformation but fails to acknowledge that while the ptr is optional, the length is not.
The [NativeArrayInfo]
attribute is initially evaluated here:
While I understand that the current Rust signature for ScriptStringAnalyse()
leaves the API partially unusable, I'm struggling to come up with a workable solution to address this issue. At the very least, the code generator needs to inhibit the "ptr + length" -> "slice" transformation if either parameter is [Optional]
.
This could be controlled from metadata with the [PreserveSig = true]
attribute with effects on all downstream clients (including this repo). The alternative would be to deal with this in the code generator. I don't know how complex that would be, or even just which of the four [Optional]
combinations are "transformation safe".
I think some of the confusion can be found here:
_Check_return_ HRESULT WINAPI ScriptStringAnalyse(
HDC hdc, //In Device context (required)
const void *pString, //In String in 8 or 16 bit characters
int cString, //In Length in characters (Must be at least 1)
int cGlyphs, //In Required glyph buffer size (default cString*1.5 + 16)
int iCharset, //In Charset if an ANSI string, -1 for a Unicode string
DWORD dwFlags, //In Analysis required
int iReqWidth, //In Required width for fit and/or clip
_In_reads_opt_(1) SCRIPT_CONTROL *psControl, //In Analysis control (optional)
_In_reads_opt_(1) SCRIPT_STATE *psState, //In Analysis initial state (optional)
_In_reads_opt_(cString) const int *piDx, //In Requested logical dx array
_In_reads_opt_(1) SCRIPT_TABDEF *pTabdef, //In Tab positions (optional)
const BYTE *pbInClass, //In Legacy GetCharacterPlacement character classifications (deprecated)
_Outptr_result_buffer_(1) SCRIPT_STRING_ANALYSIS *pssa); //Out Analysis of string
There appears to be a relationship between pString
and cString
in the comments, but SAL only indicates a relationship between cString
and piDx
, so that's the one that windows-bindgen
(the Rust code generator) honors since it doesn't read comments...
There are other APIs where multiple parameters point to the same length parameter and in such cases windows-bindgen
will reject the relationship and thus omit the transformation.
Thanks for the thorough analysis, Kenny. That makes sense.
The actual fix would then be to update the SAL annotations for ScriptStringAnalyse()
, and windows-bindgen
will do the right thing as the changes eventually trickle down.
Specifically, pString
needs to be associated with cString
. This is somewhat complicated as cString
is measured in characters, but pString
can be either a narrow or wide character string (depending on the value of iCharset
). Plus, pString
is typed as a void*
, so _In_reads_(s)
does nothing as there is no type information to scale s
.
I played around with the annotations a bit longer and came up with this:
_In_reads_bytes_(((iCharset == -1) + 1) * cString) const void *pString
// ^^^^^^^^^^^^^^^^^^^^^^ 2 if iCharset == -1 (Unicode)
// 1 otherwise (ANSI)
This reliably reports pre-condition violations during code analysis of a C program. While (presumably) correct, I don't know how well ClangSharp or the metadata artifacts can handle complex SAL annotations.
Metadata has no SAL support to speak of, so we'd be manually adding any attributes needed here.
Can you share specifically what attribute we should add to get the desired behavior? Or an example of another API that declares a similar relationship properly already that I should copy?
@mikebattista As user, I think just removing the relationship between piDx
and cString
should do. @riverar @kennykerr could you answer Mike's questions? Thanks.
piDx
is tied to the number of logical characters (cString
) passed in. The problem seems to lie, as @tim-weis originally pointed out, in the Rust code gen not accounting for this scenario (required length, optional input with NativeArrayInfo
pointing back).
I think pString
and its quirkiness is a distraction and not relevant to this issue.
Don't think there's any metadata work to be done here.
The problem seems to lie, as @tim-weis originally pointed out, in the Rust code gen not accounting for this scenario (required length, optional input with
NativeArrayInfo
pointing back).
@riverar I don't think this is correct. The core issue is ultimately down to the fact that the SDK provides partial annotations, failing to describe the relationship between pString
and cString
. The net result is that the metadata winds up with partial constraints that allow for a situation that cannot happen: A mandatory cString
parameter whose sole purpose is to describe the size of the optional piDx
parameter.
Had the metadata captured the full set of relations (specifically the fact that cString
also describes the size of pString
) the Rust code gen would already be prepared to handle this situation.
Don't think there's any metadata work to be done here.
I will object. As I understand the issue it can only be resolved from the metadata side of the fence. The metadata either must provide the full set of relationships between parameters, or none at all. Since the cString
<-> pString
relationship is too complex to encode, I recommend special-casing this API to drop the cString
<-> piDx
relationship (NativeArrayInfo
) altogether.
Tim is right. All the relationships must be defined, or none of them. I prefer none in this case since multiple relationships don't work in practice. But the current metadata is broken.
I think it's important to remember that SAL annotations are completely optional. The API author has the discretion to decide which relationships to express and test.
I concur that the metadata ends up with a constraint permutation where a mandatory cString
parameter is used to describe the size of an optional piDx
parameter. However, I don’t agree that this is a problem with the metadata.
SAL annotations can be implemented to varying degrees of completion, and it feels unfair to hold metadata to a standard that requires SAL annotations to be all-encompassing. (Metadata can't even represent all SAL annotated relationships, so should always be assumed to be in a partial state.)
The cString
parameter also describes the length of piDx
, which is currently modeled in the metadata. This means there is a complete and valid relationship modeled: when piDx
is provided, it must be of cString
length.
In theory, an alternative language code generator could generate an API wrapper with asserts, instead of attempting to transform the signature, and make use of this metadata as-is with no problem.
That said, I’m more than happy to hold the dissenting opinion here and remove the metadata as requested.
I don't disagree that SAL is incomplete and optional. The trouble is that the metadata attempts to be more than SAL. If languages are to alter code generation, not merely check for SAL violations at call sites, then the metadata must be authoritative and complete to be useful.
SAL annotations can be implemented to varying degrees of completion
Correct. When used for what they were intended for: Static code analysis.
It feels unfair to hold metadata to a standard that requires SAL annotations to be all-encompassing.
They must only be all-encompassing when re-purposed to inform code generation. Which is what this repo's artifacts are. In this case, the annotations aren't sufficiently complete, and completing them wouldn't be representable (at this time).
The only reasonable choice thus is to drop them all.
[...] the annotations aren't sufficiently complete [...]
For Rust, and only Rust. 😛 (Not accurate, C# is also affected.)
I'll get a PR out in a moment to unblock Rust.
Summary
windows::Win32::Globalization::ScriptStringAnalyse
puts the third parameter of its corresponding ffi ScriptStringAnalyse aspidx.as_deref().map_or(0, |slice| slice.len().try_into().unwrap())
. This prevents user from settingcString
(length of the string to analyze) without specifyingpiDx
which is optional. Please report this for me if it is caused by incorrect Win32 metadata.Crate manifest
Crate code