jzm-intel commented 1 year ago

(Edited: bold the proposal)

Proposal

Currently WebGPU Spec use USVString and DOMSrting in several places (listed below) including user input.

For both type a string with a null character '\0' (i.e. Unicode code point U+0000) in it is valid, e.g. "abc\0xyz" would be valid DOMString and USVString, however such kind of strings is kind of tricky in other languages like C/C++.

Currently the WGSL spec explicitly forbid the null code point (U+0000) in WGSL program. Although using a string with '\0' in it in some case like entryPoint should be a validation error (since there must not be a WGSL entry point name containing '\0'), it may be generally better and easier to forbid any '\0' in all string user providing to WebGPU. It would be coherent to report a JavaScript TypeError whenever any user-provided string was found to contain a '\0'.

I am not quite sure how to introduce such rule into WebGPU spec, considering USVString and DomString are used multiple times in different places within the spec. Maybe add a small sub-chapter under Chapter 3. Fundamentals?

Current usage

Currently USVString and DomString are used in WebGPU spec for:

USVString
- label in GPUObjectBase
- code and hints in GPUShaderModuleDescriptor
- entryPoint and constants in GPUProgrammableStage
- groupLabel for pushDebugGroup and markerLabel for insertDebugMarker in GPUDebugCommandsMixin
DOMString
- Some read-only attributes
- parameter unmaskHints of requestAdapterInfo
- requiredLimits in GPUDeviceDescriptor
- message of constructors for different GPUError subtypes, e.g. GPUValidationError
- type of constructors for GPUUncapturedErrorEvent

greggman commented 1 year ago

I thought about suggesting that when I first tried \0 in strings. What I don't like about it, at least for label is it turns label into a slower operation as every label set would require scanning the label for \0. Given that what happens on the backend with a label is implementation defined, it seems fine to allow \0, just with maybe a note to implementations to be aware of the issue.

For code it's already a rule in WGSL so that one seems covered.

toji commented 1 year ago

This was discussed a bit previously in 2021 in the context of WGSL code: https://github.com/gpuweb/gpuweb/issues/1816

The comment on that issues says that after discussion "there is no particular desire to disallow null characters for the sake of allowing implementations to use null-terminated strings." but the corresponding meeting minutes are pretty brief and don't dive too much into what drove that conclusion.

kainino0x commented 1 year ago

Discussed a bit with @toji. We were curious, @jzm-intel, was there a particular issue that caused you to propose this?

Allowing \0 in strings is well established throughout the web specs and the web engine codebases (Blink/WebKit/Gecko), so the default answer is certainly to allow them. For anything that doesn't pass into native implementations (Dawn/wgpu/etc) we should have no trouble implementing null characters. That includes at least error constructors, and the wrapper-local state for label.

I also agree with Gregg that since labels are all used in implementation-defined ways, label, groupLabel, and markerLabel don't really need validation. We can just normalize or trim them internally as we need.

For the rest, I think it's most congruous with the rest of the platform to simply handle \0s naturally:

WGSL code and identifiers (code, hints, entryPoint, constants) will necessarily fail anyway since \0 isn't valid in WGSL.
requiredLimits already rejects with an OperationError if you specify an unrecognized limit name.
unmaskHints, afaik, simply ignores any unrecognized keys, so even if we're translating them down to the native layer, we can just skip (and optionally warn) on ones with \0.

In summary, I don't think we should change anything, unless we have gaps in the current spec that need fixing.

jzm-intel commented 1 year ago

was there a particular issue that caused you to propose this?

Yes in particular I try using \0 in the entryPoint string, which is indeed pass into native implementations, and add this case in the CTS (e.g. using entryPoint name main\0a in pipeline desc while in program the fn name is main). Following the current spec, it should be a validation error. But in current Chrome implementation, the well-defined UTF-8 string are somehow passed to native implementations (Dawn in this case) as null-terminated string without proper validation, making the CTS failed. So I propose this to see if we could have a unified path to get rid of \0 everywhere.

I agree that we can handle each use case under the current spec.

kainino0x commented 1 year ago

Discussed in today's meeting. If we understood correctly there are no major remaining concerns with implementing the specified semantics, so we think no API change is needed.

greggman commented 1 year ago

WGSL code and identifiers (code, hints, entryPoint, constants) will necessarily fail anyway since \0 isn't valid in WGSL.

Both Firefox and Chrome currently seem to incorrectly match an entryPoint with \0 in it to a shader that does not. In other words

// shader

...

@vertex
fn myVSMain(v: MyVSInput) -> MyVSOutput {
...

// JavaScript

const pipeline = device.createRenderPipeline({
  vertex: {
    entryPoint: 'myVSMain\0foo',

This code incorrectly? matches entry point myVSMain even though the entry point specified was myVSMain\0foo

Should that be an error like 'no entry point named myVSMain\0foo' or is the fact that it matched acceptable?

jzm-intel commented 1 year ago

Should that be an error like 'no entry point named myVSMain\0foo' or is the fact that it matched acceptable?

According to current spec (treated as USVString, which is not null-terminated), and with my common sense, I thought this should be a error.

Kangz commented 1 year ago

Yes these are bugs in Chromium and Firefox. We have an open issue for it in Dawn.

kdashg commented 1 year ago

GPU Web meeting 2022-11-16 non-APAC-timed

* KN: we can revisit if Intel has comments, but happy to resolve for now. * KG: if it's difficult implementation-wise, we can discuss.

greggman commented 1 year ago

I would like to reopen this because is a breaking change for all any apps using the C API (so for example the python integration, emscripten, .net, go, etc...) and, AFIACT it's non-trival to implement in Dawn/Tint (I could be wrong about this)

The biggest issue is \0 is not allowed, so any work required is work all so we can just say "No! You did the wrong thing". We arguably have better things to do than subject so many projects to a breaking change and ourselves to work all so we can just tell people from the C API level that passing \0 in a string is bad.

AFAICT there are 4 issues

Labels
WGSL code
WGSL comments
entryPoints

For (1) labels it's irrelevant. The spec says what happens to labels is undefined. So if the user passed foo\0bar the implementation is allowed to print whatever if wants foo, no-label-for-you, nothing, 12345 whatever.

For (2) WGSL disallows \0

For (3) It's not clear if \0 is allowed but I tried passing \0 in a comment to Tint and it failed so engineering work would be required here

For (4) because \0 is disallowed in WGSL then passing \0 here is always an error

Because of all these points, changing the C API to take sizes just seems like a huge waste of everyone's time. Not just webgpu implementors but also existing webgpu native users. Across all projects a few 100 man hours will be spent all so we can add a code which doesn't actually add any features, just prints an error.

Two alternative solutions I'd like to suggest

Solution 1: for WGSL and entryPoint, throw if \0 is passed in.

I know that WebGPU generally returns errors asynchronously but there are several it doesn't so for example
```
device.createShaderModule();   // throws, expected 1 argument got 0
device.createShaderModule(""); // throws, type not GPUShaderModuleDescriptor
device.createShaderModule({}); // throws, type not GPUShaderModuleDescriptor
```
would it really be so bad if the API was defined as not accepting \0 for code and entryPoint and it threw here? No user is going to intentionally add \0 to their code because as pointed out a above it's an error.

Solution 2: keep the C API the same, workaround it in the browser

The C API already doesn't allow \0 since it only accepts \0 terminated strings. So, the JavaScript/Browser side and scan the string and effectively synthesize an asynchronous WebGPU error. It would do this

createShaderModule(descriptor) {
 if (descriptor.code.includes('\0')) {
    device.pushErrorScope();
    const invalidShaderModule = device.createShaderModuleImpl(descriptorThatFails);
    device.popErrorScope().then(() => {
       insertSyntheticError("`\0` not allowed in WGSL");
    });
    return invalidShaderModule;
 } else {
    return device.createShaderModuleImpl(descriptor);
 }

You can do something similar for the create pipline functions.

I'd strongly prefer solution (# 1). It means the least amount of wasting everyone's time with a change that adds nothing. (# 2) is a compromise. It doesn't waste other's time but it still wastes ours.

greggman commented 1 year ago

Looking forward ot hear Kai's ideas

I feel based on the comments from the meeting my current proposals is:

Leave the spec as is, leave webgpu.h as is, and, deal with \0 in WGSL and entryPoint in the browser, synthesizing a normal WebGPU type error.

kainino0x commented 1 year ago

For (3) It's not clear if \0 is allowed but I tried passing \0 in a comment to Tint and it failed so engineering work would be required here

WGSL doesn't allow null anywhere in the program source, including comments AFAIU: https://gpuweb.github.io/gpuweb/wgsl/#textual-structure

kainino0x commented 1 year ago

I don't recall where, but at some point I think someone noted that identifiers have lots of other invalid characters too. We could easily for example replace NUL characters at the JS API with \ 0 or � or whatever at the C API, it wouldn't affect whether there's an error, and the error message would still be clear to human readers. No reverse translation (C->JS) is needed.

This doesn't do anything to solve NULs in GPUShaderModuleDescriptor.code, but it at least constrains the problem to just one place, and fixes most of them (the various forms of GPUProgrammableStage.entryPoint/constants, and GPUShaderModuleDescriptor.hints).

kainino0x commented 1 year ago

That is to say - I think we can keep the C API down to just one string length "wart" without changing the JS API.

greggman commented 1 year ago

Changing the C API to take a string length is strictly worse. We'd be taking an API in which it's impossible to pass \0 (an error) and making it possible to pass the error condition for absolutely no other gain. An API where you can pass less wrong things is better than one in which you can pass more wrong things.

I think we should leave the C API as is, browsers can work around it as mentioned in https://github.com/gpuweb/gpuweb/issues/3576#issuecomment-1406926447

kainino0x commented 1 year ago

That's fine with me, pretty sure it [EDIT: solution 2] is essentially the same as the group's previous resolution.

kainino0x commented 1 year ago

Wait, sorry, do you mean solution 1 or solution 2? Both keep the C API as is.

synthesizing a normal WebGPU type error

WebGPU validation error, or TypeError exception?

greggman commented 1 year ago

I mean, keep the C API as is, synthesize a validation error (yes, solution 2)

kainino0x commented 1 year ago

the same as the group's previous resolution.

That is to say - the CG's resolution for the JS API; we didn't officially make a decision on the C API. I think that's what we initially had in mind, but then it got further litigated in the webgpu-headers repo (https://github.com/webgpu-native/webgpu-headers/issues/170). So since you're proposing no JS API change, we'll be able to close this and move the discussion back there.

Tacit resolution queue to close this again, since it matches the group's previous resolution, and (I think) the leanings in the last meeting.

gpuweb / gpuweb

WebGPU: Forbid null code point '\0' (U+0000) within strings #3576

Proposal

Current usage