shader-slang / slang

Making it easier to work with shaders
MIT License
1.78k stars 159 forks source link

Nested initialization list in struct crash fix #4491

Closed ArielG-NV closed 4 days ago

ArielG-NV commented 6 days ago

fixes: #4489

Problem:

  1. Previously, TopStruct var = {} would crash if a nested member also had a initExpr assigned to {}. This crash is due to a particular ordering of how {} resolves its own type blindly setting it as an invalid type as a place holder.

Solution:

  1. only set the init list type when invalid (if we checkTerm twice on the same object, this won't break the code)

Note: removed the overhaul to default initialization since it breaks Falcor and likely needs more time dedicated to it, the code history will be kept in this branch (in the mean-time it is best to fix the current regression with a more simple fix).

ArielG-NV commented 5 days ago

Why do we double check the expr in the first place?

This is not isolated behavior, If we have two AssignExpr that share the same ->right member, we will run checkTerm on the ->right member twice. This only breaks visitInitializerListExpr because visitInitializerListExpr is dependent on being called only once.

Why this crash happens now?

The behavior is showing up now due to minor ordering changes in AST building. Previous Logic:

  1. Slang did not ensureDecl(memberVariable, DeclCheckState::DefinitionChecked) when checking a structDecl, we only auto generate __init for structDecl.
  2. Naturally ensureDecl() on the auto generated __init for structDecl. This causes a check for AssignExpr (contains initExpr). This assigns place-holder invalid type to initExpr from call to visitInitializerListExpr.
  3. Slang naturally runs ensureDecl(memberVariable, DeclCheckState::DefinitionChecked).
  4. Now we call checkVarDeclCommon to check the memberVariable. This runs CheckTerm() on our initExpr as well. This re-assigns place-holder invalid type from call to visitInitializerListExpr. We now run coerce to change the initExpr type to something valid.
  5. Non-invalid type for initExpr works fine.

New Logic:

  1. Slang calls ensureDecl(memberVariable, DeclCheckState::DefinitionChecked) when checking a structDecl.
  2. Now we call checkVarDeclCommon to check the memberVariable. This runs CheckTerm() on our initExpr. This assigns place-holder invalid type from call to visitInitializerListExpr. We now run coerce to change the initExpr type to something valid.
  3. we auto generate __init for structDecl now.
  4. Naturally ensureDecl() on the auto generated __init for structDecl. This causes a check for AssignExpr (contains initExpr). This re-assigns place-holder invalid type to initExpr from call to visitInitializerListExpr.
  5. Invalid type for initExpr causes crash.

We were double checking the initExpr before. Code working was pure chance on how Slang was parsing structDecls.

Can we stop doing that?

The solution is to not use initExpr directly ever (clone the AST node before using) since we will only have 1 instance of each initExpr in that case. This is a hack though, visitInitializerListExpr is fundamentally unstable with Slang AST.

csyonghe commented 5 days ago

Shouldn't we exit immediately if we can determine the expr is already checked?

ArielG-NV commented 5 days ago

Shouldn't we exit immediately if we can determine the expr is already checked?

I don't know if this would be a problem/source-of-bugs. One problem I can think of is that we may modify the initializer list arg list and then CheckTerm a second time)?

Other AST nodes allow rechecking the same expr.

csyonghe commented 5 days ago

An expr should never need to be checked twice. Once it is checked it should be in its final form and not be modified elsewhere.

kaizhangNV commented 4 days ago

@ArielG-NV Does the gfx-smoke test pass? If yes, can you enable the gfx-smoke test and re-run all the checks?

ArielG-NV commented 4 days ago

Does the gfx-smoke test pass? If yes, can you enable the gfx-smoke test and re-run all the checks?

It should pass

kaizhangNV commented 4 days ago

ok, can you please enable the test in this PR?

ArielG-NV commented 4 days ago

ok, can you please enable the test in this PR?

we never enabled gfx-smoke since gfx has problems with CI, here should be an issue open for this already.

kaizhangNV commented 4 days ago

ok, can you please enable the test in this PR?

we never enabled gfx-smoke since gfx has problems with CI, here should be an issue open for this already.

Do you know what is the issue? And do we have issue tracking this?

ArielG-NV commented 4 days ago

Do you know what is the issue? And do we have issue tracking this?

problem is listed under https://github.com/shader-slang/slang/issues/3176