Closed jkwak-work closed 1 month ago
The regression happened because the first generic argument of the GLSL texture functions were changed from __builtinArithmaticType
to __builtinFloatingPointType
.
We will need to either bring back to __builtinArithmeticType
or add more functions that take __BuiltinIntegerType
.
See the attached interface hierarchy for more information.
I think that this should be WNF until we have a mechanism to handle the capability better.
The texture sampling functions was working for IArithmetic
interface, but it was changed to IFloat
interface because HLSL doesn't support the integer type textures.
I can think of two approaches to support the integer typed textures for GLSL and Metal while not supporting it for HLSL.
__BuiltinFloatingPointType
and __BuiltinIntegerType
. The functions with __BuiltinIntegerType
will not have the hlsl
keyword in the require
attribute.IArithmetic
for the sampling functions, and emit an invalid intrinsic when the target is HLSL and the texture format is an integer type.I am going to show examples to clarify what each approach means.
The following example is what we currently have for Sample()
function.
__generic<T:IFloat, Shape: __ITextureShape, let isArray:int, let isMS:int, let sampleCount:int, let isShadow:int, let format:int>
extension __TextureImpl<T,Shape,isArray,isMS,sampleCount,0,isShadow,1,format>
{
[require(cpp_cuda_glsl_hlsl_metal_spirv, texture_sm_4_1_fragment)]
T Sample(vector<float, Shape.dimensions+isArray> location)
{
....implementation...
}
}
Note that T
is constrained to IFloat
and the function, Sample()
, requires hlsl
.
For the first approach, the code will change to something like following,
${{{{
for (int isIntegerType = 0; isIntegerType < 2; isIntegerType++)
{
const char* samplerType;
const char* mayRequire_hlsl;
if (isIntegerType)
{
samplerType = "__BuiltinIntegerType";
mayRequire_hlsl = "";
}
else
{
samplerType = "__BuiltinFloatingPointType";
mayRequire_hlsl = "hlsl_";
}
}}}}
__generic<T:$(samplerType), Shape: __ITextureShape, let isArray:int, let isMS:int, let sampleCount:int, let isShadow:int, let format:int>
extension __TextureImpl<T,Shape,isArray,isMS,sampleCount,0,isShadow,1,format>
{
[require(cpp_cuda_glsl_$(mayRequire_hlsl)metal_spirv, texture_sm_4_1_fragment)]
T Sample(vector<float, Shape.dimensions+isArray> location)
{
....implementation...
}
}
Note that the code will be duplicated with for-loop
.
The first iteration will emit T
constrained to __BuiltinFloatingPointType
.
And the second iteration will emit T
constrained to __BuiltinIntegerType
.
The capability keyword hlsl
will be applied only to the first that is constrained to __BuiltintFloatingPointType
.
For the second approach, the code will change to something like following,
__generic<T:IArithmetic, Shape: __ITextureShape, let isArray:int, let isMS:int, let sampleCount:int, let isShadow:int, let format:int>
extension __TextureImpl<T,Shape,isArray,isMS,sampleCount,0,isShadow,1,format>
{
[require(cpp_cuda_glsl_hlsl_metal_spirv, texture_sm_4_1_fragment)]
T Sample(vector<float, Shape.dimensions+isArray> location)
{
__target_switch
{
case hlsl:
if (T is __BuiltinIntegerType) __intrinsic_asm "invalid intrinsic";
}
....implementation...
}
}
Note that T
is constrained to IArithmetic
; not IFloat
.
The body of the function, Sample()
, has __target_switch
where hlsl
case will emit "invalid intrinsic".
The second approach looks more ideal because the code is simpler and duplicates less.
I think we should go with the second approach, because the first approach will have a big negative impact on the user code side.
With the first approach, implementing for IArithmetic
is not the same as implementing for both __BuiltinIntegerType
and __BuiltinFloatingPointType
.
The user side code will have to duplicate their implementation for both __BuiltinIntegerType
and __BuiltinFloatingPointType
, because they cannot generalize their implementation with IArithmetic
.
A downside with the second approach is that emitting an erroneous string, "invalid intrinsic", is not the same as erroring out with our capability system. In fact, we want to avoid this type of error handling that causes an error at the downstream compiler.
Note that the capability checking happens before the evaluation of T is __BuiltinIntegerType
.
The type of T
is unknown until we specialize the function, which happens at the backend.
Due to the limitation, this type of problems cannot be handled by the capability system even though the nature of the problem falls into the category of the capability.
In order to properly error it out, we will need a new feature like static_assert
in C++ that can trigger a compile time error.
If we have such a feature, the code can be re-written as following,
case hlsl:
static_assert(T is __BuiltinFloatingPointType, "HLSL doesn't support sampling from an integer type texture.");
And the handling of such static_assert
will have to happen during the emitting step where __target_switch is applied and the codes for the other targets are not visible.
With the explanation, I think we should defer the integer type texture support until we have static_assert
and mark this issue as WNF.
I like to hear a feedback from @csyonghe .
I realized that there is still I can do for this issue without a new feature like static_assert
.
My comment above assumes that the use case is from slang shader source code to spir-v/DXIL.
But if the use case is from GLSL to SPIR-V, I can still make it working with the integer textures.
I will work on it and close the issue when it is done.
I tried to make some changes to glsl.meta.slang with keeping hlsl.meta.slang as it is. But I realized that I can no longer call the HLSL functions from GLSL functions because GLSL functions supports integer types whereas HLSL functions no longer support them. In order to solve the problem, I would have to duplicate a lot of lines from hlsl.meta.slang to glsl.meta.slang. I don't think this is a right approach and it will be a throwaway work at best.
I think a right approach is to use static_assert
as I suggested and make all texture sampling functions to support IArithmetic
interface.
I have a proof-of-concept version of implementation for this and I like to get some feedbacks before move on further.
https://github.com/shader-slang/slang/pull/4294
I tried to extend vector<__BuiltinArithmeticType,N>
to implement IArithmetic
.
__generic<T:__BuiltinArithmeticType, let N : int>
extension vector<T,N> : IArithmetic
{
...implement the interface...
}
The idea is similar to how vector<__BuiltinFloatingPointType,N>
implements IFloat
.
But this attempt failed, and it made a few tests failing.
If it worked, I was going to replace IFloat
used for the texture sampling functions with IArithmetic
.
I am not clear on why but my guess is that there is an overlapping part between vector<__BuiltinArithmeticType,N>
and vector<__BuiltinFloatingPointType,N>
.
If I have vector<float,N>
, as an example, it wouldn't know if it should be vector<__BuiltinArithmeticType,N>
or vector<__BuiltinFloatingPointType,N>
.
The failing tests are for autodiff tests, and the error message implies that float4
didn't implement IDifferentiable
.
IDifferentiable
is implemented for IFloat
but not IArithmetic
.
So somehow vector<__BuiltinArithmeticType,N>
is picked where vector<__BuiltinFloatingPointType,N>
was supposed to be.
GLSL spec requires all texture functions to take
gsamplerXXX
types as the first input argument whereg
indicatesfloat
,int
anduint
. It says,According to the section 8.9 in the GLSL document, the following functions take
gsamplerXXX
as the first argument except for the shadow/depth samplers.This is a regression from one of recent changes. https://github.com/shader-slang/slang/pull/3963 And it causes out nightly CTS test to fail.