Open psychocoderHPC opened 5 months ago
Interesting.
Is it safer than the current approach? How does it work when, e.g., two header files, each with a shared memory definition, are included in different orders?
This approach is at least compatible with the current interface, so you could still be explicit with your ID.
> How does it work when e.g. two header files, each with a shared memory definition, are included in different orders?
This would have to be tested. I do not know which optimizations are applied to device-linked code.
I created IDs in a shared library, inside a function `foo`, and in the main CPU code. After linking the two together, the IDs are unique.
In general they should be, because each lambda `[](){}` gets a unique type.
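To make the point about lambda types concrete, here is a minimal sketch (the variable names are only for illustration): two textually identical lambda expressions still produce two distinct closure types.

```cpp
#include <type_traits>

// Two lambda expressions with identical text.
const auto first_lambda = [] {};
const auto second_lambda = [] {};

// Each lambda expression has its own closure type, so the types differ.
static_assert(!std::is_same_v<decltype(first_lambda), decltype(second_lambda)>);
```

This part only needs C++17 (for `std::is_same_v`). Using a lambda directly inside `decltype` or as a default template argument is what requires C++20.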
> Is it safer than the current approach?
The user does not need to fiddle around with the counter macro, and the interface becomes nicer because there is no need to set the ID explicitly.
> The user does not need to fiddle around with the counter macro, and the interface becomes nicer because there is no need to set the ID explicitly.
Yes, I completely agree with this.
My concern is a case where
- `A` defines a first device function that declares a shared memory block
- `B` defines a second device function that declares a shared memory block
- `C` includes `A` and `B` and defines a device kernel that uses those functions
- `D` includes `B` and `C`

Now, `D` includes `B` before indirectly including `A`, so the `__COUNTER__` values differ between `C` and `D`, which can cause ODR violations or other problems with the shared memory declarations.
Does this approach based on unique lambda addresses make things more robust?
`D` includes `B` and `A` (in the opposite order), then `C`, then defines the kernel.
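The include-order concern above comes from how `__COUNTER__` behaves. As a rough sketch (the names `id_first` and `id_second` are made up and just stand in for declarations living in two different headers): every expansion yields the next integer within the current translation unit, so the value a declaration receives depends on what was expanded before it.

```cpp
// __COUNTER__ is a preprocessor extension (GCC, Clang, MSVC), not standard C++.
// Each expansion yields the next integer within the current translation unit.
constexpr int id_first = __COUNTER__;   // stands in for a declaration from header A
constexpr int id_second = __COUNTER__;  // stands in for a declaration from header B

// The values are consecutive, so swapping the two lines above (that is,
// swapping the include order) swaps which declaration receives which ID.
static_assert(id_second == id_first + 1);
```

The absolute values are deliberately not asserted, since they depend on every earlier expansion in the same translation unit.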
Currently we need a unique ID to create in-kernel static shared memory:
With C++20 we could auto-generate this ID:
test it live: https://godbolt.org/z/MrhbvncET
will be
This is based on the ID generator I saw in this talk: https://www.youtube.com/watch?v=lPfA4SFojao