Open gmlueck opened 10 months ago
Linking #451 here. sub_group
is probably different enough to be handled differently compared to item
and nd_item
, but we had the same question about it as well.
I don't think these classes are meaningful outside of their own work item, because they generally carry the semantics of "give me the position of this item in the parallel iteration space". Using objects from a different work item should be illegal and/or UB.
If one did this, even just within one implementation the behavior might not be consistent. For example, whether AdaptiveCpp stores the position in the iteration space inside nd_item
/item
or whether it queries this on demand using builtins changes depending on compilation flow and configuration.
I think this makes sense also. I was wondering why we didn't define all the member functions of nd_item
and item
as static. That would make it clear that they only return information about the calling work-item, since static member functions don't have access to any state in the object. Did we decide not to define them as static to give implementations the freedom to store information about the calling work-item in the object?
In any case, I think we could add text to the introduction for each of these objects something like:
The implementation constructs an
item
object for each work-item when a kernel is submitted with a range iteration space. That object (and copies of that object) may be used only by the work-item that "owns" it. Although it is possible to copy the object into memory that is accessible to other work-items or to the host, calls to member functions of the object from other work-items or from the host produce undefined behavior.
I think we could use similar language for sub_group
and group
also.
I think this makes sense also. I was wondering why we didn't define all the member functions of
nd_item
anditem
as static. That would make it clear that they only return information about the calling work-item, since static member functions don't have access to any state in the object. Did we decide not to define them as static to give implementations the freedom to store information about the calling work-item in the object?
I agree that freedom of implementation is important. But I think it may also be important to note that if the functions were static, a developer wouldn't need an instance of an nd_item
to call them. That would mean that developers could call nd_item
member functions outside of kernels, call an nd_item
member function using a class with the wrong number of dimensions, etc.
Discussed in WG meeting 01/25/24. Next steps: introduce a new section similar to https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:reference-semantics to group these restrictions together. Or clarify these directly. Clarification to SYCL 2020.
FYI. I committed stateless implementation of nd_item
to DPC++ compiler. This commit exposed SYCL-CTS tests relying on stateful implementation or wrong uses of nd_item
methods on the host.
T must be equality comparable on the host application (in the case where T is available on the host) and within SYCL kernel functions. Equality between two instances of T (i.e. a == b) must be true if the value of all members are equal and non-equality between two instances of T (i.e. a != b) must be true if the value of any members are not equal, unless either instance has become invalidated by a move operation. By extension of the requirements above, equality on T must guarantee to be reflexive (i.e. a == a), symmetric (i.e. a == b implies b == a and a != b implies b != a) and transitive (i.e. a == b && b == c implies c == a).
As it was agreed above sharing and comparing stateless nd_item
objects doesn't make sense and I'd like to drop this test.
nd_item
objects using more reliable technique. It saves the "state" of nd_item
object by obtaining and saving ranges/ids via calling methods like get_local_id()
. Unfortunately, it does it on the host too! https://github.com/KhronosGroup/SYCL-CTS/pull/869 updates both tests.
This question recently came up from our implementation team. What is the precise semantics of the member functions on
item
andnd_item
? Do they return the index of the calling work-item or do they return the index of the work-item that created the object?The question is relevant only in the weird case where an
item
ornd_item
object from one work-item is copied into SLM and then referenced by another work-item. This could be done, for example, by using placement-new to copy theitem
ornd_item
object.Note that
id
may have a different semantic because it is user-constructible. The semantics ofid
seem clear. The member functions return the information that was used to construct theid
object. Therefore, if work-item A copies itsid
object to SLM and then another work-item B calls its member functions, I presume work-item B would get the index of work-item A.