oneapi-src / level-zero

oneAPI Level Zero Specification Headers and Loader
https://spec.oneapi.com/versions/latest/elements/l0/source/index.html
MIT License

Sub devices and Sysman API #76

Open jpeyton52 opened 2 years ago

jpeyton52 commented 2 years ago

Maybe just a clarification is needed. But I see this in the specification:

https://spec.oneapi.io/level-zero/latest/core/PROG.html#sub-device-support

a sub-device can be partitioned into more sub-devices; e.g. down to a single slice.

Let's say I wanted to enumerate all the sub-devices on a device, does this mean I need to recursively call zeDeviceGetSubDevices until I get zero sub-devices or does the root device enumerate all possible sub-devices directly with a single call?

I ask because the Sysman API notes that

A Sysman device handle operates at the device level. If a sub-device device handle is passed to any of the Sysman functions, the result will be as if the device handle was used.

If there are sub-devices strictly under sub-devices, then what would a Sysman call with argument device.sub_device.sub_device refer to? device or device.sub_device?

jandres742 commented 2 years ago

@jpeyton52

does this mean I need to recursively call zeDeviceGetSubDevices until I get zero sub-devices or does the root device enumerate all possible sub-devices directly with a single call?

The former. This is intended to be scalable and able to expose whatever topology the device has.
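The recursive walk described above can be sketched as follows. This is a minimal model, not driver code: `TOPOLOGY`, `get_sub_devices`, and `enumerate_all` are hypothetical names, and the mock only imitates the count-then-fill calling pattern of `zeDeviceGetSubDevices` (query the count first, then fetch the handles). The two-level topology is an assumption chosen for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical topology for illustration: handle -> direct sub-device handles.
TOPOLOGY: Dict[str, List[str]] = {
    "root": ["root.0", "root.1"],
    "root.0": ["root.0.0", "root.0.1"],
    "root.1": ["root.1.0", "root.1.1"],
}


def get_sub_devices(device: str, out: Optional[List[str]] = None) -> int:
    """Mock of the count-then-fill pattern: returns the number of direct
    sub-devices and, if `out` is given, appends their handles to it."""
    children = TOPOLOGY.get(device, [])
    if out is not None:
        out.extend(children)
    return len(children)


def enumerate_all(device: str) -> List[str]:
    """Depth-first walk that recurses until a device reports zero sub-devices."""
    count = get_sub_devices(device)        # first call: query the count only
    if count == 0:
        return []                          # leaf: nothing below this handle
    children: List[str] = []
    get_sub_devices(device, children)      # second call: fetch the handles
    found: List[str] = []
    for child in children:
        found.append(child)
        found.extend(enumerate_all(child))
    return found


all_subs = enumerate_all("root")
print(all_subs)
```

A single call on the root only returns its direct children here, so the six handles are only reachable by descending one level at a time.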

For Sysman, correct: the device level referred to there is the top-level root (physical) device, not the parent device in the hierarchy. So no matter which sub-device handle you pass, the device handle mentioned here

A Sysman device handle operates at the device level. If a sub-device device handle is passed to any of the Sysman functions, the result will be as if the device handle was used.

is the root device.
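To illustrate the Sysman behaviour described above, here is a small model of "any handle resolves to the root device". The `PARENT` map and `sysman_effective_device` are hypothetical names used only for this sketch; the real driver performs this resolution internally, and the topology is assumed.

```python
from typing import Dict, Optional

# Hypothetical parent links for an assumed two-level hierarchy.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "root.0": "root",
    "root.1": "root",
    "root.0.0": "root.0",
    "root.0.1": "root.0",
    "root.1.0": "root.1",
    "root.1.1": "root.1",
}


def sysman_effective_device(handle: str) -> str:
    """Follow parent links to the top: the device a Sysman call acts on
    in this model, regardless of how deep the passed handle sits."""
    while PARENT[handle] is not None:
        handle = PARENT[handle]
    return handle


for h in ("root", "root.0", "root.0.1"):
    print(h, "->", sysman_effective_device(h))
```

In this model a call with `root.0.1` acts on `root`, not on its immediate parent `root.0`.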

jpeyton52 commented 2 years ago

Thanks! Knowing that, let the following be some possible device hierarchy:

First question: How are all the sub-sub-devices (??) numbered? Would they be sub-devices root.0.0, root.0.1, root.1.0, and root.1.1? Or would they be unique, i.e., sub-device IDs would range from 0-5 for all 6 sub-devices?

When enumerating the memory modules using Sysman, you would get four memory modules, and each one would have a single sub-device ID associated with it (https://spec.oneapi.io/level-zero/latest/sysman/api.html#_CPPv420zes_mem_properties_t). Would the sub-device ID be restricted to 0 and 1 (only the direct child sub-devices of the root device), or would it reflect the memory module's exact placement further down the hierarchy (which to me implies the sub-device IDs would have to be unique)?

jandres742 commented 2 years ago

How are all the sub-sub-devices (??) numbered? Would they be sub-devices root.0.0, root.0.1, root.1.0, and root.1.1? Or would they be unique, i.e., sub-device IDs would range from 0-5 for all 6 sub-devices?

The specification leaves some room for that, as it does not fully define how they should be enumerated. The L0 driver implementation currently numbers each sub-device within its own level, like root.0.0, root.0.1, root.1.0, and root.1.1.
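The difference between the two numbering schemes in the question can be sketched like this. The nested-list `tree`, `label_per_level`, and `flat_ids` are hypothetical and only model the naming; they are not how the driver stores or reports IDs.

```python
from typing import Dict, List

# Hypothetical hierarchy: each list is a device, its items are its sub-devices.
# Two sub-devices, each with two sub-sub-devices.
tree: List[list] = [[[], []], [[], []]]


def label_per_level(children: List[list], prefix: str = "root") -> List[str]:
    """Per-level numbering: the index restarts at 0 under every parent."""
    labels: List[str] = []
    for index, child in enumerate(children):
        label = f"{prefix}.{index}"
        labels.append(label)
        labels.extend(label_per_level(child, label))
    return labels


labels = label_per_level(tree)

# The alternative raised in the question: one unique ID per sub-device.
flat_ids: Dict[str, int] = {label: i for i, label in enumerate(labels)}

print(labels)
print(flat_ids)
```

With per-level numbering, an index such as 0 is ambiguous on its own (root.0, root.0.0, and root.1.0 all end in 0) and only becomes unique together with its path.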

When enumerating the memory modules using Sysman, you would get four memory modules, and each one would have a single sub-device ID associated with it (https://spec.oneapi.io/level-zero/latest/sysman/api.html#_CPPv420zes_mem_properties_t). Would the sub-device ID be restricted to 0 and 1 (only the direct child sub-devices of the root device), or would it reflect the memory module's exact placement further down the hierarchy (which to me implies the sub-device IDs would have to be unique)?

That would be implementation-specific. It would come down to what a sub-sub-device is in that device and how that memory is associated with it. For instance, you could have a device on which each sub-sub-device has its own independent memory, in which case the sub-device ID would refer to the direct level. But if the memory is not independent for each sub-sub-device and is instead shared across the parent sub-device (so not for root.0.0 but for root.0), then the sub-device ID would be that of the parent sub-device.

ze_bool_t onSubdevice [out] True if this resource is located on a sub-device; false means that the resource is on the device of the calling Sysman handle
uint32_t subdeviceId [out] If onSubdevice is true, this gives the ID of the sub-device

So basically, it would depend on how the device's implementation defines "location" for that resource.
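As a rough sketch of how a caller might consume those two fields, the model below groups memory modules by their reported location. `MemProps` only mirrors `onSubdevice` and `subdeviceId` from the quoted description; the class, `group_by_location`, and the four-module report are hypothetical, and the report assumes the case where memory is shared per direct child sub-device.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MemProps:
    on_subdevice: bool  # mirrors onSubdevice
    subdevice_id: int   # mirrors subdeviceId; meaningful only if on_subdevice


def group_by_location(modules: List[MemProps]) -> Dict[str, List[int]]:
    """Map each reported location to the indices of the modules found there."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for index, props in enumerate(modules):
        if props.on_subdevice:
            key = f"sub-device {props.subdevice_id}"
        else:
            key = "device"  # resource belongs to the device itself
        groups[key].append(index)
    return dict(groups)


# Hypothetical report: four modules, memory shared per direct child sub-device.
modules = [MemProps(True, 0), MemProps(True, 0), MemProps(True, 1), MemProps(True, 1)]
print(group_by_location(modules))
```

The grouping deliberately treats the ID as opaque: whether 0 means root.0 or something deeper is left to the implementation, so the caller should not infer the depth from the value alone.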

eero-t commented 10 months ago

Note that how L0 handles subdevice hierarchies has changed significantly: https://spec.oneapi.io/level-zero/latest/core/PROG.html#device-hierarchy