Closed YuchenJin closed 3 years ago
cc @kkraus14 @leofang @prigoyal @zdevito @rgommers @oleksandr-pavlyk
I think it would be feasible to add kDLGPU as a synonym to kDLCUDA for some time should any compatibility concerns arise. I am in favor of renaming otherwise.
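To make the alias idea concrete, here is a minimal C sketch (my own illustration, not the actual dlpack.h; the entries and numeric values are what I recall from the header and should be double-checked) of what a transitional DLDeviceType could look like:

```c
/* Illustrative excerpt of DLDeviceType after the rename.
 * Numeric values are unchanged, so the ABI stays the same. */
typedef enum {
  kDLCPU = 1,
  kDLCUDA = 2,              /* was kDLGPU */
  kDLCUDAHost = 3,          /* was kDLCPUPinned */
  kDLOpenCL = 4,
  kDLROCM = 10,
  /* Deprecated aliases kept for one release cycle so existing code still
   * compiles; C allows several enumerators to share a value. */
  kDLGPU = kDLCUDA,
  kDLCPUPinned = kDLCUDAHost
} DLDeviceType;
```

Only the spelling would change for downstream code; since the values stay put, this is a source-compatibility question rather than an ABI one.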
Renaming will likely bump DLPACK_VERSION, and with adoption of DLPack in the Python array API, how would a Python user (consumer library author) know what DLPack version the producer has been using?
Additionally, are there any guidelines about extending the DLDeviceType enum? Is kDLExtDev allowed to be moved, i.e. if I wanted to add kDLSycl_LevelZero_GPU and a bunch of other values to the enum, would I do that before the guard, or after?
All of the changes so far (including this one) are ABI compatible, which means data from a system using an earlier version of DLPack can be imported by one using a later version (because the numbers have not changed; we are just renaming the fields). For the same reason, we might want to avoid changing existing numbers (e.g. kDLExtDev) and instead introduce new ones. ABI-breaking changes shall be considered more seriously, and so far we have not encountered any.
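As a hedged illustration of "introduce new numbers instead of changing existing ones": a hypothetical new device type would be appended with a fresh value, while everything already published, including the kDLExtDev guard, keeps its number (the value 13 below is only a placeholder):

```c
/* Illustrative only: existing values are frozen, new entries get new numbers. */
typedef enum {
  kDLCPU = 1,
  kDLCUDA = 2,
  kDLCUDAHost = 3,
  kDLROCM = 10,
  kDLExtDev = 12,        /* existing guard value stays where it is */
  kDLCUDAManaged = 13    /* hypothetical new entry with a freshly assigned number */
} DLDeviceType;
```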
While this isn't an ABI breaking change, it will break the build of someone using kDLGPU and kDLCPUPinned.
We would assume the consumer of DLPack maintains a copy (or submodule hash) of a particular version, so the framework can choose to upgrade DLPack while updating its usage of kDLGPU to kDLCUDA. Perhaps keeping kDLGPU and kDLCPUPinned for another release cycle as aliases would resolve the source compatibility issue.
I took a quick look at PyTorch and I think this line should be fixed, but it's likely unrelated to this RFC (not sure): https://github.com/pytorch/pytorch/blob/4cb534f92ef6f5b2ec99109b0329f93a859ae831/caffe2/python/pybind_state_dlpack.cc#L11 cc: @emcastillo @asi1024 @kmaehashi for visibility
By the way, can we please add a new entry kDLCUDAManaged following this renaming (which I am supportive of for clarity, if no backward incompatibility happens)? It's very useful for a project I am about to start prototyping. Moreover, managed memory should not be viewed as a CPU-only or GPU-only entity, but as something distinct. I am surprised it hasn't been proposed yet.
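For background on why managed memory doesn't fit either kDLCPU or kDLCUDA, here is a small CUDA runtime sketch (my own illustration, unrelated to DLPack itself; it needs the CUDA toolkit to build and omits a kernel launch for brevity):

```c
#include <cuda_runtime.h>
#include <stdio.h>

/* Managed (unified) memory is allocated once and is accessible from both
 * the host and the device, so it is neither CPU-only nor GPU-only. */
int main(void) {
  float *data = NULL;
  if (cudaMallocManaged((void **)&data, 1024 * sizeof(float),
                        cudaMemAttachGlobal) != cudaSuccess) {
    fprintf(stderr, "cudaMallocManaged failed\n");
    return 1;
  }
  data[0] = 42.0f;  /* written directly from host code */
  /* A kernel launched here could read and write the same pointer. */
  cudaFree(data);
  return 0;
}
```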
> following this renaming (which I am supportive of for clarity, if no backward incompatibility happens)
On second thought, kDLCPUPinned has its own value, as both CUDA and ROCm provide pinned memory. If we are renaming this entry, I'd prefer to rename kDLCPUPinned to kDLCUDAPinned and add kDLROCMPinned (following kDLROCM).
EDIT: fix uppercase names.
I think renaming is much better than using an alias. That just drags out the work rather than reducing it, unless the intent is to keep an alias forever (which seems undesirable). And while C enums can have duplicate values, they are normally expected to be unique.
It seems to me like there is no real source compatibility issue; people just need to rename if they upgrade their vendored copy of dlpack.h.
Recently DLContext was renamed to DLDevice, and that didn't cause any issues AFAIK.
To follow up on @leofang's comment, kDLCUDACPUPinned might be less ambiguous, mainly because the memory is still allocated on the host rather than on the device. If we want a shorter name, kDLCUDAHost might be fine, since host usually refers to host memory.
> If we want a shorter name, kDLCUDAHost might be fine, since host usually refers to host memory.
I think this is fine, since pinned memory is allocated via cudaMallocHost (and likewise in ROCm/HIP).
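A tiny sketch of what that looks like in practice (illustrative only; it needs the CUDA toolkit, and the ROCm counterpart would use hipHostMalloc/hipHostFree): the buffer lives in host memory but is allocated and page-locked through the CUDA runtime, which is why kDLCUDAHost reads naturally.

```c
#include <cuda_runtime.h>
#include <stdio.h>

/* Pinned (page-locked) host memory allocated via the CUDA runtime. */
int main(void) {
  float *staging = NULL;
  if (cudaMallocHost((void **)&staging, 1024 * sizeof(float)) != cudaSuccess) {
    fprintf(stderr, "cudaMallocHost failed\n");
    return 1;
  }
  staging[0] = 1.0f;  /* ordinary host access */
  /* cudaMemcpyAsync to/from this buffer can overlap with kernel execution. */
  cudaFreeHost(staging);
  return 0;
}
```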
So, let me recap what has been discussed so far:

- kDLGPU -> kDLCUDA (rename)
- kDLCPUPinned -> kDLCUDAHost (rename)
- kDLCUDAManaged (new)
- kDLROCMHost (new)

For now we do not add kDLROCMManaged because there's no managed memory in ROCm/HIP, but it can easily be added once it's provided and needed.
EDIT: fix uppercase names.
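To show how a consumer might branch on the recapped device types, here is a rough C sketch; the consume function and its messages are hypothetical, the field names assume the header after the DLContext -> DLDevice rename, and the two proposed new entries are left commented out since they are not in the header yet:

```c
#include <dlpack/dlpack.h>
#include <stdio.h>

/* Hypothetical consumer-side dispatch; not part of DLPack itself. */
static void consume(const DLTensor *t) {
  switch (t->device.device_type) {
    case kDLCPU:      printf("plain host memory\n");       break;
    case kDLCUDA:     printf("CUDA device memory\n");      break;
    case kDLCUDAHost: printf("CUDA pinned host memory\n"); break;
    case kDLROCM:     printf("ROCm device memory\n");      break;
    /* case kDLCUDAManaged: ...  (proposed above) */
    /* case kDLROCMHost:    ...  (proposed above) */
    default:          printf("unsupported device type\n"); break;
  }
}
```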
The managed memory entry could go into a second RFC, as it would be more involved than renaming.
Thank you all for the discussion! I will create a PR to do the renaming only for now.
Implemented by #68
This RFC proposes to rename kDLGPU to kDLCUDA, and kDLCPUPinned to kDLCUDAHost. Two main reasons for this renaming: the new names make it explicit that these device types target the CUDA runtime, and PyTorch already uses torch.cuda to support CUDA tensor types, so this renaming will make DLPack more consistent with the other frameworks.

Look forward to hearing your thoughts!