IvanaGyro opened 1 month ago
> This function can be used in cases like `myunitensor.to(Device.cuda).other().yetanother()`. I think it's ok to keep this?
Instead, we can use:

```python
my_unitensor = my_unitensor.to(Device.cuda).other_().yetanother_()
```
In the current implementation, `to_` must allocate memory when moving to a different device, so the code above is no slower. Providing an in-place `to_` may mislead users into thinking that `to_` doesn't allocate new memory.
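As a minimal sketch of this argument (the `UniTensor` class below, its `bytearray` storage, and the placeholder `other_`/`yetanother_` methods are toy stand-ins, not the library's actual implementation): crossing devices forces a fresh allocation whether the method is spelled `to` or `to_`, and reassigning the result of `to` drops the old buffer just as an in-place method would.

```python
class UniTensor:  # toy stand-in for the real class
    def __init__(self, storage=None, device="cpu"):
        # a bytearray plays the role of the underlying storage buffer
        self.storage = storage if storage is not None else bytearray(4096)
        self.device = device

    def to(self, device):
        """Return a tensor on `device`; a copy only if the device differs."""
        if device == self.device:
            return self
        # crossing devices always requires a fresh buffer on the target
        return UniTensor(bytearray(self.storage), device)

    def to_(self, device):
        """Hypothetical in-place transfer: it still allocates."""
        if device != self.device:
            # "in-place" only swaps which buffer we point at; the
            # allocation and the copy happen exactly as in to()
            self.storage = bytearray(self.storage)
            self.device = device
        return self

    def other_(self):  # placeholder in-place operations, returning
        return self    # self so that chaining keeps working

    def yetanother_(self):
        return self


t = UniTensor()
# Reassignment keeps the chained style; the buffer that held the CPU
# data is freed as soon as nothing references it, so this is no slower
# than an in-place to_().
t = t.to("cuda").other_().yetanother_()
```

Both paths allocate once and release the old buffer; the only difference is whether the handle is rebound by the caller or mutated inside the method, which is exactly why the name `to_` over-promises.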
Moving data between devices always allocates new memory, so the in-place version `to_(device)` doesn't prevent the allocation. Python and C++ share the same syntax here: reassigning the result of the `to(device)` method, as above, releases the memory used before the transfer.

Besides, managed memory is allocated when the device is a GPU. Both the CPU and GPUs can access managed memory, so switching devices may not need to allocate any memory at all if every buffer lives in managed memory. In that case, the remaining `to(device)` can effectively become an "in-place" method.
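The managed-memory scenario can be illustrated with a short sketch. This uses Numba's CUDA bindings purely as an illustration (Numba is not part of the library under discussion, and the snippet assumes a CUDA-capable GPU): a single managed allocation is addressable from both the CPU and the GPU, so neither side needs a copy, and a `to(device)` built on top of it could simply retag the tensor.

```python
import numpy as np
from numba import cuda

@cuda.jit
def scale(data, s):
    # the GPU reads and writes the managed buffer through the same pointer
    i = cuda.grid(1)
    if i < data.size:
        data[i] *= s

# one managed allocation, visible to both the CPU and the GPU
data = cuda.managed_array(1024, dtype=np.float32)
data[:] = 1.0                         # the CPU writes directly, no transfer
scale[4, 256](data, np.float32(2.0))  # the GPU updates the same memory
cuda.synchronize()
print(data[0])                        # the CPU reads the result, again no copy
```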