artyom-beilis / pytorch_dlprim

DLPrimitives/OpenCL out of tree backend for pytorch
http://blog.dlprimitives.org/
MIT License

implementation for aten::max.dim_max #29

Closed eazy-f closed 8 months ago

eazy-f commented 1 year ago

Ran into a missing-implementation error during resnet34 training. The same code runs on CPU and CUDA without a problem.

Also noticed that aten::_copy_from_and_resize is missing, but I suppose a separate issue is required.

Update: to be fair, the function isn't used by the resnet34 training itself, but by a torch.max call made alongside it. Moving all auxiliary operations to the CPU allowed training to succeed, though GPU utilization seemed a bit low, at around 30%.
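A minimal sketch of that kind of workaround, assuming the torch.max call is only used for auxiliary metrics such as accuracy, so no gradient has to flow through it. The helper name `max_dim_on_cpu` and the sample tensor are hypothetical and not part of pytorch_dlprim; the example runs on a CPU tensor, but the same helper should accept a tensor living on any device:

```python
import torch


def max_dim_on_cpu(t: torch.Tensor, dim: int, keepdim: bool = False):
    """Compute torch.max along `dim` on the CPU, then move the results back.

    Hypothetical helper: it sidesteps a backend that lacks aten::max.dim_max
    by running only this reduction on the CPU.
    """
    # detach() is an assumption: the result is used for metrics only,
    # so it does not need to be part of the autograd graph.
    values, indices = torch.max(t.detach().cpu(), dim=dim, keepdim=keepdim)
    # Return the results on the device the input came from.
    return values.to(t.device), indices.to(t.device)


# Illustrative stand-in for a batch of model outputs (2 samples, 3 classes).
logits = torch.tensor([[0.1, 2.0, -1.0], [3.0, 0.5, 0.2]])
values, predicted = max_dim_on_cpu(logits, dim=1)
```

The extra device-to-host copies on every call could be one reason for reduced GPU utilization, though the thread does not confirm the cause.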

artyom-beilis commented 1 year ago

Interesting... I haven't seen it. I'm going to check. Probably new functionality.

artyom-beilis commented 8 months ago

_copy_from_and_resize is fixed, so if something does not work, feel free to reopen.