Open smilealvin92 opened 6 months ago
Ok let me check this.
Can you give a very small sample that reproduces it?
Just running mnist.py is enough to reproduce this error, with native_batch_norm and its backward operator commented out.
Thanks for your attention.
Hi, I want to give an update on this error. If we comment out native_batch_norm and native_batch_norm_backward in norm_ops.cpp, the training phase runs fine, but the test phase fails in the forward pass of the BN operator. The direct error is "Buffer is not valid for unallocated device". It is raised because the `to` call in the line "c10::IValue(returns[idx].toTensor().to(*tgt_device));" in pytorch/aten/src/ATen/native/CPUFallback.cpp triggers a copy to the device. Maybe I should submit this issue to the PyTorch repo, since the error happens in the fallback path. I hope you can try to reproduce it. ![image](https://github.com/artyom-beilis/pytorch_dlprim/assets/48876213/1bb2d5b5-b795-47b6-bd33-bc24c38b445b)
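
To make the failure path described above easier to discuss, here is a toy pure-Python model of the fallback sequence: move the inputs to CPU, run the op there, then copy each result back to the target device. This is only a sketch, not PyTorch code. `ToyTensor`, `toy_to`, `toy_cpu_fallback` and the device name `"toy_device"` are all made up for illustration. The condition that triggers the error here (an output with no allocated buffer) is an assumption; this thread does not establish the real cause.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ToyTensor:
    # data=None models a tensor whose buffer was never allocated (assumption).
    data: Optional[List[float]]
    device: str


def toy_to(t: ToyTensor, device: str) -> ToyTensor:
    """Hypothetical stand-in for Tensor.to(device)."""
    if t.device == device:
        return t
    # Assumed failure mode: copying an unallocated buffer to a non-CPU device.
    if t.data is None and device != "cpu":
        raise RuntimeError("Buffer is not valid for unallocated device")
    return ToyTensor(None if t.data is None else list(t.data), device)


def toy_cpu_fallback(
    op: Callable[..., List[ToyTensor]],
    inputs: List[ToyTensor],
    tgt_device: str,
) -> List[ToyTensor]:
    """Run `op` on CPU copies of the inputs, then copy results back."""
    cpu_inputs = [toy_to(t, "cpu") for t in inputs]
    cpu_outputs = op(*cpu_inputs)
    # This step mirrors the quoted line: each return value is moved back
    # to the target device, which is where the copy happens.
    return [toy_to(out, tgt_device) for out in cpu_outputs]


def double(t: ToyTensor) -> List[ToyTensor]:
    # An op whose output has a real buffer: the copy back succeeds.
    return [ToyTensor([2.0 * v for v in t.data], "cpu")]


def returns_unallocated(t: ToyTensor) -> List[ToyTensor]:
    # An op that returns an output without a buffer: the copy back fails.
    return [ToyTensor(None, "cpu")]


x = ToyTensor([1.0, 2.0], "toy_device")
good = toy_cpu_fallback(double, [x], "toy_device")

try:
    toy_cpu_fallback(returns_unallocated, [x], "toy_device")
    failed = False
except RuntimeError:
    failed = True
```

In this model the op itself succeeds on CPU in both cases, and only the copy back to the device raises, which matches where the error surfaces in the report.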