ucbrise / actnn

ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
MIT License

RuntimeError: Expected kernel_size[0] * kernel_size[1] < 16 to be true, but got false. (Could this error message be improved? If so, please report an enhancement request to PyTorch.) #10

CuberrChen closed this issue 3 years ago

CuberrChen commented 3 years ago

When I don't apply ActNN, this issue does not occur. Is there a limitation on the kernel size if I want to use ActNN?

merrymercy commented 3 years ago

This error is caused by our own implementation of MaxPool2d. As the message says, our kernel requires kernel_size[0] * kernel_size[1] < 16, and it seems you are using max pooling with a larger kernel size. Currently, our kernel implementation does not support this case. You can disable our quantized max-pooling layers by deleting these lines: https://github.com/ucbrise/actnn/blob/aa4f43cce6631f53c3ed965b34b6de66bfe04c0a/actnn/actnn/module.py#L67-L72. Note that memory usage will be higher if you do.

We use our own implementation of MaxPool2d because we found that the default one in PyTorch is not memory efficient. If you know how to write CUDA kernels, you can try extending our code to support larger kernel sizes.
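
To check ahead of time whether a pooling window would trip this check, here is a minimal sketch in plain Python. The only fact it takes from this thread is the condition in the error message (kernel_size[0] * kernel_size[1] < 16). The names `as_pair`, `fits_quantized_maxpool`, and `MAX_WINDOW_ELEMENTS` are hypothetical helpers made up for illustration and are not part of ActNN or PyTorch.

```python
from typing import Tuple, Union

KernelSize = Union[int, Tuple[int, int]]

# Limit copied from the error message in this issue:
# "Expected kernel_size[0] * kernel_size[1] < 16 to be true".
MAX_WINDOW_ELEMENTS = 16


def as_pair(kernel_size: KernelSize) -> Tuple[int, int]:
    """Normalize an int or a (height, width) pair into a pair.

    Hypothetical helper; max-pooling layers commonly accept either form.
    """
    if isinstance(kernel_size, int):
        return (kernel_size, kernel_size)
    kh, kw = kernel_size
    return (int(kh), int(kw))


def fits_quantized_maxpool(kernel_size: KernelSize) -> bool:
    """Return True if the pooling window has fewer than 16 elements.

    Hypothetical helper that mirrors the condition in the error message;
    it does not call into ActNN.
    """
    kh, kw = as_pair(kernel_size)
    return kh * kw < MAX_WINDOW_ELEMENTS


if __name__ == "__main__":
    # Report which of a few example window sizes satisfy the condition.
    for ks in (2, 3, (3, 5), 4, (5, 5)):
        status = "ok" if fits_quantized_maxpool(ks) else "too large"
        print(ks, status)
```

With PyTorch installed, the same check could be applied to the `kernel_size` attribute of every `torch.nn.MaxPool2d` found by iterating over `model.modules()`, to locate the layers that would need the quantized max-pooling disabled.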

CuberrChen commented 3 years ago

Thank you very much for your detailed answer!