GuoMinghui07 opened this issue 4 months ago
Hi, I was able to make a potential partial fix by modifying the exec calls in MultiKAN.py and setting environment variables, but the paper mentions that "One major reason why KANs run slowly is because different activation functions cannot leverage batch computation (large data through the same function)."
Doesn't this imply there is not much benefit to using CUDA and training the model on a GPU? In my own tests I observed faster (and cheaper) training with a fast CPU (an AzureML VM with Intel® Xeon® Platinum 8272CL (Cascade Lake) processors). Perhaps I am misunderstanding something; I would just like to know whether fixing the GPU-related issues is the better long-term path, or whether it is better to focus on using a fast CPU.
Just fixed it (I think). Please try running this tutorial and let me know if it works on your end. Thanks! https://github.com/KindXiaoming/pykan/blob/master/tutorials/API_10_device.ipynb
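The gist of that tutorial is to create the model and the dataset on the same device. A minimal sketch, assuming the `KAN`/`create_dataset` signatures used in the pykan examples (older releases use `model.train(...)` instead of `model.fit(...)`):

```python
import torch
from kan import KAN, create_dataset

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Example target function; any torch function of the inputs works here.
f = lambda x: torch.sin(x[:, [0]]) * torch.sin(4 * x[:, [1]])

# Put the model and the dataset on the same device to avoid cuda:0 / cpu mismatches.
model = KAN(width=[2, 5, 1], grid=5, k=3, seed=0, device=device)
dataset = create_dataset(f, n_var=2, device=device)

model.fit(dataset, opt="LBFGS", steps=20)
```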
@KindXiaoming Thanks, it's working, but the same problem still shows up in the pruning and symbolic methods:
```python
model = model.prune()
model.auto_symbolic()
model.suggest_symbolic()
model.fix_symbolic()
...
```
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! This happens after I train the model on CUDA and then try to run some symbolic regression tasks.
For example, this is the task of solving the Helmholtz equation.
The true solution is:
f(x,y) = \sin(x) \cdot \sin(4y)
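Until the device handling inside prune() / auto_symbolic() is fixed, one workaround that may help is to move the trained model and the data back to the CPU before the symbolic steps. This is plain PyTorch device handling, not an official pykan recipe, and whether it is sufficient depends on how the internal buffers are stored; symbolic_formula() is taken from the pykan examples:

```python
# After training on CUDA, move everything back to the CPU for the symbolic steps.
model = model.to('cpu')
dataset = {key: tensor.cpu() for key, tensor in dataset.items()}

# Re-run a forward pass so any activations cached during forward are recomputed
# on the CPU, then prune and symbolify as usual.
model(dataset['train_input'])
model = model.prune()
model.auto_symbolic(lib=['sin', 'x^2', 'exp'])  # the library list is just an example
print(model.symbolic_formula())
# For the Helmholtz example above, the recovered formula should approach sin(x)*sin(4y).
```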
I had the same problem. model = model.prune(), model.auto_symbolic(), model.suggest_symbolic(), and model.fix_symbolic() don't work on CUDA.
Hi! I had the same problem with pruning the model.
Update to version 0.2.5 (the latest) and it should work now.
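If pykan was installed from PyPI (assuming the package name `pykan`), `pip install --upgrade pykan` should pull in 0.2.5; if you are running from a clone of the repository, a `git pull` is needed instead.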
When I try to train the model on the GPU I get: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
There seems to be a problem with the internal function implementation.