KindXiaoming / pykan

Kolmogorov Arnold Networks

Kernel Dying while training KAN #284

Open SajjadSA01 opened 3 weeks ago

SajjadSA01 commented 3 weeks ago

When I run the model on Colab, the session crashes saying the available RAM is full. Is there any way to reduce the memory usage?

Arjeus commented 3 weeks ago

There may already be too many parameters being optimized. For example, I accidentally set k=200 instead of the grid value, which crashed the session.

Can you elaborate on your work?

SajjadSA01 commented 3 weeks ago

I am working on a binary classification problem on a dataset with around 190,000 observations and 406 features, of which 69 are categorical variables and the rest are continuous.

This is how I defined the model and training:

```python
model = KAN(width=[X_train.size(1), 16, 2], grid=15, k=5)
results = model.train(dataset, opt="Adam", steps=20, batch=128,
                      metrics=(train_acc, test_acc),
                      loss_fn=torch.nn.CrossEntropyLoss())
```
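A quick back-of-envelope check (not from the thread; it only assumes the `model` defined above and the standard PyTorch `nn.Module` API) can show whether the configuration itself is large:

```python
# Count trainable parameters: with 406 inputs, grid=15 and k=5, each of
# the 406*16 + 16*2 spline edges carries on the order of grid+k coefficients.
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {n_params}")
```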


Arjeus commented 3 weeks ago

The number of input features is quite large, especially given the size of the grid and the value of k. To isolate the issue, try setting grid and k to lower values (grid=5 and k=3) and see if the model fits in memory. Try running it on CPU as well to check the amount of RAM used, since GPU RAM is smaller than main-memory RAM. See the sketch below.
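A minimal sketch of that suggestion, reusing `X_train` and `dataset` from the snippet above (the `device` argument is an assumption and may depend on the installed pykan version):

```python
import torch
from kan import KAN

# Same architecture as above, but with smaller grid and k, kept on CPU
# so memory usage shows up in main RAM rather than GPU RAM.
model = KAN(width=[X_train.size(1), 16, 2], grid=5, k=3, device='cpu')
results = model.train(dataset, opt="Adam", steps=20, batch=128,
                      loss_fn=torch.nn.CrossEntropyLoss())
```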

hjzhannah commented 3 weeks ago

I have a similar issue. Is there a way to run KAN on multiple CPUs?

KindXiaoming commented 6 days ago

If you update to the most recent code, using model = KAN(..., save_plot_data=False) can reduce memory usage to a large extent. But in that case, regularization is no longer allowed (meaning lamb=0.0 in model.fit(), previously known as model.train()).
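A minimal sketch of that combination (the width/grid/k values here are placeholders, not from the thread):

```python
from kan import KAN

# save_plot_data=False skips storing plotting data, which the comment
# above says reduces memory to a large extent.
model = KAN(width=[406, 16, 2], grid=5, k=3, save_plot_data=False)

# Regularization must be disabled in this mode, hence lamb=0.0.
model.fit(dataset, opt="Adam", steps=20, lamb=0.0)
```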

KindXiaoming commented 6 days ago

There's also a speed mode which you may want to try: model = model.speed(). Example: https://github.com/KindXiaoming/pykan/blob/master/tutorials/Example_2_speed_up.ipynb

It reduces memory and training time to a great extent, but regularization is not allowed and the symbolic front end is disabled in this mode.
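A minimal sketch of the speed mode (the linked tutorial has the full example; the training call here is assumed to mirror the one above):

```python
# Convert the model to speed mode; regularization and the symbolic
# front end are disabled afterwards, so lamb stays at 0.0.
model = model.speed()
model.fit(dataset, opt="Adam", steps=20, lamb=0.0)
```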