KindXiaoming / pykan

Kolmogorov Arnold Networks

Training and test loss values are the square root of the applied loss function #445

Open bfricz56 opened 2 months ago

bfricz56 commented 2 months ago

Hello all!

I applied a mean squared error (MSE) function to my KAN as the loss function, then out of curiosity checked how different the resulting plots would be from those of the built-in default, root mean squared error (RMSE). Both plots looked identical, with the same values. So I checked the code of MultKAN.py and found that, when recording results, it takes the square root of the loss value:

results['train_loss'].append(torch.sqrt(train_loss).cpu().detach().numpy())

This is because the part that defines the default loss function only computes the mean squared error:

if loss_fn == None:
    loss_fn = loss_fn_eval = lambda x, y: torch.mean((x - y) ** 2)
else:
    loss_fn = loss_fn_eval = loss_fn
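
To make the effect concrete, here is a small standalone sketch (toy tensors, not pykan code) showing that a custom MSE loss and the built-in default end up logged identically, because the logging step square-roots both:

import torch

# Standalone sketch (not pykan code) of the reported behavior.
pred = torch.tensor([1.0, 2.0, 3.0])
target = torch.tensor([1.5, 2.0, 2.5])

default_loss = torch.mean((pred - target) ** 2)  # built-in default: MSE
custom_loss = torch.mean((pred - target) ** 2)   # user-supplied MSE, same formula

# What ends up in results['train_loss'] in both cases: sqrt of the loss,
# i.e. RMSE either way, so the two training curves coincide.
print(torch.sqrt(default_loss).item())  # ~0.408
print(torch.sqrt(custom_loss).item())   # ~0.408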

I imagine that simply moving the square root into the loss function itself would solve this issue and avoid future problems with improper evaluation of KAN performance during training.
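
A minimal sketch of what that change might look like (my reading of the structure above, not an actual patch):

import torch

# Sketch of the proposed fix: take the square root inside the default loss
# so the logged value equals whatever loss_fn actually computes.
loss_fn = None  # stand-in for the fit() argument

if loss_fn is None:
    # default becomes RMSE explicitly, instead of MSE plus a sqrt at logging time
    loss_fn = loss_fn_eval = lambda x, y: torch.sqrt(torch.mean((x - y) ** 2))
else:
    loss_fn = loss_fn_eval = loss_fn

# ...and the logging line would then drop the extra sqrt:
# results['train_loss'].append(train_loss.cpu().detach().numpy())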

Thanks for reading!

llkun578 commented 1 month ago

Thank you for your comment. I don't understand what you mean. Isn't the default loss function of KAN just mean squared error?

EthanWang11 commented 2 days ago

Thank you, I have the same problem as you.

bfricz56 commented 17 hours ago

Ethan, if you formulate your loss function as the square of the one you actually want, the reported values will still be your original loss, since the code takes the square root afterwards. It would be tricky to use a loss that can be negative, though (e.g. R squared): squaring hides the sign, and without squaring, taking the square root of a negative loss produces invalid values (NaN for real tensors) in the reported training loss.
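
Something like this (wrap_for_pykan is just a hypothetical helper to illustrate, not part of pykan):

import torch

# Sketch of the workaround: pass the square of the loss you want, so the
# sqrt applied at logging time recovers the original value.
def wrap_for_pykan(loss_fn):
    return lambda x, y: loss_fn(x, y) ** 2

mae = lambda x, y: torch.mean(torch.abs(x - y))
wrapped = wrap_for_pykan(mae)

pred = torch.tensor([1.0, 2.0])
target = torch.tensor([0.0, 4.0])
print(torch.sqrt(wrapped(pred, target)).item())  # 1.5, the original MAE

# Pitfall with losses that can be negative (e.g. a score like R squared):
# squaring hides the sign, and without squaring, torch.sqrt of a negative
# real tensor yields NaN in the logged results.
print(torch.sqrt(torch.tensor(-0.5)))  # tensor(nan)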