Closed: AmitMY closed this issue 1 week ago
I'm having the same problem too. QAQ
Retraining after symbolic regression can indeed be tricky sometimes, because of this line:
fixing (0,0,1) with log, r2=0.9995437326827541
The logarithm is not defined if its input is <= 0. Sometimes switching the optimizer from LBFGS to Adam helps; this can be done with model.train(dataset, opt='Adam'). I mentioned this problem in the examples, and it is unclear to me how to fix it in general.
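To see the failure mode concretely, here is a minimal stand-alone illustration (plain PyTorch, not pykan code):

```python
import torch

x = torch.tensor([-0.5, 0.0, 1.0])
print(torch.log(x))  # tensor([nan, -inf, 0.])
# A single nan/-inf in an activation is enough: it propagates through
# the loss, and every gradient (and hence every parameter) becomes nan.
```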
However, there are two hacky ways to fix this. If you don't have strong reasons to keep the potentially singular functions, you could:
(1) inspect and replace the functions that can lead to singularities (log, sqrt, x^-1, etc.) via model.fix_symbolic(0,0,1,f), where f can be any function near the top of the list returned by model.suggest_symbolic(0,0,1). For example, if you find that x^2 also fits (0,0,1) quite well, you may run fix_symbolic(0,0,1,'x^2'), and then retraining should not have a problem because x^2 is not singular (see the sketch after this list);
(2) remove the singular functions from your symbolic library in the first place, e.g., confine your symbolic formulas to sin, x^2, and exp:
lib = ['sin', 'x^2', 'exp']
model.auto_symbolic(lib=lib)
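Putting both workarounds together, a rough sketch (using the pykan calls quoted above; the (0,0,1) indices are just the example edge from this thread):

```python
# Workaround (1): replace a singular symbolic function by hand.
# Inspect candidate functions for edge (layer=0, input=0, output=1) ...
model.suggest_symbolic(0, 0, 1)   # prints a ranked list of candidate functions
# ... and pin the edge to a non-singular one that still fits well:
model.fix_symbolic(0, 0, 1, 'x^2')

# Workaround (2): exclude singular functions from the start.
lib = ['sin', 'x^2', 'exp']       # no log, sqrt, 1/x -> no singularities
model.auto_symbolic(lib=lib)

# Retraining after either workaround should no longer produce nan.
model.train(dataset, opt='Adam', steps=50)
```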
I had the same problem when I tried to run hellokan.ipynb, but it worked with the optimizer set to Adam.
model.train(dataset, opt="Adam", steps=50);
I think the nan values come from overflow after a few iterations under some setups, for example when a large k is used; the following lines then report an illegal value:
c:\...\pykan\venv\Lib\site-packages\kan\KANLayer.py:220, in KANLayer.update_grid_from_samples(self, x)
[218] grid_uniform = torch.cat([grid_adaptive[:, [0]] - margin + (grid_adaptive[:, [-1]] - grid_adaptive[:, [0]] + 2 * margin) * a for a in np.linspace(0, 1, num=self.grid.shape[1])], dim=1)
[219] self.grid.data = self.grid_eps * grid_uniform + (1 - self.grid_eps) * grid_adaptive
--> [220] self.coef.data = curve2coef(x_pos, y_eval, self.grid, self.k, device=self.device)
...
[134] mat = B_batch(x_eval, grid, k, device=device).permute(0, 2, 1)
--> [135] coef = torch.linalg.lstsq(mat.to('cpu'), y_eval.unsqueeze(dim=2).to('cpu')).solution[:, :, 0] # sometimes 'cuda' version may diverge
[136] return coef.to(device)
Not sure how this can occur though; there is way too much code to go through. xd
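One way to narrow down where the illegal values first appear: a small diagnostic sketch (plain PyTorch; assert_finite is a hypothetical helper, not part of pykan):

```python
import torch

def assert_finite(name, t):
    # Fail fast at the first tensor containing nan/inf, instead of
    # letting torch.linalg.lstsq receive an already-corrupted input.
    if not torch.isfinite(t).all():
        raise ValueError(f"{name} contains nan/inf")

# e.g. call just before the lstsq in curve2coef:
# assert_finite("mat", mat); assert_finite("y_eval", y_eval)
```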
I train a KAN using 19 inputs, 5 hidden neurons, and 1 output (I know that I only need a subset of these inputs, and I was hoping the KAN would tell me which). I train by refining the grid over 5, 10, 20, 50, with sub-optimal results. I run auto_symbolic as shown in the tutorial, and all of the nodes get "fixed". But then, when training again, I immediately get nan. If I only train once, with grid size 5, it does train even after auto_symbolic.
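For reference, a sketch of the grid-refinement loop I am describing (following the pattern from the pykan tutorials; the width and seed values are placeholders for my 19-5-1 setup):

```python
from kan import KAN

# dataset: dict with 'train_input'/'train_label' etc., e.g. from kan.create_dataset
grids = [5, 10, 20, 50]
for i, g in enumerate(grids):
    if i == 0:
        model = KAN(width=[19, 5, 1], grid=g, k=3, seed=0)
    else:
        # Transfer the trained splines onto a finer grid before retraining.
        model = KAN(width=[19, 5, 1], grid=g, k=3, seed=0).initialize_from_another_model(
            model, dataset['train_input'])
    model.train(dataset, opt='LBFGS', steps=50)
```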