Open yhaddbb opened 6 months ago
Plotting is definitely possible but not planned yet. The parameters in spline_weight are the coefficients of the splines; for now, dump them out and you will be able to plot the grid on your own.
For the pruning, it still needs to be validated whether my "efficient" way of doing sparsifying regularization works as expected. If not, the trained network might be badly redundant, unlike the cases in the original paper.
Thank you for your reply
Specifically, how should we use the parameter spline_weight to draw the shape of the activation function on each edge?
I wrote some code to visualize the activation function on each edge, but it doesn't seem right. If anybody knows how to fix it, feel free to chat. 😄
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
from scipy.interpolate import BSpline

def visualize_kan(weight):
    # define B-spline parameters
    grid_size = 5
    spline_order = 3
    weights = weight
    # define knot vector
    knot_vector = np.concatenate(([-1] * spline_order, np.linspace(-1, 1, grid_size), [1] * spline_order))
    # define parameter range
    t = np.linspace(-1, 1, 100)
    # create B-spline object
    spline = BSpline(knot_vector, weights, spline_order)
    # calculate B-spline curve values
    spline_values = spline(t)
    # add the SiLU base activation as bias
    silu = nn.SiLU()
    bias = silu(torch.tensor(t))
    spline_values = spline_values + bias.numpy()
    # plot B-spline curve
    plt.figure(figsize=(8, 6))
    plt.plot(t, spline_values, label='B-spline curve')
    plt.scatter(np.linspace(-1, 1, len(weights)), weights, color='red', label='Control points')
    plt.title('B-spline Curve')
    plt.xlabel('t')
    plt.ylabel('Value')
    plt.legend()
    plt.grid(True)
    plt.show()

# kan_model: an efficient-kan KAN instance defined elsewhere;
# here out_features = 5 and in_features = 2
for layer in kan_model.layers:
    for i in range(5):
        for j in range(2):
            visualize_kan(layer.spline_weight[i][j].detach().numpy())
Hi, I used your code and it looks pretty good. Can you say what is incorrect?
I think the issue is that we only visualize the spline coefficients, whereas in the original implementation they visualize the activation function based on the pre- and post-activations (see also here). Can we access these two activation types in your code?
After a careful reading of pykan's code, I realized that EfficientKAN is perhaps hard to visualize in the same way as native pykan. To speed things up, EfficientKAN weights the results of all the B-spline basis functions across all nodes and produces the output directly. This means that spline_weight[i][j] in your code does not represent the spline coefficients of the [i][j]th edge the way native pykan does, so you cannot plot the spline function directly from spline_weight[i][j]. I think EfficientKAN perhaps gains efficiency at the cost of interpretability. If you know how to visualize EfficientKAN, please let me know.
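To make the point concrete, here is a self-contained numpy/scipy sketch (a toy model, not efficient-kan's actual code): each output of a KAN-style layer is the sum of the per-edge splines over all inputs, so once the layer has produced its combined output, the individual edge functions are no longer separable.

```python
import numpy as np
from scipy.interpolate import BSpline

rng = np.random.default_rng(0)
in_features, out_features = 2, 1
grid_size, spline_order = 5, 3
# open-uniform knot vector on [-1, 1], same construction as the snippet above
knots = np.concatenate(([-1] * spline_order,
                        np.linspace(-1, 1, grid_size),
                        [1] * spline_order))
n_coeffs = len(knots) - spline_order - 1  # number of B-spline basis functions
# one coefficient vector per (output, input) edge
coeffs = rng.normal(size=(out_features, in_features, n_coeffs))

def edge_spline(o, i, x):
    """Spline activation on the edge from input i to output o."""
    return BSpline(knots, coeffs[o, i], spline_order)(x)

def layer_output(o, x_vec):
    """Output o is the SUM of all per-edge splines: edges are merged."""
    return sum(edge_spline(o, i, x_vec[i]) for i in range(in_features))

x = np.array([0.3, -0.4])
total = layer_output(0, x)
parts = [edge_spline(0, i, x[i]) for i in range(in_features)]
assert np.isclose(total, sum(parts))
```

In this toy setup the per-edge coefficients are still stored explicitly, which is what lets us plot each edge; the efficiency trick is to evaluate the weighted sum in one fused operation.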
Yes, you are right. I also found that spline_weight[i][j] in EfficientKAN does not represent the spline coefficients of the [i][j]th node the way native pykan does. For now, I use a trick to visualize the activation function: I just directly compute the output of the KAN over a range of inputs.
t = torch.arange(-2, 2, 0.01).cuda()
fig, ax = plt.subplots(1, 1, figsize=(10, 5))
plt.plot(t.detach().cpu().numpy(), net.kan_layer(t.unsqueeze(1)).detach().cpu().numpy(), label="KAN layer", color="red")
plt.xlabel("Input")
plt.ylabel("Output")
plt.show()
If you have multiple layers, you can directly choose which layer you want to visualize:
kan_model = KAN([2, 5, 1], base_activation=nn.Identity)
# define layer
layer = 0
# define which input
input_node = 0
# define hidden node
hidden_node = 5
t = torch.arange(-2, 2, 0.01).cuda()
fig, ax = plt.subplots(1, 1, figsize=(10, 5))
plt.plot(t.detach().cpu().numpy(), kan_model.layers[layer](t.unsqueeze(1))[hidden_node][input_node].detach().cpu().numpy(), label=f"KAN layer {layer} {input_node} {hidden_node}", color="red")
plt.ylabel("Output")
I'm not sure that's correct. When using a KAN, what we usually want to visualize is the activation function on the edge from an input node to an output node, whereas this approach seems to visualize the combined function after all the activation functions are summed: plt.plot(t.detach().cpu().numpy(), net.kan_layer(t.unsqueeze(1)).detach().cpu().numpy(), label="KAN layer", color="red"). Also, this line doesn't look right: kan_model.layers[layer](t.unsqueeze(1))[hidden_node][input_node].detach().cpu().numpy(). The layer output does not seem to have the format [hidden_node][input_node].
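A quick shape check supports this point. A layer mapping in_features to out_features returns a tensor of shape (batch, out_features), so indexing it as [hidden_node][input_node] picks out a single sample, not an edge. Here is a sketch using a plain nn.Linear as a stand-in for the KAN layer (2 inputs, 5 hidden nodes):

```python
import torch
import torch.nn as nn

layer = nn.Linear(2, 5)           # stand-in for a KAN layer: 2 inputs -> 5 hidden nodes
t = torch.arange(-2, 2, 0.01)     # 400 sample points
out = layer(t.unsqueeze(1).repeat(1, 2))  # batch of shape (400, 2)
assert out.shape == (400, 5)      # (batch, out_features), not [hidden_node][input_node]
```

So out[hidden_node][input_node] would actually be sample number hidden_node, output number input_node.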
Assuming that the model has hidden layers [3, 1] and the number of variables is nvars = 5, then the relationship from a specific input column from_node_i to the target node target_node_j in the first hidden layer may be represented like this. I'm not too confident about this solution, but I hope it helps a bit.
import torch
import matplotlib.pyplot as plt

nvars = 5            # number of input variables
number_of_layer = 0  # which KAN layer to probe
from_node_i = 2      # input node (column) to sweep
target_node_j = 1    # target node to read off

arr = torch.arange(-1, 1, 0.01)
N = arr.shape[0]
# batch where only the chosen input varies and all other inputs are zero
X = torch.zeros(N, nvars)
X[:, from_node_i] = arr
Y = model.layers[number_of_layer](X)  # model: your efficient-kan KAN instance
Y = Y[:, target_node_j]
plt.plot(arr.detach().numpy(), Y.detach().numpy())
plt.show()
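One caveat with this column-sweep trick, illustrated below with a toy scipy spline layer rather than efficient-kan itself: the zeroed inputs still contribute their own spline value at 0, so the curve you obtain is the target edge's activation plus a constant offset, not the edge function alone.

```python
import numpy as np
from scipy.interpolate import BSpline

rng = np.random.default_rng(1)
grid_size, spline_order = 5, 3
knots = np.concatenate(([-1] * spline_order,
                        np.linspace(-1, 1, grid_size),
                        [1] * spline_order))
n_coeffs = len(knots) - spline_order - 1
# two input edges feeding one output node
c0, c1 = rng.normal(size=(2, n_coeffs))
edge0 = BSpline(knots, c0, spline_order)
edge1 = BSpline(knots, c1, spline_order)

xs = np.linspace(-1, 1, 50)
swept = edge0(xs) + edge1(0.0)   # what the column sweep actually measures
offset = edge1(0.0)              # constant contribution of the zeroed input
assert np.allclose(swept - edge0(xs), offset)  # edge0 recovered only up to an offset
```

Since the offset is constant, the shape of the edge activation is still recovered correctly; only its vertical position is shifted.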
I notice that most image classification tasks are based on efficient-kan instead of the original kan. I want to know if it is possible to plot and prune efficient-kan just like the examples in the original kan.