Inspired by FourierKAN, we use Taylor series expansions to accomplish tasks on the MNIST dataset. The code currently handles only a small task; we are working on more complex tasks and on further optimizing the Taylor method.
This project is inspired by the FourierKAN implementations in the following repositories:
We conduct experiments on the MNIST dataset and on function-fitting tasks with several neural networks, including a traditional MLP, a CNN, TaylorKAN, and FourierKAN. The objective is to evaluate and compare these models in terms of training loss, test loss, training accuracy, test accuracy, and total training time.
The following models were trained and evaluated:
MLP: A Multi-Layer Perceptron with two hidden layers.
CNN: A Convolutional Neural Network with two convolutional layers (a sketch of both baselines follows this list).
3Order_TaylorNN: A TaylorKAN with order 3.
2Order_TaylorNN: A TaylorKAN with order 2.
CNNFourierKAN: A CNN with FourierKAN Layers.
2Order_TaylorCNN: A CNN with 2-order TaylorKAN Layers.
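The exact layer sizes of the baselines are not listed in this README. As a reference point, here is a minimal sketch of the MLP and CNN baselines, where the hidden widths and channel counts are illustrative assumptions rather than the repository's exact configuration:

```python
import torch.nn as nn

# Hidden sizes and channel counts below are assumptions for illustration.
mlp = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128), nn.ReLU(),  # hidden layer 1 (assumed width)
    nn.Linear(128, 64), nn.ReLU(),       # hidden layer 2 (assumed width)
    nn.Linear(64, 10),                   # 10 digit classes
)

cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),           # 28x28 -> 7x7 after two 2x2 pools
)
```

The Taylor variants replace such layers with the `TaylorLayer` defined later in this README.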
The MNIST dataset, consisting of 28x28 grayscale images of handwritten digits, was used for training and evaluation.
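For reference, a minimal loading sketch with torchvision; the normalization constants are the commonly used MNIST mean and standard deviation, an assumption rather than this project's verified setting:

```python
import torch
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    # Standard MNIST mean/std; assumed, not taken from this repository.
    transforms.Normalize((0.1307,), (0.3081,)),
])
train_set = datasets.MNIST("./data", train=True, download=True, transform=transform)
test_set = datasets.MNIST("./data", train=False, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=1000)
```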
Figure 1: Loss & accuracy over epochs for the different models.
Target: $f(x_1,x_2,x_3,x_4)=\exp\left(\sin(x_1^2+x_2^2)+\sin(x_3^2+x_4^2)\right)$
Figure 2: Function 1 fitting of MLP.
Figure 3: Function 1 fitting of CNN.
Figure 4: Function 1 fitting of 3Order_TaylorKAN.
Figure 5: Function 1 fitting of 3Order_TaylorMLP.
Target: $f(x_1,x_2,x_3,x_4)=\exp\left(\sin(x_1^2+x_2^2)+\sin(x_3^2+x_4^2)+x_1^5+x_2^4 \cdot x_3^3+\log(1+|x_4|)\right)$
Figure 6: Function 2 fitting of MLP.
Figure 7: Function 2 fitting of CNN.
Figure 8: Function 2 fitting of 3Order_TaylorKAN.
Figure 9: Function 2 fitting of 3Order_TaylorMLP.
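Both target functions translate directly into PyTorch. The sketch below shows how fitting data could be generated; the sampling range is an assumption rather than the experiments' exact setting:

```python
import torch

def target_1(x):
    # f = exp(sin(x1^2 + x2^2) + sin(x3^2 + x4^2))
    x1, x2, x3, x4 = x.unbind(dim=-1)
    return torch.exp(torch.sin(x1**2 + x2**2) + torch.sin(x3**2 + x4**2))

def target_2(x):
    # f = exp(sin(x1^2 + x2^2) + sin(x3^2 + x4^2) + x1^5 + x2^4 * x3^3 + log(1 + |x4|))
    x1, x2, x3, x4 = x.unbind(dim=-1)
    return torch.exp(torch.sin(x1**2 + x2**2) + torch.sin(x3**2 + x4**2)
                     + x1**5 + x2**4 * x3**3 + torch.log1p(x4.abs()))

# Inputs sampled uniformly from [-1, 1]^4 (an assumed range).
x = torch.rand(10000, 4) * 2 - 1
y1, y2 = target_1(x), target_2(x)
```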
The `TaylorLayer` class computes a learnable Taylor-style polynomial expansion of each input dimension up to the specified order.
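For an input $x \in \mathbb{R}^{d}$ and output dimension $m$, output unit $j$ evaluates

$$y_j = b_j + \sum_{i=1}^{d} \sum_{k=0}^{\text{order}-1} c_{j,i,k}\, x_i^{k},$$

so the loop below runs over exponents $0$ through ${\rm order}-1$: an order-3 layer uses the constant, linear, and quadratic terms of each input dimension. The class is defined as follows: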
```python
import torch
import torch.nn as nn

class TaylorLayer(nn.Module):
    def __init__(self, input_dim, out_dim, order, addbias=True):
        super(TaylorLayer, self).__init__()
        self.input_dim = input_dim
        self.out_dim = out_dim
        self.order = order
        self.addbias = addbias
        # Learnable coefficients: one (input_dim x order) set per output unit
        self.coeffs = nn.Parameter(torch.randn(out_dim, input_dim, order) * 0.01)
        if self.addbias:
            self.bias = nn.Parameter(torch.zeros(1, out_dim))

    def forward(self, x):
        shape = x.shape
        outshape = shape[0:-1] + (self.out_dim,)
        x = torch.reshape(x, (-1, self.input_dim))
        # Broadcast the input against every output unit: (batch, out_dim, input_dim)
        x_expanded = x.unsqueeze(1).expand(-1, self.out_dim, -1)
        y = torch.zeros((x.shape[0], self.out_dim), device=x.device)
        # Accumulate the polynomial terms x^0, x^1, ..., x^(order-1)
        for i in range(self.order):
            term = (x_expanded ** i) * self.coeffs[:, :, i]
            y += term.sum(dim=-1)
        if self.addbias:
            y += self.bias
        y = torch.reshape(y, outshape)
        return y
```
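A quick shape check of the layer (the dimensions here are illustrative):

```python
layer = TaylorLayer(input_dim=4, out_dim=16, order=3)
x = torch.randn(32, 4)   # batch of 32 four-dimensional inputs
print(layer(x).shape)    # torch.Size([32, 16])

# Layers compose like ordinary modules; this stacking is a sketch, not
# necessarily the exact TaylorKAN architecture used in the experiments.
model = nn.Sequential(TaylorLayer(4, 16, 3), TaylorLayer(16, 1, 3))
```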
This project is licensed under the MIT License.