IvanDrokin / torch-conv-kan

This project is dedicated to the implementation and research of Kolmogorov-Arnold convolutional networks. The repository includes implementations of 1D, 2D, and 3D convolutions with different kernels, ResNet-like and DenseNet-like models, training code based on accelerate/PyTorch, as well as scripts for experiments with CIFAR-10 and Tiny ImageNet.
MIT License

problem with splitting x and groups with kan_conv #8

Closed · suzannejin closed this issue 2 months ago

suzannejin commented 2 months ago

The line split_x = torch.split(x, self.inputdim // self.groups, dim=1) splits x into chunks of size self.inputdim // self.groups along the channel dimension. However, when groups=1,

for group_ind, _x in enumerate(split_x):
    y = self.forward_kan(_x, group_ind)

the loop above runs into an error, since

self.base_conv = nn.ModuleList([conv_class(input_dim // groups,
                                           output_dim // groups,
                                           kernel_size,
                                           stride,
                                           padding,
                                           dilation,
                                           groups=1,
                                           bias=False) for _ in range(groups)])

base_conv will only be a ModuleList of size 1, so any chunk beyond the first has no matching convolution to index.
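
For illustration, here is a minimal sketch of the mismatch with hypothetical shapes: the layer declares input_dim=4 with groups=1, while the actual input carries 8 channels.

import torch

input_dim, groups = 4, 1
x = torch.randn(2, 8, 40)  # actual input has 8 channels, not input_dim=4

# chunk size is input_dim // groups == 4, so the split yields 2 chunks,
# while base_conv holds only `groups` == 1 convolution; the second chunk's
# group_ind then falls outside the ModuleList.
split_x = torch.split(x, input_dim // groups, dim=1)
print(len(split_x))  # 2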

IvanDrokin commented 2 months ago

@suzannejin Hi there, could you share a fully reproducible code example? This error happens when there is a mismatch between the layer's declared number of input channels and the actual input. Probably I should put an assertion there =)
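
Such a check could look like the following standalone sketch (check_input_channels is a hypothetical helper name, not part of the repository):

import torch

def check_input_channels(x: torch.Tensor, input_dim: int) -> None:
    # Fail fast when the layer's declared number of input channels
    # does not match the tensor actually passed in.
    if x.size(1) != input_dim:
        raise ValueError(f"expected {input_dim} input channels, got {x.size(1)}")

check_input_channels(torch.randn(2, 8, 40), 4)  # raises ValueError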

suzannejin commented 2 months ago

You are right! The input channels were not correctly specified... However, I still got another error, even though I have now checked that all the dimensions should match (?)

grid = self.grid.view(*list([1 for _ in range(self.ndim + 1)] + [-1, ])).expand(target).contiguous().to(
RuntimeError: expand(torch.DoubleTensor{[1, 1, 12]}, size=[40, 12]): the number of sizes provided (2) must be greater or equal to the number of dimensions in the tensor (3)
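
The expand failure itself can be reproduced in isolation: expand() requires at least as many target sizes as the tensor has dimensions, so a 3-D grid cannot be expanded to a 2-D target.

import torch

grid = torch.zeros(1, 1, 12, dtype=torch.float64)  # shape from the traceback
grid.expand(40, 12)  # RuntimeError: 2 sizes provided for a 3-D tensor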

My input x has dimensions (64, 4, 40), and I'm calling a model with several KANConv1DLayer layers:

import torch
import torch.nn as nn
# KANConv1DLayer, L1 and KAN are taken from the torch-conv-kan code base

model = Model_SimpleConvKAN(
    [4, 6, 8, 10],
    input_channels=4,
    num_classes=1
)

class Model_SimpleConvKAN(nn.Module):
    def __init__(
            self,
            layer_sizes,
            kernel_size: int = 3,
            num_classes: int = 10,
            input_channels: int = 1,
            spline_order: int = 3,
            degree_out: int = 2,
            groups: int = 1,
            dropout: float = 0.0,
            dropout_linear: float = 0.0,
            l1_penalty: float = 0.0,
            norm_layer: nn.Module = nn.BatchNorm1d
    ):
        super(Model_SimpleConvKAN, self).__init__()

        self.layers = nn.Sequential(
            KANConv1DLayer(
                input_channels, 
                layer_sizes[0], 
                kernel_size=kernel_size, 
                spline_order=spline_order, 
                groups=1,
                padding=1, 
                stride=1, 
                dilation=1, 
                norm_layer=norm_layer),
            L1(KANConv1DLayer(
                layer_sizes[0],
                layer_sizes[1],
                kernel_size=kernel_size, 
                spline_order=spline_order, 
                groups=groups,
                padding=1, 
                stride=2, 
                dilation=1, 
                dropout=dropout, 
                norm_layer=norm_layer),
               l1_penalty),
            L1(KANConv1DLayer(
                layer_sizes[1], 
                layer_sizes[2], 
                kernel_size=kernel_size, 
                spline_order=spline_order, 
                groups=groups,
                padding=1, 
                stride=2, 
                dilation=1, 
                dropout=dropout, 
                norm_layer=norm_layer),
               l1_penalty),
            L1(KANConv1DLayer(
                layer_sizes[2], 
                layer_sizes[3], 
                kernel_size=kernel_size, 
                spline_order=spline_order, 
                groups=groups,
                padding=1, 
                stride=1, 
                dilation=1, 
                dropout=dropout, 
                norm_layer=norm_layer),
               l1_penalty),
            nn.AdaptiveAvgPool1d(1)  # 1-D adaptive pooling takes a single output size, not (1, 1)
        )
        if degree_out < 2:
            self.output = nn.Sequential(nn.Dropout(p=dropout_linear), nn.Linear(layer_sizes[3], num_classes))
        else:
            self.output = KAN([layer_sizes[3], num_classes], dropout=dropout_linear,
                              first_dropout=True, spline_order=spline_order)

    def forward(self, x):
        x = self.layers(x)
        x = torch.flatten(x, 1)
        x = self.output(x)
        return x

suzannejin commented 2 months ago

Hello @IvanDrokin, thank you so much for pointing out the problem with the dimensions! After solving a few issues in the way I call the model here and there, I managed to run the model!!
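
For completeness, a minimal smoke test of the fixed setup, assuming the class above (with its torch-conv-kan dependencies imported) and an input shaped (batch, channels, length) = (64, 4, 40):

import torch

x = torch.randn(64, 4, 40)
model = Model_SimpleConvKAN([4, 6, 8, 10], input_channels=4, num_classes=1)
out = model(x)
print(out.shape)  # expected: torch.Size([64, 1])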