EdwardDixon / snake

Inspired by "Neural Networks Fail to Learn Periodic Functions and How to Fix It"
MIT License

Stability of a #3

Open itsa-mee-mario opened 2 years ago

itsa-mee-mario commented 2 years ago

Is there something that prevents a from exploding or vanishing?

I've been trying this in my own implementation and seem to be getting NaN values from snake (the cause seems to be division or multiplication by a large or tiny a).

Our code looks very similar on the surface, and I couldn't find anything that prevents this in your code.

This isn't really an issue, but I didn't know how else to ask.

Here is what my implementation looks like. The only difference I see is that you're accounting for a being None, but I'm not (a is never None in my case):

import torch
import torch.nn as nn


class SnakeActivation(nn.Module):
    '''
    Defines the snake activation function with a learnable per-feature
    parameter a and returns x + (1/a) * sin^2(a*x).
    '''
    def __init__(self, a, in_features):
        super(SnakeActivation, self).__init__()
        # One learnable a per input feature, initialized to the given value.
        self.a = nn.Parameter(
            torch.ones(in_features) * a, requires_grad=True
            )

    def snake(self, x):
        # Nothing bounds self.a here: if it drifts to zero, 1/self.a
        # becomes inf and inf * sin^2(0) gives NaN.
        return x + (1 / self.a) * torch.pow(torch.sin(self.a * x), 2)

    def forward(self, x):
        return self.snake(x)
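
A minimal sketch of the failure mode (illustrative, not from the original post): the forward pass is mathematically well-behaved as a -> 0, since sin^2(ax)/a goes to zero like a*x^2, but once a reaches exactly zero the 1/a term evaluates to inf * 0 = NaN.

import torch

# Illustrative demo: at a = 0, 1/a is inf and inf * sin^2(0) is NaN.
x = torch.linspace(-3.0, 3.0, 5)
for a in [1.0, 1e-4, 0.0]:
    a_t = torch.tensor(a)
    y = x + (1 / a_t) * torch.sin(a_t * x) ** 2
    print(a, y)  # finite for a=1.0 and a=1e-4; all-NaN for a=0.0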
lucasaraujodm commented 2 years ago

Try clipping the value of alpha to 0.05 < alpha < 50.

Also, consider reshaping your 'a' so you don't run into batch-size broadcasting problems, something like the following:

def snake(self, x):
    # Clamp alpha into a safe range so 1/alpha can neither explode nor vanish.
    alpha = torch.clamp(self.a, 0.05, 50.)
    # Reshape to (1, ..., 1, in_features) so alpha broadcasts cleanly over
    # the leading batch dimensions of x.
    exp_shape = (1,) * (x.dim() - 1)
    alpha = alpha.view(exp_shape + tuple(alpha.shape))
    return x + (1 / alpha) * torch.pow(torch.sin(alpha * x), 2)
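
For reference, a self-contained sketch folding both suggestions into one module (the class name ClampedSnake and the smoke test are illustrative, not from the thread):

import torch
import torch.nn as nn

class ClampedSnake(nn.Module):
    # Illustrative variant of SnakeActivation with the suggested clamp.
    def __init__(self, in_features, a=1.0):
        super().__init__()
        self.a = nn.Parameter(torch.ones(in_features) * a)

    def forward(self, x):
        # Keep alpha inside the suggested safe range before dividing by it.
        alpha = torch.clamp(self.a, 0.05, 50.)
        # Shape (1, ..., 1, in_features) broadcasts over batch dimensions.
        alpha = alpha.view((1,) * (x.dim() - 1) + tuple(alpha.shape))
        return x + (1 / alpha) * torch.pow(torch.sin(alpha * x), 2)

# Smoke test: even if a collapses to zero, outputs and gradients stay finite.
layer = ClampedSnake(8)
with torch.no_grad():
    layer.a.zero_()  # would make the unclamped forward pass NaN
loss = layer(torch.randn(4, 8)).sum()
loss.backward()
print(torch.isfinite(layer.a.grad).all())  # tensor(True)

One caveat with hard clamping: torch.clamp passes zero gradient whenever a sits outside the bounds, so a parameter that drifts out of range stops receiving updates; a softer alternative would be to reparameterize a (e.g. via a scaled sigmoid) if that becomes a problem.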
EdwardDixon commented 4 months ago

A really cool PR from @klae01 was just merged to main, so it's worth trying again. I'll try to get a new build of the package out this week.