e-dorigatti / e-dorigatti.github.io

My personal blog.
https://e-dorigatti.github.io/

math/deep%20learning/2020/04/07/autodiff #13

Open utterances-bot opened 11 months ago

utterances-bot commented 11 months ago

Automatic differentiation from scratch | Emilio’s Blog

Automatic differentiation (AD) is one of the most important features present in modern frameworks such as TensorFlow, PyTorch, Theano, etc. AD has made parameter optimization through gradient descent an order of magnitude faster and easier, and drastically lowered the barrier of entry for people without a solid mathematical background. In spite of its utility, AD is surprisingly simple to implement, which is what we are going to do here.

https://e-dorigatti.github.io/math/deep%20learning/2020/04/07/autodiff.html
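
To give a flavour of what the post builds, here is a minimal reverse-mode AD sketch (a toy illustration, not the post's actual implementation): each value records which values produced it along with the local derivative of the operation, and backward() accumulates gradients through the chain rule.

class Var:
    ''' a scalar that records how it was computed, so that
        gradients can be propagated back through the chain rule '''
    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        self.parents = parents  # pairs of (parent Var, local derivative)

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, grad=1.0):
        ''' accumulates the incoming gradient and passes it to the parents '''
        self.grad += grad
        for parent, local in self.parents:
            parent.backward(grad * local)

# z = x * y + x, so dz/dx = y + 1 = 4 and dz/dy = x = 2
x, y = Var(2.0), Var(3.0)
z = x * y + x
z.backward()
print(x.grad, y.grad)  # 4.0 2.0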

fate-ubw commented 11 months ago

This blog is the best I have ever read. But there is a mistake in the fit method that trains the softmax classifier:

        # idx is shuffled here but never used below
        idx = random.sample(range(len(X)), len(X))
        for i in range(0, len(X), self._batch_size):
            loss = self._sgd_step(
                X[i:i+self._batch_size],
                y[i:i+self._batch_size],
                lr=self._base_lr / math.sqrt(1 + epoch)
            )

The batch data X[i:i+self._batch_size] and y[i:i+self._batch_size] is wrong, because idx is computed but never used: each batch should be randomly sampled from the source data. Here is my version of the fit method:

def fit(self, X, y):
    ''' trains the model on the given data '''
    history = []
    for epoch in range(self._epochs):
        # shuffle the sample indices at the start of every epoch
        idx = random.sample(range(len(X)), len(X))
        for i in range(0, len(X), self._batch_size):
            # take the next slice of shuffled indices as the batch
            random_idx = idx[i:i+self._batch_size]
            loss = self._sgd_step(
                np.array([X[j] for j in random_idx]),
                np.array([y[j] for j in random_idx]),
                lr=self._base_lr / math.sqrt(1 + epoch)
            )
            history.append(loss)

        if epoch % 5 == 0:
            print(f'epoch: {epoch}\tloss: {loss:.3f}')

    return history
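
By the way, if X and y are already NumPy arrays, the list comprehensions can be replaced with fancy indexing, and np.random.permutation does the shuffling; a sketch assuming the same _epochs, _batch_size, _base_lr and _sgd_step as above:

import math
import numpy as np

def fit(self, X, y):
    ''' trains the model on the given data '''
    history = []
    for epoch in range(self._epochs):
        idx = np.random.permutation(len(X))  # shuffled indices for this epoch
        for i in range(0, len(X), self._batch_size):
            batch = idx[i:i + self._batch_size]
            loss = self._sgd_step(
                X[batch], y[batch],  # fancy indexing selects the batch rows
                lr=self._base_lr / math.sqrt(1 + epoch)
            )
            history.append(loss)

        if epoch % 5 == 0:
            print(f'epoch: {epoch}\tloss: {loss:.3f}')

    return history
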
e-dorigatti commented 11 months ago

Hi @fate-ubw, I am glad you liked my post! And thank you for the correction, indeed random sampling is the best thing to do. If you do not mind, I updated my post with your suggestion ^^

fate-ubw commented 11 months ago

Sure! I have learned a lot from your post, waiting for your next post~