GokuMohandas / Made-With-ML

Learn how to design, develop, deploy and iterate on production-grade ML applications.
https://madewithml.com
MIT License

Foundations -> Utilities: Errors and questions #201

Closed: gitgithan closed this issue 3 years ago

gitgithan commented 3 years ago
  1. Under def predict_step, z = F.softmax(z).cpu().numpy() is shown on the webpage. The notebook correctly assigns it to y_prob = F.softmax(z).cpu().numpy(), though.

  2. There is an extra single quote after "k" that causes a SyntaxError: plt.scatter(X[:, 0], X[:, 1], c=[colors[_y] for _y in y], s=25, edgecolors="k"') (this happens once here and twice on the Data Quality page).

  3. Why was the softmax manually calculated in the NumPy section of the Neural Networks page, but here in def train_step the raw logits are passed directly at

    z = self.model(inputs)  # Forward pass
    J = self.loss_fn(z, targets)  # Define loss

    without apply_softmax=True? (See the CrossEntropyLoss sketch after this list.)

  4. Why did train_step's loss need J.detach().item(), while eval_step used J directly without detach and item?

  5. In the collate_fn, batch = np.array(batch, dtype=object) is used, but I didn't understand why we convert to object. Adding a note on what happens without it (VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated.) would be very helpful in preparing students for ragged tensors and padding in the CNN/RNN lessons later. (See the dtype=object sketch after this list.)

  6. I was wondering why we stack both X and y. It seems stacking X is necessary because without it, float casting in X = torch.FloatTensor(X.astype(np.float32)) breaks with ValueError: setting an array element with a sequence., since batch[:, 0] indexing creates a nested NumPy object array that can't be cast. That nesting never happens for y via batch[:, 1], because y starts out as a 1-D object array, so casting works and there's no need to stack y? (Same for stacking y in the CNN lesson.) This question came about when going through the CNN lesson and wondering why there was no X stacking there. Then I realized int casting worked there because padded_sequences is created with np.zeros and so starts out without nesting, and NumPy was able to implicitly flatten the sequence array during padded_sequences[i][:len(sequence)] = sequence. (See the np.stack sketch after this list.)
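
Regarding question 3, here is a minimal standalone sketch (shapes and values are made up, and this is not the lesson's Trainer code) of why raw logits can go straight into the loss: torch.nn.CrossEntropyLoss applies log-softmax internally, so softmax is only needed when you want actual probabilities, e.g. for prediction.

    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    z = torch.randn(4, 3)                  # raw logits: batch of 4, 3 classes (hypothetical)
    targets = torch.tensor([0, 2, 1, 2])   # class indices

    # CrossEntropyLoss applies log-softmax internally, so it expects raw logits
    loss_fn = torch.nn.CrossEntropyLoss()
    J = loss_fn(z, targets)

    # Equivalent formulation with the internal softmax written out explicitly
    J_manual = F.nll_loss(F.log_softmax(z, dim=1), targets)
    print(torch.allclose(J, J_manual))     # True

    # Softmax is only needed when probabilities themselves are wanted (e.g. predict_step)
    y_prob = F.softmax(z, dim=1)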
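
Regarding question 5, a small demo (hypothetical values; the exact behavior depends on the NumPy version) of what dtype=object changes when a batch mixes sequences with scalar labels or contains ragged sequences:

    import numpy as np

    # A batch as the DataLoader hands it to collate_fn: each item pairs a
    # feature sequence with a scalar label (ragged lengths here, made-up values)
    batch = [([1, 2, 3], 0), ([4, 5], 1)]

    # Without dtype=object, NumPy has to guess a rectangular shape; depending on
    # the NumPy version this emits the VisibleDeprecationWarning quoted in
    # question 5 above or raises a ValueError:
    # np.array(batch)

    # With dtype=object, each cell just stores a reference to the original Python
    # object, so mixed/ragged shapes are fine and column indexing still works
    arr = np.array(batch, dtype=object)
    print(arr.shape)   # (2, 2)
    print(arr[:, 0])   # [list([1, 2, 3]) list([4, 5])] -> the inputs
    print(arr[:, 1])   # [0 1] -> the labels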
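
Regarding question 6, a rough sketch of the same collate idea (made-up values, not the lesson's exact collate_fn) showing why np.stack is needed for X while y can be cast directly:

    import numpy as np
    import torch

    # Fixed-length 2-D points with integer labels, as in this lesson (hypothetical values)
    batch = np.array([([0.1, 0.2], 0), ([0.3, 0.4], 1), ([0.5, 0.6], 0)], dtype=object)

    X = batch[:, 0]   # 1-D object array whose elements are lists -> "nested"
    y = batch[:, 1]   # 1-D object array of plain ints -> no nesting

    # Casting the nested object array directly fails:
    # torch.FloatTensor(X.astype(np.float32))
    # -> ValueError: setting an array element with a sequence.

    # np.stack turns the array-of-lists into a proper (batch_size, 2) numeric array
    X = np.stack(X, axis=0)
    X = torch.FloatTensor(X.astype(np.float32))   # works: shape (3, 2)

    # y never contained nested sequences, so it casts without stacking
    y = torch.LongTensor(y.astype(np.int32))      # shape (3,)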

GokuMohandas commented 3 years ago
  1. good catch. I was fixing notations and missed this one.
  2. fixed them all.
  3. I definitely need to go fix all of these. I'll be removing that apply_softmax flag altogether and just using z and y_prob to differentiate between logits and probabilities (softmax applied to logits). I'll move the softmax outside of the forward pass. I started doing this in the MLOps lessons but haven't gone back and edited these yet (or have only replaced a few of them). I'll at least update the webpage now, since I'll be moving those directly into new notebooks this winter.
  4. I'm using detach to break any gradient attachment, since that cumulative loss doesn't need to be part of the computational graph for backprop. During eval, no gradient bookkeeping is done regardless. (See the sketch below.)
  5. I think this used to be required for a previous version of NumPy that Colab used. Basically, using dtype=object means you're pointing to objects stored somewhere else instead of storing the values directly. But for this version of NumPy, you can remove it.
  6. Yup, you got it. It's necessary for X; for y I just do it for visual consistency but don't need it. I fixed this on the webpage now and will change the notebooks this winter.
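
To illustrate the pattern described in answer 4, here is a minimal sketch (hypothetical model and data, not the lesson's Trainer): detach().item() extracts a plain float in the train step, while a no-grad context during evaluation means no graph is recorded in the first place.

    import torch
    import torch.nn as nn

    # Tiny stand-in model and data (hypothetical, not the lesson's MLP)
    model = nn.Linear(4, 3)
    loss_fn = nn.CrossEntropyLoss()
    inputs, targets = torch.randn(8, 4), torch.randint(0, 3, (8,))

    # Train step: J is attached to the computational graph, so detach().item()
    # pulls out a plain Python float for the running loss without keeping the
    # graph (and its memory) alive
    model.train()
    z = model(inputs)
    J = loss_fn(z, targets)
    J.backward()
    train_loss = J.detach().item()

    # Eval step: inside no_grad (or inference_mode) no graph is recorded at all,
    # so J carries no gradient history and can be accumulated directly
    model.eval()
    with torch.no_grad():
        z = model(inputs)
        J = loss_fn(z, targets)
        eval_loss = J.item()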