Closed: gitgithan closed this issue 3 years ago
I'm removing the `apply_softmax` flag altogether and just using `z` and `y_prob` to differentiate between logits and probabilities (softmax applied to logits). I'm moving softmax outside of the forward pass. I started doing this in the MLOps lessons but haven't gone back and edited these yet (or have replaced just a few of them). I'll at least update the webpage now, since I'll be moving those directly into new notebooks this winter.
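For context, here is a minimal sketch of that convention, with softmax kept outside of `forward` (the model, layer sizes, and data below are illustrative, not the lesson's exact code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MLP(nn.Module):
    """Toy classifier whose forward() returns raw logits only."""
    def __init__(self, input_dim, hidden_dim, num_classes):
        super().__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, num_classes)

    def forward(self, x_in):
        return self.fc2(F.relu(self.fc1(x_in)))  # logits, no softmax here

model = MLP(input_dim=2, hidden_dim=10, num_classes=3)
X = torch.randn(4, 2)
z = model(X)                   # z: raw logits
y_prob = F.softmax(z, dim=1)   # y_prob: probabilities, applied outside forward()
```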
Under `def predict_step`, `z = F.softmax(z).cpu().numpy()` is shown on the webpage, though the notebook correctly assigns to `y_prob = F.softmax(z).cpu().numpy()`.

There is an extra single quote after `"k"` that causes a SyntaxError (this happens once here and twice on the Data Quality page):
plt.scatter(X[:, 0], X[:, 1], c=[colors[_y] for _y in y], s=25, edgecolors="k"')
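For reference, the intended call presumably just drops the stray quote (this reuses the `X`, `y`, and `colors` already defined on that page):

```python
plt.scatter(X[:, 0], X[:, 1], c=[colors[_y] for _y in y], s=25, edgecolors="k")
```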
Why did the softmax get manually calculated in the NumPy section of the Neural Networks page, but here in `def train_step` the raw logits were passed directly, without `apply_softmax=True`?
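I assume `train_step` feeds those logits to `nn.CrossEntropyLoss` (standard for PyTorch classification code, though I'm inferring that here), which wants raw logits because it applies log-softmax internally; the NumPy section has to compute the softmax by hand since no such built-in exists there. A quick sketch of the equivalence:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

loss_fn = nn.CrossEntropyLoss()

z = torch.randn(4, 3)            # raw logits: 4 samples, 3 classes
y = torch.tensor([0, 2, 1, 2])   # true class indices

# CrossEntropyLoss applies log-softmax + negative log-likelihood internally...
loss_from_logits = loss_fn(z, y)

# ...so it matches doing the log-softmax explicitly and then NLL loss.
loss_manual = F.nll_loss(F.log_softmax(z, dim=1), y)

print(torch.allclose(loss_from_logits, loss_manual))  # True
```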
Why did `train_step`'s loss need `J.detach().item()` but `eval_step` used `J` directly, without `detach` and `item`?
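My reading (an inference, not something stated in the lesson): in `train_step`, `J` is still attached to the autograd graph, so accumulating it as a tensor would keep every batch's graph alive; `.detach().item()` turns it into a plain Python float first. In `eval_step` the forward pass runs without gradient tracking, so `J` carries no graph and can be used directly. A small illustration:

```python
import torch
import torch.nn as nn

model = nn.Linear(2, 3)
loss_fn = nn.CrossEntropyLoss()
X, y = torch.randn(8, 2), torch.randint(0, 3, (8,))

# Training-style step: the loss is part of the autograd graph.
J = loss_fn(model(X), y)
print(J.requires_grad)          # True -> accumulate J.detach().item(), not J
train_loss = J.detach().item()  # plain float; the graph can now be freed

# Eval-style step: no graph is built, so using J directly is harmless.
with torch.inference_mode():
    J = loss_fn(model(X), y)
    print(J.requires_grad)      # False
    eval_loss = J.item()
```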
In the `collate_fn`, `batch = np.array(batch, dtype=object)` was used, but I didn't understand why we convert to `object`. Adding a note on what happens without it (`VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated.`) would be very helpful in preparing students for ragged tensors and padding in CNN/RNN later.
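For reference, a tiny reproduction of that behavior with made-up ragged data (on recent NumPy releases, roughly 1.24 and later, the implicit conversion is no longer just deprecated; it raises a `ValueError`):

```python
import numpy as np

# Each "sample" pairs a variable-length sequence with a label, like collate_fn's batch.
batch = [(np.array([1, 2, 3]), 0),
         (np.array([4, 5]), 1)]

# With dtype=object: a (2, 2) object array, so batch[:, 0] / batch[:, 1] indexing works.
batch_obj = np.array(batch, dtype=object)
print(batch_obj.shape)  # (2, 2)

# Without dtype=object: NumPy cannot build a regular rectangular array from ragged
# sequences -> VisibleDeprecationWarning on older NumPy, ValueError on newer NumPy.
try:
    np.array(batch)
except ValueError as err:
    print(err)
```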
I was wondering why we stack X and y. It seems that stacking X is necessary because without it, the float casting in `X = torch.FloatTensor(X.astype(np.float32))` breaks with `ValueError: setting an array element with a sequence.`, since the `batch[:, 0]` indexing creates a nested array of NumPy array objects that can't be cast. That nesting will not occur for y during `batch[:, 1]`, because y began as a 1-D object already, so there is no nested array, no casting problem, and therefore no need to stack y? (same for CNN stacking y) This question came about when going through the CNN lesson and wondering why there was no X stacking there. Then I realized the int casting worked there because `padded_sequences = np.zeros(...)` began without nesting, and NumPy was also able to implicitly flatten the `sequence` array during `padded_sequences[i][:len(sequence)] = sequence`.
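That matches my reading. Below is a small self-contained demo (the shapes and values are made up, not the lesson's data) showing why unstacked `X` breaks the float cast while `y` is fine:

```python
import numpy as np
import torch

# Two samples of 2-D features plus integer labels, packed like collate_fn's batch.
batch = np.array([(np.array([1.0, 2.0]), 0),
                  (np.array([3.0, 4.0]), 1)], dtype=object)

X = batch[:, 0]  # object array whose elements are arrays ("nested")
y = batch[:, 1]  # object array of plain ints (already flat)

# Unstacked: astype can't turn an array-of-arrays into a rectangular float array.
try:
    torch.FloatTensor(X.astype(np.float32))
except ValueError as err:
    print(err)  # setting an array element with a sequence. ...

# Stacked: a regular (2, 2) float array that casts cleanly.
X_stacked = np.stack(X)
print(torch.FloatTensor(X_stacked.astype(np.float32)).shape)  # torch.Size([2, 2])

# y never nests, so it casts without stacking.
print(torch.LongTensor(y.astype(np.int64)))  # tensor([0, 1])
```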