Patrick-Wen opened 7 months ago
Hi @Patrick-Wen,
Thank you for reporting this.
You're absolutely right, it should be "C-1" instead of "Nc-1" in the upper part of the summation. I will fix it in the notebook and add to the list of typos to be addressed in a future revision.
Thanks for supporting my work :-)
Best, Daniel
As my reading of the book continues, I'd like to bring up some possible typos I have noticed along the way.
On page 530 of the book, where the class Inception is built, I suspect that 2@HxW was mistakenly printed as 1@HxW:
Pages 695-6 state: "The hidden state, however, was entirely handled by the model itself using its hidden_state attribute", where `hidden_state` is printed in a different font from the surrounding text, suggesting it is an attribute defined in the code. However, neither the Decoder class on page 694 nor the for loop on page 695 includes `hidden_state`. If I understand correctly, `hidden_state` may need to be replaced with `hidden`, which is defined on page 694 as follows:
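The book's snippet is not reproduced here, but the point can be sketched with an editor-made toy (NOT the book's Decoder; the class body, update rule, and shapes below are invented for illustration): the loop only feeds inputs, while the state lives in an attribute named `hidden`.

```python
import numpy as np

class Decoder:
    """Toy stand-in for a decoder that keeps its state in `hidden`."""
    def __init__(self, hidden_dim):
        self.hidden = np.zeros((1, 1, hidden_dim))  # the attribute is `hidden`

    def init_hidden(self, hidden_seq):
        # keep only the final hidden state produced by the encoder
        self.hidden = hidden_seq[:, -1:, :]

    def forward(self, x):
        # a real decoder would run an RNN cell here; this toy update just
        # shows the state being read from and written to self.hidden
        self.hidden = np.tanh(self.hidden + x)
        return self.hidden

decoder = Decoder(hidden_dim=2)
decoder.init_hidden(np.zeros((1, 3, 2)))      # encoder output: N=1, L=3, H=2
for _ in range(2):                            # the loop never touches the state;
    out = decoder.forward(np.ones((1, 1, 2)))  # it is handled via decoder.hidden
print(decoder.hidden.shape)  # (1, 1, 2)
```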
As an average learner trying to study PyTorch seriously through this book, may I offer some personal suggestions for improvement:
Here are two figures from pages 594-5: in Figure 8.6, h0 and h1 are two different hidden layers, whereas in Figure 8.7, h0 and h1 are two dimensions of a single hidden layer. If this understanding is correct, I would say it is possibly somewhat misleading to have h0 and h1 carry different meanings in two consecutive figures. For those acquainted with RNNs, this should not be an issue at all. But for beginners in deep learning, I am not sure I am the only one who would get confused at first sight (although I was able to correct myself on second thought).
In several examples in Chapters 8 and 9, the input features have a dimension of 2 while the hidden layer dimension is also set to 2. While trying to understand the code, I changed the hidden dimension to a number other than 2. The resulting output gave me a better understanding of the shapes of the model's inputs and outputs, and ultimately a better understanding of the model. For beginners, it may be somewhat suboptimal to learn from examples whose feature dimension and hidden dimension happen to be identical.
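The suggestion above can be sketched with a toy RNN cell whose input dimension (2) and hidden dimension (3) deliberately differ, so the two sizes cannot be confused. All sizes and weights here are made up, not taken from the book:

```python
import numpy as np

n_features, hidden_dim, seq_len = 2, 3, 4  # input dim != hidden dim on purpose

rng = np.random.default_rng(0)
Wx = rng.normal(size=(hidden_dim, n_features))  # input-to-hidden weights
Wh = rng.normal(size=(hidden_dim, hidden_dim))  # hidden-to-hidden weights

h = np.zeros(hidden_dim)
X = rng.normal(size=(seq_len, n_features))
for x in X:                        # one update per element of the sequence
    h = np.tanh(Wx @ x + Wh @ h)

print(X.shape, h.shape)  # (4, 2) (3,) -- the two dimensions are now distinct
```

With the dimensions distinct, it is immediately clear which axis of each array is the feature dimension and which is the hidden dimension.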
Please let me know if my reported typos and suggestions stem from a misunderstanding of the material on my part. Thank you.
Hi @Patrick-Wen,
Thank you for pointing out these issues and for asking these questions, let me go through all of them:
It should be `hidden`, not `hidden_state`, to reflect the actual name of the attribute.
Hope it helps, and thank you for supporting my work :-)
Best, Daniel
- `tokenizer(sentence1, sentence2)` may need to be updated to `joined_sentences = tokenizer(sentence1, sentence2)`, since `joined_sentences` is used in the immediately following code snippet.
- `np.all(full_embeddings[alice_idx] == glove['alice'])` needs to be updated to `np.all(extended_embeddings[alice_idx] == glove['alice'])`, since `full_embeddings` is not defined anywhere.
- `train_labels = train_dataset['labels']` returned the error `KeyError: "Column labels not in the dataset. Current columns in the dataset: ['sentence', 'source']"`. I am not sure whether the code should be updated to `train_labels = train_dataset['source']`, or whether `source` needs to be converted to categorical labels so that `train_labels = train_dataset['labels']` can run without error.

The first two issues are restricted to the code printed in the book; everything is fine with the full script on GitHub. Issue four exists both in the book and in the full script on GitHub.
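The second point can be sketched as follows: a pretrained embedding matrix is *extended* with extra rows, and the check compares a row of the extended matrix against the original vector, which is why `extended_embeddings` (not the undefined `full_embeddings`) is the name the check needs. The shapes, values, and the variable `alice_vec` below are made up for illustration:

```python
import numpy as np

vocab_size, emb_dim = 5, 4
rng = np.random.default_rng(42)
pretrained = rng.normal(size=(vocab_size, emb_dim))  # stand-in for GloVe weights
alice_vec = rng.normal(size=emb_dim)                 # stand-in for glove['alice']

# append the new vector as an extra row; its row index becomes alice_idx
extended_embeddings = np.vstack([pretrained, alice_vec])
alice_idx = vocab_size

print(np.all(extended_embeddings[alice_idx] == alice_vec))  # True
```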
Hi @Patrick-Wen ,
Thank you for pointing those out.
Regarding item number 3, I was referring to the fact of "padding and truncating" because it is the same procedure even if we're using a different max length. I'm not a native speaker either, but the proofreader said it was fine, so I kept it like that :-)
Regarding item number 4, I've just run Chapter 11's notebook on Colab and I did not get that error. For some reason, it looks like the following code (from page 894) was not executed on your end:
```python
def is_alice_label(row):
    is_alice = int(row['source'] == 'alice28-1476.txt')
    return {'labels': is_alice}

dataset = dataset.map(is_alice_label)
```
The code creates the missing `labels` key in the dataset, and it should make everything work as intended. Hope it helps!
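As a stand-in sketch of what that `map` call does (plain Python lists here instead of a Hugging Face `datasets.Dataset`; the two example rows are invented): it adds a `labels` key to every row, after which reading `'labels'` no longer raises a `KeyError`.

```python
def is_alice_label(row):
    # rows coming from alice28-1476.txt get label 1, everything else 0
    is_alice = int(row['source'] == 'alice28-1476.txt')
    return {'labels': is_alice}

dataset = [
    {'sentence': 'Alice was beginning to get very tired.', 'source': 'alice28-1476.txt'},
    {'sentence': 'Toto played about the yard.', 'source': 'wizoz10-1740.txt'},
]
# mimic Dataset.map: merge each row with the dict returned by the function
dataset = [{**row, **is_alice_label(row)} for row in dataset]

train_labels = [row['labels'] for row in dataset]
print(train_labels)  # [1, 0]
```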
Best, Daniel
Hi Daniel,
My fault for missing the is_alice_label function. Thank you for the information.
Best regards, Patrick
Here is Eq 5.6 in the book:
It is stated that "C stands for the number of classes". I think Nc, which represents the number of cases in the cth class, should be replaced by C; Nc is simply irrelevant here, since the softmax is calculated per individual sample.
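If the equation reads as I suspect (this is a reconstruction from the surrounding discussion, not a copy of the book; the symbol z for the logits is assumed), the corrected version would sum over the C classes:

```latex
\operatorname{softmax}(z_c) \;=\; \frac{e^{z_c}}{\sum_{j=0}^{C-1} e^{z_j}},
\qquad c = 0, \dots, C-1
```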
Please correct me if I am wrong. Thank you.
Patrick Wen