Closed gonmelo closed 1 year ago
Exercise 4.5 - Suggest checking whether the 'mps' or 'cuda' backend is available and, if so, running on the GPU instead of the CPU.
torch.backends.mps.is_available()
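A minimal sketch of what that check could look like (the exact variable names and fallback order here are an assumption, not code from the notebook):

```python
import torch

# Prefer a CUDA GPU, then Apple Silicon (MPS), and fall back to CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

print(f"Running on: {device}")
```

Models and tensors would then be moved with `.to(device)` so the same notebook runs unchanged on any of the three backends.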
Let's change exercise 4.5 to model.prompt("The best part of traveling to Lisbon is", ...)
:)
There are some differences between the guide and the notebooks. Please check that the code is the same in both.
I have just a couple of comments about the tokenization section.
The answer to "why is tokenization important" doesn't mention that we need subword tokenization in order to represent any string with a finite vocabulary. It's basically impossible to build a reasonable language generation system without this. I could see the "official" answer being difficult to explain to students because it is not very specific.
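A toy illustration of the point (this is my own sketch, not the course's tokenizer): a word-level vocabulary fails on any unseen word, while a byte-level fallback covers every possible string with only 256 ids.

```python
# Hypothetical word-level vocabulary: fails on out-of-vocabulary words.
word_vocab = {"the": 0, "best": 1, "part": 2}

def word_tokenize(text):
    # Raises KeyError for any word not in the vocabulary.
    return [word_vocab[w] for w in text.split()]

def byte_tokenize(text):
    # 256 byte ids are enough to represent any string at all.
    return list(text.encode("utf-8"))

print(byte_tokenize("Lisboa"))          # always works, finite vocab
print(word_tokenize("the best part"))   # works only for known words
```

Real subword schemes (BPE, WordPiece) sit between these extremes, but the byte-level case is the cleanest way to show students why a finite vocabulary can still cover arbitrary input.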
Do we think it is necessary for the students to do stopword removal and stemming/lemmatization? These seem distracting, given that neither would be performed for an LM task.