Closed Priyansi closed 3 years ago
@Priyansi Thank you! I left a few comments, mainly about idist,
and they may be out of scope here. Anyway, it looks good to me!
@KickItLikeShika, @Ishan-Kumar2 could you please review and approve this tutorial if it looks good to you? Thanks
Looks good, easy to understand!
A small suggestion: I think the 4 cells in Data Preprocessing could be combined into a Dataset class, right? Having that class is fairly standard in PyTorch code, so it may make it easier for existing PyTorch users to follow along.
Like this.
    class IMDBDataset(torch.utils.data.Dataset):
        def __init__(self, dataset):
            self.dataset =
            self.tokenizer =
            ...

        def __getitem__(self, idx):
            return self.tokenized_dataset[idx]

        def __len__(self):
            return len(self.dataset)
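For illustration only, the outline above could be filled in roughly as follows. This is a minimal sketch, not the notebook's actual code: `toy_tokenizer` is a hypothetical whitespace tokenizer standing in for whatever tokenizer the tutorial uses, the `"text"` and `"label"` keys and the `max_length` value are assumptions about the data, and the `object` fallback exists only so the snippet runs where PyTorch is not installed.

```python
try:
    from torch.utils.data import Dataset
except ImportError:
    Dataset = object  # fallback so the sketch still runs without PyTorch


def toy_tokenizer(text, max_length=8):
    """Hypothetical stand-in tokenizer: lowercase, split on whitespace, truncate, pad."""
    tokens = text.lower().split()[:max_length]
    return tokens + ["[PAD]"] * (max_length - len(tokens))


class IMDBDataset(Dataset):
    def __init__(self, dataset, tokenizer=toy_tokenizer):
        self.dataset = dataset
        self.tokenizer = tokenizer
        # Tokenize once up front so that __getitem__ is a plain lookup.
        # The "text" and "label" keys are assumed, not taken from the notebook.
        self.tokenized_dataset = [
            {"tokens": self.tokenizer(example["text"]), "label": example["label"]}
            for example in self.dataset
        ]

    def __getitem__(self, idx):
        return self.tokenized_dataset[idx]

    def __len__(self):
        return len(self.dataset)


# Tiny made-up samples, used only to exercise the class.
samples = [
    {"text": "A great movie", "label": 1},
    {"text": "Terrible plot and worse acting", "label": 0},
]
train_dataset = IMDBDataset(samples)
```

With PyTorch installed, an instance like `train_dataset` can be passed to `torch.utils.data.DataLoader`, although a real tokenizer would return tensors or token IDs rather than lists of strings.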
Let me know what you all think.
Edit: Also, the notebook can be re-run to fix the cell execution numbers.
Thanks for the suggestion @Ishan-Kumar2! I was following this Hugging Face tutorial for the reference code so that it's easier for readers to compare and contrast, since this is a beginner tutorial. But if you think it'd be better to combine the cells into a custom dataset, I'd be happy to change the code.
Also, let me re-run the notebook!
Resolves #15