Open achen353 opened 3 years ago
@achen353 Hi Andrew, thanks for your interest! I'll be instrumenting the full training script this weekend :) stay tuned..
@lucidrains Thanks Phil, that would be great. My GPU doesn't have PyTorch support yet, so I'm working on implementing the model in TF. One idea I had was to create a fixed-length batch, fill the unused tail of each sequence with 0, and make sure the corresponding mask positions are "False". This would require bumping every encoding value up by one so that the value 0 doesn't carry any token information.
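A minimal sketch of that idea, framework-agnostic in NumPy (the helper name `pad_batch` is made up for illustration, not from any library): pad each sequence to the batch's longest length with 0 and build a boolean mask that is True on real tokens, False on padding.

```python
import numpy as np

def pad_batch(sequences, pad_value=0):
    # Hypothetical helper: stack variable-length token sequences into one
    # fixed-shape array, filling the unused tail of each row with pad_value.
    max_len = max(len(s) for s in sequences)
    batch = np.full((len(sequences), max_len), pad_value, dtype=np.int64)
    mask = np.zeros((len(sequences), max_len), dtype=bool)
    for i, seq in enumerate(sequences):
        batch[i, : len(seq)] = seq   # copy the real tokens
        mask[i, : len(seq)] = True   # mark them as non-padding
    return batch, mask

batch, mask = pad_batch([[5, 3, 9], [7, 2]])
# batch → [[5, 3, 9], [7, 2, 0]]
# mask  → [[True, True, True], [True, True, False]]
```

The mask is then what the attention layers consume so padded positions are ignored.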
Yup, that is the right way to go about it! I'll soon create all the tools needed for training end to end without much coding.
Edit: And yup, you need to reserve 0 for padding and 1 for
How do you go about stacking tokenized texts of unequal length into batches for training?