RamoramaInteractive closed this issue 2 years ago
These warnings shouldn't affect the execution of the training script. And yes, depending on the amount of training data, the execution time you report is quite normal. If speed is of the essence, there are C++ implementations of BPE, e.g. fastBPE or YouTokenToMe.
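To illustrate why such warnings are harmless, here is a minimal sketch. It assumes the messages in question are Python `ResourceWarning`s about unclosed files, as the Stack Overflow link in the question suggests. The helper names (`count_resource_warnings`, `read_without_closing`, `read_and_close`, `make_sample_file`) are made up for this example and are not part of subword-nmt.

```python
import gc
import os
import tempfile
import warnings


def count_resource_warnings(action):
    """Run `action` and return how many ResourceWarnings were emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # ResourceWarning is ignored by default, so enable it explicitly.
        warnings.simplefilter("always", ResourceWarning)
        action()
        gc.collect()  # make sure dropped file objects are finalized
    return sum(issubclass(w.category, ResourceWarning) for w in caught)


def read_without_closing(path):
    # The file object is dropped without close(); on CPython this is what
    # produces "ResourceWarning: unclosed file ...". The data is still read.
    return open(path, encoding="utf-8").read()


def read_and_close(path):
    # The with-statement closes the file, so no warning is emitted.
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def make_sample_file():
    # Hypothetical sample data, only used to have something to read.
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write("low lower newest widest\n")
    return path
```

Both readers return the same text; only the first one triggers the warning. If the messages are just noise, they can be silenced with `warnings.simplefilter("ignore", ResourceWarning)` or by running Python with `-W ignore::ResourceWarning`.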
I have now added a progress bar (requires tqdm) so that there is some feedback on whether learn_bpe is still running.
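For readers curious what such a progress bar looks like, here is a rough sketch of a tqdm-wrapped merge loop. It is a toy BPE learner written for illustration, not the actual subword-nmt code; the function name `learn_toy_bpe` and its `show_progress` parameter are assumptions of this example. If tqdm is not installed, the sketch falls back to a plain loop.

```python
from collections import Counter

try:
    from tqdm import tqdm  # third-party: pip install tqdm
except ImportError:
    # Fallback so the sketch still runs without tqdm (no progress output).
    def tqdm(iterable, **kwargs):
        return iterable


def learn_toy_bpe(words, num_merges, show_progress=True):
    """Toy BPE: repeatedly merge the most frequent adjacent symbol pair."""
    vocab = Counter(tuple(word) for word in words)
    merges = []
    # One progress-bar tick per merge operation.
    for _ in tqdm(range(num_merges), desc="merges", disable=not show_progress):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break  # every word is a single symbol; nothing left to merge
        # Most frequent pair; ties are broken by the pair itself.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges
```

Calling `learn_toy_bpe(corpus_words, 1000)` on a list of words would show a bar that advances once per merge, which is enough to tell a slow run from a stalled one.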
I'm working on this Sockeye tutorial: https://awslabs.github.io/sockeye/tutorials/wmt.html
After running the preprocessing command, I have been getting this output for over 30 minutes.
Nothing has changed yet. According to https://stackoverflow.com/questions/60945317/python-selenium-resourcewarning-enable-tracemalloc-to-get-the-object-allocati it's just a debugging tool. Is it normal that preprocessing the WMT17 data takes this long?
I want to make sure that subword-nmt is working properly.