Closed: akolonin closed this issue 5 years ago
After the fix, the data in http://langlearn.singularitynet.io/data/cleaned/English/Gutenberg-Children-Books/MSL5-25-2019JUN19/ needs to be regenerated.
Examples of files with blank lines:
http://langlearn.singularitynet.io/data/cleaned/English/Gutenberg-Children-Books/MSL5-25-2019JUN19/cleaned-MSL5-2019JUN19/11-0.txt
http://langlearn.singularitynet.io/data/cleaned/English/Gutenberg-Children-Books/MSL5-25-2019JUN19/cleaned-MSL10-2019JUN19/11-0.txt
http://langlearn.singularitynet.io/data/cleaned/English/Gutenberg-Children-Books/MSL5-25-2019JUN19/cleaned-MSL10-2019JUN19/12-0.txt
The issue is caused by lines that contain only characters removed by the pre-cleaner: once those characters are stripped, an empty line is left in the output. E.g. " "
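To illustrate the cause described above, here is a minimal Python sketch of one way to avoid emitting such lines. It is not the code from the pre-cleaner or from the fix itself: `REMOVED_CHARS` and `clean_lines` are hypothetical names, and the set of removed characters is an assumption chosen only for the example. The sketch also drops lines that were already blank in the input, which may or may not match the intended behaviour.

```python
import re

# Hypothetical stand-in for the pre-cleaner's removal rules (assumption:
# the real pre-cleaner removes a different, configurable set of characters).
REMOVED_CHARS = re.compile(r"[\*_#\-=~]")


def clean_lines(lines):
    """Yield cleaned lines, skipping any line that is empty after cleaning."""
    for line in lines:
        cleaned = REMOVED_CHARS.sub("", line).strip()
        # A line made up only of removed characters (plus whitespace)
        # becomes empty here, so it is skipped instead of written out.
        if cleaned:
            yield cleaned


sample = [
    "Alice was beginning to get very tired.",
    "* * * * *",  # consists only of removed characters and spaces
    "",           # already blank in the input
    "So she was considering.",
]
result = list(clean_lines(sample))
```

With this sample input, `result` contains only the two sentences; the separator line and the blank line are dropped rather than written out as empty lines.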
Fixed in https://github.com/singnet/language-learning/pull/239
Data regenerated and uploaded to http://langlearn.singularitynet.io/data/cleaned/English/Gutenberg-Children-Books/