bootphon / wordseg

A Python toolbox for text based word segmentation
https://docs.cognitive-ml.fr/wordseg
GNU General Public License v3.0

Add a line of code to "prepare" that removes empty lines, to prevent a gold/segmented mismatch? #19

Closed alecristia closed 6 years ago

alecristia commented 6 years ago

Since the segmentation process (correctly) drops empty lines from consideration, it may make sense to remove empty lines at the "prepare" stage as well, so that the gold and prepared texts keep the same number of lines. This would avoid: fatal error: gold and train have different size: len(gold)=310, len(train)=309
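
To illustrate the kind of filtering meant here, a minimal sketch in plain Python follows. It is not wordseg's actual code: `drop_empty_lines` is a hypothetical helper name, and the sample utterances are made up. The idea is only that blank or whitespace-only lines are dropped before anything else reads the text, so every later stage sees the same number of lines.

```python
def drop_empty_lines(lines):
    """Return the input lines with empty or whitespace-only lines removed.

    Hypothetical helper for illustration; not part of the wordseg API.
    """
    return [line for line in lines if line.strip()]


# Made-up input: two real utterances, one empty line, one whitespace-only line.
raw = ["first utterance", "", "second utterance", "   "]
cleaned = drop_empty_lines(raw)

# Both the gold and the prepared text would then be derived from `cleaned`,
# so their lengths cannot drift apart because of blank lines.
print(len(raw), len(cleaned))
```

Applying the same filter once, up front, is a design choice: it keeps the line counts of the derived texts aligned without each later step having to handle blank lines on its own.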