Closed MBrouns closed 3 years ago
Transforming an empty string currently returns a vector of NaNs (see the example below). Not sure if this is the behaviour you want, but it can be slightly annoying because subsequent models can break on NaN input. I ran into this because lime perturbs the input text quite aggressively, which sometimes leaves an empty string.
>>> from whatlies.language import BytePairLanguage
>>> BytePairLanguage('nl').fit(['foo', 'bar']).transform([""])
array([[nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
        nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
        nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
        nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
        nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
        nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
        nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
        nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
        nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
        nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]], dtype=float32)
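As a possible stopgap on the caller's side (not something whatlies provides), blank inputs could be swapped for a placeholder token before they reach `transform`. The helper name `replace_blank_texts` and the placeholder `"unk"` are both hypothetical, and I have not checked that this token embeds to a finite vector for every language model:

```python
from typing import Iterable, List


def replace_blank_texts(texts: Iterable[str], placeholder: str = "unk") -> List[str]:
    """Swap empty or whitespace-only strings for a placeholder token.

    The placeholder is an assumption: pick any token that the embedding
    backend is known to map to a finite vector.
    """
    return [t if t.strip() else placeholder for t in texts]


# Perturbed inputs like the ones lime can produce, including blank strings.
cleaned = replace_blank_texts(["foo", "", "   "])
print(cleaned)  # ['foo', 'unk', 'unk']
```

The cleaned list can then be passed to `transform` in place of the raw texts. This only hides the problem for downstream models; it does not change what the embedding returns for an empty string.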