cbaziotis / ekphrasis

Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from 2 big corpora (English Wikipedia and Twitter, 330 million English tweets).
MIT License
660 stars 91 forks

she's --> she ' s #8

Open CarloSegat opened 5 years ago

CarloSegat commented 5 years ago

Running the text_processor example in the README with the input "she's" returns ["she", "'", "s"]. Is this expected behavior? I'd have thought "she", "is".

cbaziotis commented 5 years ago

Have you set unpack_contractions=True?

DebayanChakraborty commented 4 years ago

I have the same issue. I have set unpack_contractions to True, but I am still facing it.

AzharSultan commented 2 years ago

I think this is happening because of the difference between ' and ’. Right now unpack_contractions only checks for '.
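
A possible workaround, sketched under the assumption that the input contains a typographic apostrophe (such as ’, U+2019) rather than the ASCII one: map the variants to ' before handing the text to the text processor. The helper name normalize_apostrophes and the list of variants are hypothetical and not part of ekphrasis.

```python
# Hypothetical helper, not part of ekphrasis: maps apostrophe look-alikes
# to the ASCII apostrophe so that rules matching ' can see them.
APOSTROPHE_VARIANTS = {
    "\u2019": "'",  # right single quotation mark (the usual "curly" apostrophe)
    "\u2018": "'",  # left single quotation mark
    "\u02bc": "'",  # modifier letter apostrophe
}

# Build the translation table once and reuse it for every call.
_APOSTROPHE_TABLE = str.maketrans(APOSTROPHE_VARIANTS)


def normalize_apostrophes(text: str) -> str:
    """Return text with apostrophe variants replaced by the ASCII apostrophe."""
    return text.translate(_APOSTROPHE_TABLE)


if __name__ == "__main__":
    raw = "she\u2019s here"
    cleaned = normalize_apostrophes(raw)
    print(cleaned)
    # The cleaned string would then be passed to the text processor
    # in place of the raw input.
```

Whether "she's" is then expanded still depends on the rules inside unpack_contractions; this only rules out the apostrophe character as the cause.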