bsolomon1124 / demoji

Accurately find/replace/remove emojis in text strings
https://pypi.org/project/demoji/
Apache License 2.0

replace() leaves unicode variation selector-16 #25

Open ogslumber opened 3 years ago


Describe the bug

The replace() function leaves the Unicode variation selector-16 (U+FE0F, UTF-8 bytes \xef\xb8\x8f) behind when replacing the Repeat Button emoji (🔁️).

To Reproduce

import demoji

sample_var = '🔁️ sample text'
print(sample_var.encode('utf-8'))
# prints: b'\xf0\x9f\x94\x81\xef\xb8\x8f sample text'

sample_var = demoji.replace(sample_var)
print(sample_var.encode('utf-8'))
# prints: b'\xef\xb8\x8f sample text'

Expected behavior

A string without the \xef\xb8\x8f sequence:

b' sample text'
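
Possible workaround

Until replace() handles this itself, one option is to strip any leftover U+FE0F from the result. This is only a minimal sketch: strip_vs16 is a hypothetical helper, not part of demoji, and it assumes that dropping every variation selector-16 in the text is acceptable for your data.

```python
VS16 = "\ufe0f"  # VARIATION SELECTOR-16, UTF-8 bytes EF B8 8F


def strip_vs16(text: str) -> str:
    """Hypothetical helper (not part of demoji): drop every U+FE0F from text."""
    return text.replace(VS16, "")


# The string that demoji.replace() returned in the report above
leftover = "\ufe0f sample text"
cleaned = strip_vs16(leftover)
print(cleaned.encode("utf-8"))  # b' sample text'
```

In practice this would be applied to the return value, e.g. strip_vs16(demoji.replace(sample_var)).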