Closed andreaschandra closed 3 years ago
When using emoji_to_word, the converted emoji names are not separated by whitespace, so consecutive emoji fuse into one long token.
input
"utk besok komen disini ya.. 😊😊😊😊🙏🙏🙏 https://t.co/nxeojVug3z"
output
"utk besok komen disini ya.. smiling_face_with_smiling_eyessmiling_face_with_smiling_eyessmiling_face_with_smiling_eyessmiling_face_with_smiling_eyesfolded_handsfolded_handsfolded_hands https://t.co/nxeojVug3z"
expected output
"utk besok komen disini ya.. smiling_face_with_smiling_eyes smiling_face_with_smiling_eyes smiling_face_with_smiling_eyes smiling_face_with_smiling_eyes folded_hands folded_hands folded_hands https://t.co/nxeojVug3z"
Each converted emoji should be separated by whitespace so that the emoji words can be tokenized.
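One possible way to get the expected output is to pad every converted emoji name with spaces and then collapse the repeated whitespace. The snippet below is only a minimal sketch: the EMOJI_NAMES table and the emoji_to_word_spaced helper are hypothetical names made up for illustration, and this is not necessarily how maleo implements emoji_to_word.

```python
import re

# Hypothetical lookup table for illustration only; a real implementation
# would rely on a complete emoji database instead of two entries.
EMOJI_NAMES = {
    "\U0001F60A": "smiling_face_with_smiling_eyes",  # 😊
    "\U0001F64F": "folded_hands",                    # 🙏
}

# One alternation pattern that matches any emoji in the table.
_EMOJI_PATTERN = re.compile("|".join(map(re.escape, EMOJI_NAMES)))


def emoji_to_word_spaced(text: str) -> str:
    """Replace each known emoji with its name, separated by single spaces."""
    # Pad every replacement with spaces so adjacent emoji cannot fuse.
    padded = _EMOJI_PATTERN.sub(lambda m: f" {EMOJI_NAMES[m.group(0)]} ", text)
    # Collapse the extra whitespace introduced by the padding.
    return re.sub(r"\s+", " ", padded).strip()


if __name__ == "__main__":
    print(emoji_to_word_spaced("utk besok komen disini ya.. 😊😊😊😊🙏🙏🙏 https://t.co/nxeojVug3z"))
```

Note that the final whitespace collapse also normalizes any other repeated spaces, tabs, or newlines in the text; if those need to be preserved, the padding would have to be added only between adjacent emoji instead.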
DONE https://github.com/jakartaresearch/maleo/commit/085027a365fd5eabbbe9570fa1cdccde47141231