huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

How to tokenize words into characters #3220

Closed ynebula closed 4 years ago

ynebula commented 4 years ago

I am studying machine reading comprehension with XLM-RoBERTa. My dataset is KorQuAD.

I need to tokenize every word into individual characters. For example, in English: This is a dog -> _T h i s _i s _a _d o g

Please let me know how to do this.
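
For reference, the splitting described above can be sketched in plain Python, independent of any tokenizer library. This is only an illustration of the requested output format, not the library's own method: `char_tokenize` is a hypothetical helper name, and the `_` marker is copied from the example in this issue (SentencePiece-based tokenizers use the character "▁", U+2581, as their word-boundary marker, so that may be the better choice when working with XLM-RoBERTa). The sketch does not check whether each resulting piece exists in the model's vocabulary.

```python
def char_tokenize(text, marker="_"):
    """Split text into single characters, prefixing the first
    character of each whitespace-separated word with `marker`.

    Hypothetical helper for illustration; not part of transformers.
    """
    tokens = []
    for word in text.split():
        # The first character of a word carries the word-boundary marker.
        tokens.append(marker + word[0])
        # Every remaining character becomes its own token.
        tokens.extend(word[1:])
    return tokens


tokens = char_tokenize("This is a dog")
print(" ".join(tokens))  # _T h i s _i s _a _d o g
```

Passing `marker="▁"` gives the same split with the SentencePiece-style boundary character. The function splits on whitespace only, so it applies to Korean text in the same way, producing one token per syllable block.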

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.