Closed ynebula closed 4 years ago
I am studying machine reading comprehension with XLM-RoBERTa. My data is KorQuAD.
I need to tokenize every word into characters. For example, in English: This is a dog -> _T h i s _i s _a _d o g
Please let me know how to do this.
Please use --model_type=char when training the SentencePiece (spm) model.
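To make the requested output format concrete, here is a minimal pure-Python sketch of the character-level split from the question. It is an illustration only, not SentencePiece's implementation: the helper name to_char_pieces and the plain "_" marker are assumptions taken from the example in the question, and the pieces produced by a model actually trained with --model_type=char may be delimited differently.

```python
from typing import List


def to_char_pieces(text: str, marker: str = "_") -> List[str]:
    """Split text into single characters, prefixing the first
    character of each whitespace-separated word with `marker`.

    Hypothetical helper for illustration; it does not call SentencePiece.
    """
    pieces: List[str] = []
    for word in text.split():          # split() never yields empty words
        chars = list(word)             # one piece per Unicode code point
        chars[0] = marker + chars[0]   # mark the start of the word
        pieces.extend(chars)
    return pieces


if __name__ == "__main__":
    print(" ".join(to_char_pieces("This is a dog")))
    # -> _T h i s _i s _a _d o g
```

The same function works on Korean text such as KorQuAD passages, since each precomposed Hangul syllable is a single code point and becomes one piece.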