Can you please tell me what I am doing wrong?
I trained the model on a raw Russian text file (> 4 MB), formatted like the scotus file:
marfo4ka43: Ты кто вообще?)
kyindarkkk: чел с соседнего офиса с дредами)
marfo4ka43: ахаха
kyindarkkk: xDD
marfo4ka43: мы на обеде до столовки ходим, так себе прогулка
But in the end, when I run chatbot.py, it only returns spaces, digits and English characters:
привет алешка
kyindarkkk: 5
погода сегодня так себе
kyindarkkk:
ты думаешь?
kyindarkkk:
кто то просто нас не понял
kyindarkkk: ? 3Fllus s GO 1738
The question is: how do I make it train on and output Russian characters too?
By default, open() uses locale.getpreferredencoding(False), so on a system whose locale encoding is not UTF-8 the Russian text gets decoded incorrectly. You need to set the encoding explicitly when you open the file at utils:106:
io.open(input_file, mode='rt', encoding='utf-8')
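For illustration, here is a minimal sketch of that fix in isolation. It assumes the training file is saved as UTF-8; the helper name read_corpus, the temporary path and the sample text are hypothetical and not taken from the project's utils module.

```python
import io
import os
import tempfile


def read_corpus(input_file):
    # Hypothetical helper: pass the encoding explicitly instead of relying
    # on the locale default that open() would otherwise use.
    with io.open(input_file, mode='rt', encoding='utf-8') as f:
        return f.read()


# Sample lines in the same "name: message" format as the training file.
sample = "marfo4ka43: Ты кто вообще?)\nkyindarkkk: xDD\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "input.txt")
    # Write the sample as UTF-8 so the round trip does not depend on the locale.
    with io.open(path, mode='wt', encoding='utf-8') as f:
        f.write(sample)
    text = read_corpus(path)

# True if at least one character falls in the basic Cyrillic block.
has_cyrillic = any('\u0400' <= ch <= '\u04ff' for ch in text)
print(has_cyrillic)
```

If the file was saved in a different encoding (for example cp1251), the encoding argument would have to match it. After changing how the file is read, the model most likely needs to be retrained so that its vocabulary includes the Cyrillic characters.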