Closed gopesh97 closed 3 years ago
I am facing similar issues with Unicode. Can you share some example lines from your manifest? One possibility is that you need to pass ensure_ascii=False to json.dump when generating the manifests: json.dump(metadata, f, ensure_ascii=False)
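To make the ensure_ascii suggestion concrete, here is a minimal sketch of writing a line-delimited manifest. The field names (audio_filepath, duration, text) follow the usual NeMo ASR manifest layout, but the file path, sample entry, and the write_manifest helper are hypothetical and only for illustration.

```python
import json
import os
import tempfile


def write_manifest(entries, path):
    """Write one JSON object per line, keeping non-ASCII text unescaped."""
    # Open the file explicitly as UTF-8 so the result does not depend on
    # the locale of the machine or container.
    with open(path, "w", encoding="utf-8") as f:
        for entry in entries:
            # ensure_ascii=False writes Devanagari characters as-is
            # instead of \uXXXX escape sequences.
            json.dump(entry, f, ensure_ascii=False)
            f.write("\n")


# Hypothetical sample entry; replace with your own data.
entries = [
    {"audio_filepath": "clips/sample_0001.wav", "duration": 2.5, "text": "नमस्ते"},
]

manifest_path = os.path.join(tempfile.mkdtemp(), "train_manifest.json")
write_manifest(entries, manifest_path)

# Read the first line back to confirm the text survived unescaped.
with open(manifest_path, encoding="utf-8") as f:
    first_line = f.readline()
```

Passing encoding="utf-8" to open matters as much as ensure_ascii=False here: without it, Python falls back to the locale's preferred encoding, which may be ASCII inside some containers.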
@gopesh97 have you had any success with Hindi? I'm working on Urdu and am curious whether this NeMo model works for right-to-left languages.
Ensure that you have the correct language settings in your environment (native or in a container). Most ASCII encoding issues can be solved by setting the LANG and LC_ALL environment variables to a UTF-8 default.
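One way to check what the environment is actually using is to inspect it from Python before training. This is a rough sketch; the helper names describe_text_encoding and can_encode are made up for this example and are not part of NeMo.

```python
import locale
import os
import sys


def describe_text_encoding():
    """Collect the settings that decide how Python encodes text by default."""
    return {
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "preferred_encoding": locale.getpreferredencoding(False),
        "stdout_encoding": sys.stdout.encoding,
        "filesystem_encoding": sys.getfilesystemencoding(),
    }


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


info = describe_text_encoding()
for key, value in info.items():
    print(f"{key}: {value}")

# '\u091b' is the Devanagari character from the error message in this issue.
sample = "\u091b"
print("ascii ok:", can_encode(sample, "ascii"))
print("utf-8 ok:", can_encode(sample, "utf-8"))
```

If preferred_encoding reports an ASCII codec, setting LANG and LC_ALL to a UTF-8 locale that exists on your system (for example C.UTF-8 or en_US.UTF-8, depending on the image) before starting Python is the usual fix.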
I am using the pre-trained QuartzNet 15x5 model for transfer learning on Hindi, with a different vocabulary (Devanagari characters).
While training I am facing mainly 2 issues:
When calling quartznet.save_to('path/to/save'), I get: 'ascii' codec can't encode character '\u091b' in position 4407: ordinal not in range(128)
Please provide a solution to the above-mentioned issues.
Environment overview
Additional context
GPU model: 4 x RTX 2080 Ti, 12 GB VRAM.