StyleText: UnicodeDecodeError: 'gbk' codec can't decode byte 0xa6 in position 64: illegal multibyte sequence #5785

monkeycc commented 2 years ago

Ran python tools/synth_dataset.py -c configs/dataset_config.yml with the default configuration, no changes made.

(PaddleOCR) PS E:\PaddleOCR\StyleText> python tools/synth_dataset.py -c configs/dataset_config.yml
[2022/03/26 00:08:15] srnet INFO: load pretrained model from style_text_models/bg_generator
[2022/03/26 00:08:17] srnet INFO: load pretrained model from style_text_models/text_generator
[2022/03/26 00:08:17] srnet INFO: load pretrained model from style_text_models/fusion_generator
[2022/03/26 00:08:17] srnet INFO: using FileCorpus
Traceback (most recent call last):
  File "tools/synth_dataset.py", line 31, in <module>
    synth_dataset()
  File "tools/synth_dataset.py", line 26, in synth_dataset
    dataset_synthesiser = DatasetSynthesiser()
  File "E:\PaddleOCR\StyleText\engine\synthesisers.py", line 58, in __init__
    self.style_sampler = style_samplers.DatasetSampler(self.config)
  File "E:\PaddleOCR\StyleText\engine\style_samplers.py", line 27, in __init__
    label_raw = f.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa6 in position 64: illegal multibyte sequence
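
The 'gbk' in the message suggests that the file was opened without an explicit encoding argument, so Python fell back to the Windows locale encoding. That is an inference from the traceback, not something confirmed in this thread. A minimal sketch for checking which encoding open() defaults to on a given machine (the helper name default_text_encoding is hypothetical):

```python
import locale


def default_text_encoding() -> str:
    """Return the locale encoding that open() falls back to when no
    encoding argument is passed (outside of Python's UTF-8 mode)."""
    return locale.getpreferredencoding(False)


if __name__ == "__main__":
    # On a Chinese-locale Windows install this is typically a GBK code page.
    print(default_text_encoding())
```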

pyramid20002000 commented 1 year ago

I ran into the same problem. Has anyone else encountered it?

digitalboy commented 1 year ago

Can anyone shed some light on this?

HappyBruce1 commented 1 year ago

I have the same problem. I hope it gets fixed soon.

HappyBruce1 commented 1 year ago

Forcing UTF-8 works, but the generated Chinese text has problems.
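
For readers wondering what forcing UTF-8 might look like: a minimal sketch, assuming the label file read in engine/style_samplers.py is UTF-8 encoded. The helper name read_label_file is hypothetical and not part of PaddleOCR; the only real API used is the encoding parameter of the built-in open().

```python
def read_label_file(path, encoding="utf-8"):
    """Read a text file with an explicit encoding instead of relying on
    the platform default (which is what produces the 'gbk' error here)."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()
```

This only removes the decode error at read time. It does not explain the problems with the generated Chinese text reported above, whose cause is not established in this thread.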
