TencentGameMate / chinese_speech_pretrain

chinese speech pretrained models
997 stars 84 forks

Error #48

Closed ChengsongLu closed 6 months ago

ChengsongLu commented 6 months ago

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte

This happens when I call feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("hubert-large/chinese-hubert-large.pt")

ChengsongLu commented 6 months ago

Why does the HuBERT script use Wav2Vec2FeatureExtractor as well?

ChengsongLu commented 6 months ago

What is the difference between 'chinese-hubert-large-fairseq-ckpt.pt' and 'pytorch_model.bin' in chinese-hubert-large? Which model should I use?

LiuShixing commented 6 months ago

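The file with "fairseq" in its name is the model for the fairseq framework; the other one is for the transformers framework. Which one to use depends on which framework you are working with.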
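To make the distinction concrete, here is a minimal sketch, assuming the transformers-format files (pytorch_model.bin plus its config files) sit together in a local directory such as chinese-hubert-large. The helper names guess_framework and load_with_transformers are hypothetical and not part of this repository. The likely cause of the UnicodeDecodeError above is that from_pretrained was given a binary .pt checkpoint, which it tried to read as a text config file; it should be given the model directory instead. As for the feature extractor, HuBERT takes raw waveforms as input just like wav2vec 2.0, so transformers reuses Wav2Vec2FeatureExtractor for it.

```python
from pathlib import Path


def guess_framework(path: str) -> str:
    """Heuristically guess which framework a checkpoint path belongs to.

    Hypothetical helper: it only inspects the file name, not the contents.
    """
    p = Path(path)
    if "fairseq" in p.name.lower() or p.suffix == ".pt":
        return "fairseq"
    # A model directory, or a pytorch_model.bin inside one, is assumed to be
    # the transformers layout.
    return "transformers"


def load_with_transformers(model_dir: str):
    """Load the feature extractor and model from a transformers-format directory."""
    # Imported lazily so guess_framework works without transformers installed.
    from transformers import HubertModel, Wav2Vec2FeatureExtractor

    p = Path(model_dir)
    if p.is_file():
        # from_pretrained expects the directory, not a single weights file.
        p = p.parent
    feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(str(p))
    model = HubertModel.from_pretrained(str(p))
    return feature_extractor, model


if __name__ == "__main__":
    for candidate in (
        "chinese-hubert-large-fairseq-ckpt.pt",
        "chinese-hubert-large/pytorch_model.bin",
    ):
        print(candidate, "->", guess_framework(candidate))
```

If guess_framework returns "fairseq", the checkpoint should be loaded through fairseq's own checkpoint utilities rather than through transformers.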
