A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
OS: Linux
Python version: 3.7
Package versions: pytorch==1.13.1, modelscope==1.7.1, funasr==0.6.5

Problem description:

finetune.py:

    import os

    from modelscope.metainfo import Trainers
    from modelscope.trainers import build_trainer

    from funasr.datasets.ms_dataset import MsDataset
    from funasr.utils.modelscope_param import modelscope_args


    def modelscope_finetune(params):
        if not os.path.exists(params.output_dir):
            os.makedirs(params.output_dir, exist_ok=True)
        # dataset split ["train", "validation"]


    if __name__ == '__main__':
        params = modelscope_args(model="damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch", data_path="./data")
        params.output_dir = "./checkpoint"  # model save path
        params.data_path = "speech_asr/speech_asr_aishell1_trainsets"  # data path
        params.dataset_type = "small"  # use "small" for small datasets; use "large" for more than 1000 hours of data
        # batch size: with dataset_type="small" the unit of batch_bins is fbank feature frames,
        # with dataset_type="large" the unit is milliseconds
        params.batch_bins = 2000
        params.max_epoch = 50  # maximum number of training epochs
        params.lr = 0.00005  # learning rate
Running `python finetune.py` to start training prints:
......
颡', '碛', '互', '奈', '憎', '炉', '蹰', '聂', '员', '呶', '瞬', 'il@@', '恿', '阙', '卡', 'mi@@', '禾', '椴', 'yo@@', '帀', '醵', '帝', '隔', '忒', '哑', '効', '楗', '鼱', '塽', '苴', '蜞', '健', '醅', 'ju@@', '新', '程', '茗', '琰', '几', '揍', '匍', '砣', '禳', '罗', '勿', '擗', '畛', '框', '泒', '析', '沢', '偷', '繁', '嗣', '呵', '念', 'so@@', '溷', '曩', 'spon@@', '狼', '倔', '威', '潭', '踯', '晁', '吩', '袅', '喀', '洌', '炯', '纸', '抽', '簧', 'c@@', '买', '吖', '俬', '梓', '叡', '祼', '烃', '荃', '眀', ''], token_type='char', train_data_file='/nfs/dataset/data/dump/fbank/train/ark_txt.scp', train_data_path_and_name_and_type=[['./checkpoint/data/train/wav.scp', 'speech', 'sound'], ['./checkpoint/data/train/text', 'text', 'text']], train_dtype='float32', train_set='train', train_shape_file=['./checkpoint/data/train/speech_shape'], unused_parameters=True, use_amp=False, use_pai=False, use_preprocessor=True, use_tensorboard=True, use_wandb=False, val_scheduler_criterion=['valid', 'acc'], valid_batch_bins=None, valid_batch_size=None, valid_batch_type=None, valid_data_file='/nfs/dataset/data/dump/fbank/dev/ark_txt.scp', valid_data_path_and_name_and_type=[['./checkpoint/data/validation/wav.scp', 'speech', 'sound'], ['./checkpoint/data/validation/text', 'text', 'text']], valid_max_cache_size=None, valid_shape_file=['./checkpoint/data/validation/speech_shape'], wandb_entity=None, wandb_id=None, wandb_model_log_interval=-1, wandb_name=None, wandb_project=None, write_collected_feats=False)
[e0cbe201d3f0] 2023-07-10 09:00:01,917 (abs_task:1352) INFO: Loading pretrained params from /mnt/workspace/.cache/modelscope/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.pb
[e0cbe201d3f0] 2023-07-10 09:00:03,945 (abs_task:1643) INFO: [train] dataset:
ESPnetDataset(
speech: {"path": "./checkpoint/data/train/wav.scp", "type": "sound"}
text: {"path": "./checkpoint/data/train/text", "type": "text"}
preprocess: <funasr.datasets.preprocessor.CommonPreprocessor object at 0x7f0231154090>)
[e0cbe201d3f0] 2023-07-10 09:00:03,945 (abs_task:1644) INFO: [train] Batch sampler: LengthBatchSampler(N-batch=4464, batch_bins=2000, sort_in_batch=descending, sort_batch=descending)
[e0cbe201d3f0] 2023-07-10 09:00:03,946 (abs_task:1646) INFO: [train] mini-batch sizes summary: N-batch=4464, mean=26.9, min=4, max=67
[e0cbe201d3f0] 2023-07-10 09:00:04,505 (abs_task:1643) INFO: [valid] dataset:
ESPnetDataset(
speech: {"path": "./checkpoint/data/validation/wav.scp", "type": "sound"}
text: {"path": "./checkpoint/data/validation/text", "type": "text"}
preprocess: <funasr.datasets.preprocessor.CommonPreprocessor object at 0x7f022c6fd1d0>)
[e0cbe201d3f0] 2023-07-10 09:00:04,505 (abs_task:1644) INFO: [valid] Batch sampler: LengthBatchSampler(N-batch=536, batch_bins=2000, sort_in_batch=descending, sort_batch=descending)
[e0cbe201d3f0] 2023-07-10 09:00:04,505 (abs_task:1646) INFO: [valid] mini-batch sizes summary: N-batch=536, mean=26.7, min=10, max=55
[libprotobuf FATAL google/protobuf/stubs/common.cc:83] This program was compiled against version 3.9.2 of the Protocol Buffer runtime library, which is not compatible with the installed version (3.20.3). Contact the program author for an update. If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library. (Version verification failed in "bazel-out/k8-opt/bin/tensorflow/core/framework/tensor_shape.pb.cc".)
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): This program was compiled against version 3.9.2 of the Protocol Buffer runtime library, which is not compatible with the installed version (3.20.3). Contact the program author for an update. If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library. (Version verification failed in "bazel-out/k8-opt/bin/tensorflow/core/framework/tensor_shape.pb.cc".)
Aborted (core dumped)
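
For triage, the two version numbers in the fatal message can be pulled out and compared with what is actually installed. The snippet below is only a minimal diagnostic sketch, not a fix: `parse_protobuf_mismatch` and `installed_version` are hypothetical helper names, and the package names checked (`protobuf`, `tensorflow`, `tensorboard`) are assumptions based on the `tensorflow/core/framework/tensor_shape.pb.cc` path in the error and the `use_tensorboard=True` setting in the log.

```python
import re

try:  # Python 3.8+
    from importlib.metadata import version, PackageNotFoundError
except ImportError:  # Python 3.7: assumes the importlib_metadata backport is installed
    from importlib_metadata import version, PackageNotFoundError

# Matches the wording of the libprotobuf version-check message shown in the log.
_MISMATCH = re.compile(
    r"compiled against version ([\d.]+) of the Protocol Buffer runtime library, "
    r"which is not compatible with the installed version \(([\d.]+)\)"
)


def parse_protobuf_mismatch(log_text):
    """Return (compiled_against, installed) from the fatal message, or None."""
    match = _MISMATCH.search(log_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names are assumptions; adjust to the environment being inspected.
    for name in ("protobuf", "tensorflow", "tensorboard"):
        print(name, installed_version(name))
```

Feeding the `[libprotobuf FATAL ...]` line from the log into `parse_protobuf_mismatch` gives the compiled-against and installed versions as a pair, which can then be set against the printed package versions.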