yeyupiaoling / MASR

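A PyTorch-based automatic speech recognition framework for both streaming and non-streaming use, compatible with online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, along with multiple data augmentation methods.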
Apache License 2.0
572 stars 100 forks

Problem training the model #15

Closed zhaojunliing closed 2 years ago

zhaojunliing commented 3 years ago

I am using train.py for training, but I don't know the format of the files under dataset. Could you send me a copy?

yeyupiaoling commented 3 years ago

@zhaojunliing Please read the documentation; it explains this in detail.

zhaojunliing commented 3 years ago

root@fh-ai:/workspace/share/masr_yeyupiaoling# python train.py
----------- Configuration Arguments -----------
batch_size: 64
dev_manifest_path: dataset/manifest.dev
epochs: 200
learning_rate: 0.6
restore_model: None
save_model_path: save_model/
train_manifest_path: dataset/manifest.train
vocab_path: dataset/zh_vocab.json

Traceback (most recent call last):
  File "train.py", line 180, in <module>
    main()
  File "train.py", line 176, in main
    learning_rate=args.learning_rate)
  File "train.py", line 102, in train
    loss = ctcloss(out, y, out_lens, y_lens)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/warpctc_pytorch-0.1-py3.6-linux-x86_64.egg/warpctc_pytorch/__init__.py", line 82, in forward
    self.length_average, self.blank)
  File "/usr/local/lib/python3.6/dist-packages/warpctc_pytorch-0.1-py3.6-linux-x86_64.egg/warpctc_pytorch/__init__.py", line 21, in forward
    loss_func = warp_ctc.gpu_ctc if is_cuda else warp_ctc.cpu_ctc
AttributeError: module 'warpctc_pytorch' has no attribute 'gpu_ctc'

Could the maintainer take a look? I have been stuck at this point.

yeyupiaoling commented 3 years ago

@zhaojunliing Have you compiled CTC?
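The traceback shows the wrapper selecting `gpu_ctc` for CUDA tensors and `cpu_ctc` otherwise, so a build that lacks GPU support would fail exactly at that lookup. As a rough sketch (not part of MASR or warp-ctc), a helper like the hypothetical `missing_ctc_kernels` below could report which of those two names a compiled binding actually exposes. The stand-in objects are invented for illustration.

```python
from types import SimpleNamespace

# Kernel names taken from the traceback: the wrapper looks up `gpu_ctc`
# for CUDA tensors and `cpu_ctc` otherwise.
EXPECTED_KERNELS = ("cpu_ctc", "gpu_ctc")


def missing_ctc_kernels(binding, expected=EXPECTED_KERNELS):
    """Return the expected kernel names that `binding` does not expose."""
    return [name for name in expected if not hasattr(binding, name)]


# Hypothetical stand-ins for a CPU-only build and a build with GPU support.
cpu_only_build = SimpleNamespace(cpu_ctc=lambda *args: None)
full_build = SimpleNamespace(cpu_ctc=lambda *args: None, gpu_ctc=lambda *args: None)

print(missing_ctc_kernels(cpu_only_build))  # ['gpu_ctc']
print(missing_ctc_kernels(full_build))      # []
```

In a real environment one would pass the imported binding module instead of the stand-ins; a non-empty result would suggest the extension needs to be rebuilt with CUDA available.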

zhaojunliing commented 3 years ago

I compiled it inside a container, and it keeps failing with errors:

-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found suitable version "10.1", minimum required is "6.5")
-- cuda found TRUE
-- Building shared library with GPU support
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_curand_LIBRARY (ADVANCED)
    linked by target "test_gpu" in directory /workspace/share/warp-ctc

-- Configuring incomplete, errors occurred!
See also "/workspace/share/warp-ctc/build/CMakeFiles/CMakeOutput.log".
See also "/workspace/share/warp-ctc/build/CMakeFiles/CMakeError.log".

Error log:

Determining if the function pthread_create exists in the pthreads failed with the following output:
Change Dir: /workspace/share/warp-ctc/build/CMakeFiles/CMakeTmp

Run Build Command:"/usr/bin/make" "cmTC_b5bb1/fast"
/usr/bin/make -f CMakeFiles/cmTC_b5bb1.dir/build.make CMakeFiles/cmTC_b5bb1.dir/build
make[1]: Entering directory '/workspace/share/warp-ctc/build/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_b5bb1.dir/CheckFunctionExists.c.o
/usr/bin/cc -fPIC -DCHECK_FUNCTION_EXISTS=pthread_create -o CMakeFiles/cmTC_b5bb1.dir/CheckFunctionExists.c.o -c /usr/share/cmake-3.10/Modules/CheckFunctionExists.c
Linking C executable cmTC_b5bb1
/usr/bin/cmake -E cmake_link_script CMakeFiles/cmTC_b5bb1.dir/link.txt --verbose=1
/usr/bin/cc -fPIC -DCHECK_FUNCTION_EXISTS=pthread_create -rdynamic CMakeFiles/cmTC_b5bb1.dir/CheckFunctionExists.c.o -o cmTC_b5bb1 -lpthreads
/usr/bin/ld: cannot find -lpthreads
collect2: error: ld returned 1 exit status
CMakeFiles/cmTC_b5bb1.dir/build.make:97: recipe for target 'cmTC_b5bb1' failed
make[1]: [cmTC_b5bb1] Error 1
make[1]: Leaving directory '/workspace/share/warp-ctc/build/CMakeFiles/CMakeTmp'
Makefile:126: recipe for target 'cmTC_b5bb1/fast' failed
make: [cmTC_b5bb1/fast] Error 2

yeyupiaoling commented 3 years ago

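@zhaojunliing Have you installed CUDA and set up the environment variables? I can't tell from this output what the error is. Which version of make are you using?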

zhaojunliing commented 3 years ago

export CUDA_HOME=/usr/local/cuda-10.1/
export PATH=/usr/local/cuda-10.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export WARP_CTC_PATH="/workspace/share/warp-ctc/build"
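The configure step above stops on `CUDA_curand_LIBRARY` being NOTFOUND, while the `pthread_create` probe sits next to "-- Found Threads: TRUE" and so is probably not the blocker. The exports also point `CUDA_HOME` at /usr/local/cuda-10.1/ but `LD_LIBRARY_PATH` at /usr/local/cuda/lib64. A minimal sketch of two checks one might run, assuming the usual `lib64` layout under the CUDA root (the helper names are hypothetical, and this is not how CMake itself searches):

```python
import os
from pathlib import Path


def find_cuda_library(cuda_home, stem):
    """Return files under <cuda_home>/lib64 named lib<stem>.so* (assumed layout)."""
    lib_dir = Path(cuda_home) / "lib64"
    return sorted(str(path) for path in lib_dir.glob(f"lib{stem}.so*"))


def same_cuda_root(cuda_home, ld_library_path):
    """True if some LD_LIBRARY_PATH entry resolves to <cuda_home>/lib64."""
    target = os.path.realpath(os.path.join(cuda_home, "lib64"))
    entries = [entry for entry in ld_library_path.split(os.pathsep) if entry]
    return any(os.path.realpath(entry) == target for entry in entries)


if __name__ == "__main__":
    # Falls back to /usr/local/cuda when CUDA_HOME is unset (an assumption).
    cuda_home = os.environ.get("CUDA_HOME", "/usr/local/cuda")
    print("curand candidates:", find_cuda_library(cuda_home, "curand"))
    print("LD_LIBRARY_PATH matches CUDA_HOME:",
          same_cuda_root(cuda_home, os.environ.get("LD_LIBRARY_PATH", "")))
```

An empty candidate list would be consistent with an image that ships only the CUDA runtime without the development libraries, which may be why a `-devel` image behaves differently.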

root@fh-ai:/workspace/share/warp-ctc/build# cmake -version
cmake version 3.10.2

CMake suite maintained and supported by Kitware (kitware.com/cmake).

root@fh-ai:/workspace/share/warp-ctc/build# make -version
GNU Make 4.1
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

zhaojunliing commented 3 years ago

Following advice I found online, I even downgraded gcc to 4.8:

sudo apt install gcc-4.8 g++-4.8
cd /usr/bin
sudo mv gcc gcc_bak
sudo mv g++ g++_bak
sudo ln -s gcc-4.8 gcc
sudo ln -s g++-4.8 g++

yeyupiaoling commented 3 years ago

Did you follow these steps exactly?

[image]

export WARP_CTC_PATH="/workspace/share/warp-ctc/build"

Why are you setting it to this directory?

zhaojunliing commented 3 years ago

Solved. I pulled a different Docker image, pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel, and the environment installs successfully in it.

zhaojunliing commented 3 years ago

1it [00:00, 12.17it/s]
WARNING:root:NaN or Inf found in input tensor.
Epoch 199: Loss= inf, CER = 4.714285714285714
WARNING:root:NaN or Inf found in input tensor.
[200/200][0/1] Loss = inf Remain time: 0:00:00
decoding...
1it [00:00, 15.32it/s]
WARNING:root:NaN or Inf found in input tensor.
Epoch 200: Loss= inf, CER = 4.642857142857143

Why is the output showing NaN and inf?

yeyupiaoling commented 3 years ago

That's fine. It goes away as training continues.
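The thread does not establish why the loss is inf here. One well-known way for a CTC loss to become infinite is that the network emits fewer output frames than the transcript needs: CTC requires one frame per label plus a blank frame between identical adjacent labels. The sketch below illustrates only that length condition; the function names are hypothetical and it says nothing about what warp-ctc did in this particular run.

```python
def min_ctc_input_length(target):
    """Smallest number of output frames that can be aligned to `target`.

    One frame per label, plus one blank frame between each pair of
    identical adjacent labels.
    """
    repeats = sum(1 for a, b in zip(target, target[1:]) if a == b)
    return len(target) + repeats


def ctc_alignment_feasible(num_frames, target):
    """True if `num_frames` output frames are enough for `target`."""
    return num_frames >= min_ctc_input_length(target)


transcript = "科技股份有限公司"  # the transcript used later in this thread
print(min_ctc_input_length(transcript))       # 8
print(ctc_alignment_feasible(5, transcript))  # False
print(min_ctc_input_length([1, 1, 2]))        # 4
```

Comparing the model's output lengths (`out_lens` in the traceback) with this minimum for each transcript is one cheap thing to check before concluding the inf is harmless.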

zhaojunliing commented 3 years ago

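Hello, here is my use case: I recorded several copies of the same utterance and want the model to recognize it whenever it is spoken that way. During training, however, the loss went down at first but then kept growing.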

yeyupiaoling commented 3 years ago

Only a few samples? The dataset is probably too small.

zhaojunliing commented 3 years ago

Our requirement is to recognize these few utterances accurately, so I wanted to train on just them.

yeyupiaoling commented 3 years ago

Exactly how many samples? Did you print the batch size? You could ask several more people to record the same utterance.

zhaojunliing commented 3 years ago

For this test I recorded 5 audio clips of the same utterance, with batch_size set to 4.

/workspace/share/masr_yeyupiaoling/dataset/audio/002.wav 科技股份有限公司
/workspace/share/masr_yeyupiaoling/dataset/audio/003.wav 科技股份有限公司
/workspace/share/masr_yeyupiaoling/dataset/audio/004.wav 科技股份有限公司
/workspace/share/masr_yeyupiaoling/dataset/audio/005.wav 科技股份有限公司
/workspace/share/masr_yeyupiaoling/dataset/audio/006.wav 科技股份有限公司
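The entries above suggest a manifest with one sample per line: an audio path, then the transcript. Assuming the two fields are separated by whitespace (the original file may use a tab, which the thread does not show), a quick sanity check of such a file could look like this. `parse_manifest_line` and `transcript_counts` are hypothetical helpers, not functions from the MASR project.

```python
from collections import Counter


def parse_manifest_line(line):
    """Split one manifest line into (audio_path, transcript).

    Assumes the path contains no whitespace and is followed by the transcript.
    """
    audio_path, transcript = line.strip().split(maxsplit=1)
    return audio_path, transcript


def transcript_counts(lines):
    """Count how many samples share each transcript, skipping blank lines."""
    return Counter(parse_manifest_line(line)[1] for line in lines if line.strip())


manifest = [
    "/workspace/share/masr_yeyupiaoling/dataset/audio/002.wav 科技股份有限公司",
    "/workspace/share/masr_yeyupiaoling/dataset/audio/003.wav 科技股份有限公司",
]
print(parse_manifest_line(manifest[0]))
print(transcript_counts(manifest))
```

A count like this makes it obvious when a training set contains a single distinct transcript, which is the situation being discussed here.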

yeyupiaoling commented 3 years ago

That is far too little data. You would be better off building a wake-word detector instead.

zhaojunliing commented 3 years ago

OK, thanks for the guidance.

zhaojunliing commented 3 years ago

If I download the three public datasets and add my own recordings, the recognition rate on my own audio should improve, right?

yeyupiaoling commented 3 years ago

Yes.

zhaojunliing commented 3 years ago

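[image]

1. After training past 100, this is what it shows. Is that a problem?

2. Does this project have a tool that shows at what point during training the model's recognition rate peaks?

3. I saw the published model for the extra-large dataset (over 1,300 hours). Can I continue training on top of that model?

4. --model_path and --lm_path are the trained model and the language model. When using infer_server, do both need to be loaded? How are the two related?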
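On question 2, one low-tech option is to scan the training output for the per-epoch summary lines and pick the epoch with the lowest CER. The sketch below assumes the line format seen earlier in this thread ("Epoch 199: Loss= inf, CER = 4.714285714285714"); the regular expression and the `best_epoch` helper are illustrative, not part of the project.

```python
import re

# Matches summary lines such as: "Epoch 199: Loss= inf, CER = 4.714285714285714"
EPOCH_LINE = re.compile(r"Epoch\s+(\d+):\s*Loss\s*=\s*(\S+),\s*CER\s*=\s*([0-9.]+)")


def best_epoch(log_text):
    """Return (epoch, cer) for the lowest CER in the log, or None if no match."""
    records = [(int(epoch), float(cer)) for epoch, _loss, cer in EPOCH_LINE.findall(log_text)]
    if not records:
        return None
    return min(records, key=lambda record: record[1])


log = """
Epoch 199: Loss= inf, CER = 4.714285714285714
[200/200][0/1] Loss = inf Remain time: 0:00:00
Epoch 200: Loss= inf, CER = 4.642857142857143
"""
print(best_epoch(log))  # (200, 4.642857142857143)
```

Saving the training output to a file and running it through a function like this would show which saved checkpoint corresponds to the best dev-set CER.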

yeyupiaoling commented 3 years ago

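@zhaojunliing
1. You can ignore the inf; just keep training.
2. This project has no visualization. Another project of mine does: https://github.com/yeyupiaoling/PaddlePaddle-DeepSpeech
3. Yes, you can. Just make sure to set the parameters.
4. One is the network model, and the other is the language model used for decoding. The language model is only needed when beam search decoding is used.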
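To make point 4 concrete: decoding a CTC network's output without a language model is usually done greedily, by taking the most likely label at each frame, collapsing repeats, and dropping blanks. The following is a generic sketch of that standard technique, not MASR's own decoder; the blank index of 0 is an assumption.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse consecutive repeated labels, then drop blanks.

    `frame_labels` holds the per-frame argmax label indices;
    `blank` is the index of the CTC blank symbol (assumed to be 0 here).
    """
    decoded = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# The blank between the two 1s keeps them as separate output labels.
print(ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0]))  # [1, 1, 2]
```

Beam search instead keeps several candidate transcripts per frame and can rescore them with a language model, which is where a file passed via `--lm_path` would come in.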

wl-junlin commented 3 years ago

May I ask why the inf can be ignored? When I first started training, I sometimes got inf, and at other times the loss kept fluctuating between about 6 and over 1,000, so I had to stop training. The problem only went away after I loaded the parameters of the model you provided.

yeyupiaoling commented 3 years ago

@wl-junlin In that case, try lowering the learning rate a bit.
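As one possible reading of that advice, the learning rate could be cut whenever the loss turns non-finite or rises from one epoch to the next. The rule below is a hypothetical sketch: the halving factor and the floor are arbitrary choices, and the starting value 0.6 is the `learning_rate` printed in the configuration at the top of this thread.

```python
import math


def adjust_learning_rate(lr, previous_loss, current_loss, factor=0.5, min_lr=1e-6):
    """Shrink `lr` when the loss is non-finite or has gone up; otherwise keep it."""
    diverging = (not math.isfinite(current_loss)) or current_loss > previous_loss
    return max(lr * factor, min_lr) if diverging else lr


lr = 0.6
lr = adjust_learning_rate(lr, previous_loss=6.0, current_loss=float("inf"))
print(lr)  # 0.3
lr = adjust_learning_rate(lr, previous_loss=6.0, current_loss=5.0)
print(lr)  # 0.3 (unchanged: the loss is finite and decreasing)
```

In a PyTorch training loop the returned value would be written back into the optimizer's parameter groups; starting from a pretrained checkpoint, as described above, is another way to avoid the unstable early phase.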