HaujetZhao / CapsWriter-Offline

The offline edition of CapsWriter, a handy voice input tool for PC
2.43k stars 190 forks

core_server.py hangs indefinitely at "Loading punctuation model" (标点模型载入中) on an Alibaba Cloud ECS instance running Ubuntu 20.04 x64 #89

Closed gongqf closed 4 months ago

gongqf commented 4 months ago

Environment: Python 3.8; all dependencies confirmed installed; both model files are present in the models directory; no NVIDIA GPU and no CUDA; 1 CPU core, 2 GB RAM; Ubuntu 20.04 x64.

Installation steps:

1. I created a venv and started installing the server dependencies. Building the wheel for jieba failed:

Building wheels for collected packages: jieba
  Building wheel for jieba (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: /home/venv/CapsWriter-Offline/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-q1hxpgpy/jieba/setup.py'"'"'; __file__='"'"'/tmp/pip-install-q1hxpgpy/jieba/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-ihcffozs
       cwd: /tmp/pip-install-q1hxpgpy/jieba/
  Complete output (6 lines):
  usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
     or: setup.py --help [cmd1 cmd2 ...]
     or: setup.py --help-commands
     or: setup.py cmd --help

  error: invalid command 'bdist_wheel'
  ERROR: Failed building wheel for jieba
  Running setup.py clean for jieba
Failed to build jieba

Later in the same output, however, pip fell back to a plain install and reported success:

Running setup.py install for jieba ... done

Searching for the error suggested that the wheel package was not installed. I installed wheel separately and then tried installing jieba again, at which point pip reported that it was already installed:

(CapsWriter-Offline) root@iZbp16btrk4y6f6i4awzx4Z:/home/CapsWriter-Offline# pip install jieba
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Requirement already satisfied: jieba in /home/venv/CapsWriter-Offline/lib/python3.8/site-packages (0.42.1)

2. Running core_server.py then failed because torch was not installed:

─────────────────────────────────── CapsWriter Offline Server ───────────────────────────────────
Project URL: https://github.com/HaujetZhao/CapsWriter-Offline
Current base folder: CapsWriter-Offline
Bound service address: 0.0.0.0:6016

Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/CapsWriter-Offline/util/server_init_recognizer.py", line 29, in init_recognizer
    from funasr_onnx import CT_Transformer
  File "/home/venv/CapsWriter-Offline/lib/python3.8/site-packages/funasr_onnx/__init__.py", line 2, in <module>
    from .paraformer_bin import Paraformer, ContextualParaformer
  File "/home/venv/CapsWriter-Offline/lib/python3.8/site-packages/funasr_onnx/paraformer_bin.py", line 10, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'
^C
Goodbye!

3. I installed torch manually:

(CapsWriter-Offline) root@iZbp16btrk4y6f6i4awzx4Z:/home# pip install torch
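The missing-module failures in this report only surface one at a time, each after a server start. A minimal sketch of a pre-flight check follows, assuming the top-level module names that appear in the tracebacks and pip commands of this issue; the list and the helper `find_missing_modules` are hypothetical and not part of CapsWriter-Offline.

```python
import importlib.util

# Top-level modules named in the tracebacks and pip commands of this issue.
# This list is an assumption, not the project's official requirements.
REQUIRED_MODULES = ["torch", "numpy", "funasr_onnx", "sherpa_onnx", "jieba"]


def find_missing_modules(names):
    """Return the names that cannot be located in the current environment."""
    missing = []
    for name in names:
        # find_spec locates a top-level module without importing it.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

Run inside the activated venv, this lists every absent module in one pass. It only checks that a module can be found, not that its version is compatible.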

4. Running core_server.py again failed because np.bool has been removed from numpy:

─────────────────────────────────── CapsWriter Offline Server ───────────────────────────────────
Project URL: https://github.com/HaujetZhao/CapsWriter-Offline
Current base folder: CapsWriter-Offline
Bound service address: 0.0.0.0:6016

/home/venv/CapsWriter-Offline/lib/python3.8/site-packages/funasr_onnx/punc_bin.py:279: FutureWarning: In the future np.bool will be defined as the corresponding NumPy scalar.
  def vad_mask(self, size, vad_pos, dtype=np.bool):
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/CapsWriter-Offline/util/server_init_recognizer.py", line 29, in init_recognizer
    from funasr_onnx import CT_Transformer
  File "/home/venv/CapsWriter-Offline/lib/python3.8/site-packages/funasr_onnx/__init__.py", line 5, in <module>
    from .punc_bin import CT_Transformer
  File "/home/venv/CapsWriter-Offline/lib/python3.8/site-packages/funasr_onnx/punc_bin.py", line 166, in <module>
    class CT_Transformer_VadRealtime(CT_Transformer):
  File "/home/venv/CapsWriter-Offline/lib/python3.8/site-packages/funasr_onnx/punc_bin.py", line 279, in CT_Transformer_VadRealtime
    def vad_mask(self, size, vad_pos, dtype=np.bool):
  File "/home/venv/CapsWriter-Offline/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'bool'.
np.bool was a deprecated alias for the builtin bool. To avoid this error in existing code, use bool by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
^C
Goodbye!

5. I then downgraded numpy from 1.24.4 to 1.23.5:

(CapsWriter-Offline) root@iZbp16btrk4y6f6i4awzx4Z:/home# pip uninstall numpy
Found existing installation: numpy 1.24.4
Uninstalling numpy-1.24.4:
  Would remove:
    /home/venv/CapsWriter-Offline/bin/f2py
    /home/venv/CapsWriter-Offline/bin/f2py3
    /home/venv/CapsWriter-Offline/bin/f2py3.8
    /home/venv/CapsWriter-Offline/lib/python3.8/site-packages/numpy-1.24.4.dist-info/
    /home/venv/CapsWriter-Offline/lib/python3.8/site-packages/numpy.libs/libgfortran-040039e1.so.5.0.0
    /home/venv/CapsWriter-Offline/lib/python3.8/site-packages/numpy.libs/libopenblas64_p-r0-15028c96.3.21.so
    /home/venv/CapsWriter-Offline/lib/python3.8/site-packages/numpy.libs/libquadmath-96973f99.so.0.0.0
    /home/venv/CapsWriter-Offline/lib/python3.8/site-packages/numpy/
Proceed (y/n)? y
  Successfully uninstalled numpy-1.24.4
(CapsWriter-Offline) root@iZbp16btrk4y6f6i4awzx4Z:/home# pip install numpy==1.23.5
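The error message says the `np.bool` alias was deprecated in NumPy 1.20, and in this report 1.24.4 raises `AttributeError` while 1.23.5 works, which matches the alias being removed in the 1.24 release. As a rough sketch, the installed version can be checked before starting the server. The helpers `parse_version` and `np_bool_alias_removed` are hypothetical names, and the check deliberately covers only the 1.x series.

```python
from importlib import metadata


def parse_version(version):
    """Turn '1.24.4' into (1, 24, 4), ignoring any non-numeric suffix."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:  # pad short versions such as '1.24'
        parts.append(0)
    return tuple(parts)


def np_bool_alias_removed(version):
    """True for NumPy 1.x releases from 1.24 on, where np.bool is gone."""
    major, minor, _ = parse_version(version)
    return major == 1 and minor >= 24


if __name__ == "__main__":
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        print("numpy is not installed")
    else:
        if np_bool_alias_removed(installed):
            print(f"numpy {installed}: np.bool is gone; consider numpy==1.23.5")
        else:
            print(f"numpy {installed}: this check found no np.bool removal")
```

Pinning the version, as done in step 5, works around the problem without editing `punc_bin.py` inside site-packages.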

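6. Running core_server.py once more, it stayed stuck at "Loading punctuation model". Memory usage spiked while the models were loading and was then quickly released, leaving more than 1 GB free.

─────────────────────────────────── CapsWriter Offline Server ───────────────────────────────────
Project URL: https://github.com/HaujetZhao/CapsWriter-Offline
Current base folder: CapsWriter-Offline
Bound service address: 0.0.0.0:6016

Modules loaded
Speech model loaded
Loading punctuation model
^C
Goodbye!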

7. Upgrading sherpa-onnx from 1.8.11 to 1.9.10 did not help:

(CapsWriter-Offline) root@iZbp16btrk4y6f6i4awzx4Z:/home# pip install -U sherpa-onnx
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Collecting sherpa-onnx
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/5c/46/9d62e2d21e6975f5eb927e1acd94201da14f0247006844d788268a4109b2/sherpa_onnx-1.9.10-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.5 MB)
     |████████████████████████████████| 18.5 MB 1.8 MB/s
Installing collected packages: sherpa-onnx
  Attempting uninstall: sherpa-onnx
    Found existing installation: sherpa-onnx 1.8.11
    Uninstalling sherpa-onnx-1.8.11:
      Successfully uninstalled sherpa-onnx-1.8.11
Successfully installed sherpa-onnx-1.9.10

How can I fix this? Is the ECS instance's memory too small?

yzy613 commented 4 months ago

Quoted from README.md:

> Loading the models on the server requires 4 GB of system memory, and it only works on 64-bit systems
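To compare a machine against that figure before starting the server, the totals can be read from `/proc/meminfo` on Linux. The sketch below assumes the README's "4G" means roughly 4 GiB; the helper names are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def ram_and_swap_gib(meminfo):
    """Return (physical RAM, swap) in GiB from a parsed meminfo dict."""
    kib_per_gib = 1024 * 1024
    ram = meminfo.get("MemTotal", 0) / kib_per_gib
    swap = meminfo.get("SwapTotal", 0) / kib_per_gib
    return ram, swap


if __name__ == "__main__":
    REQUIRED_GIB = 4  # assumption: the README's "4G" read as 4 GiB
    try:
        with open("/proc/meminfo") as fh:  # Linux only
            ram, swap = ram_and_swap_gib(parse_meminfo(fh.read()))
    except FileNotFoundError:
        print("/proc/meminfo not available on this system")
    else:
        print(f"RAM {ram:.1f} GiB, swap {swap:.1f} GiB")
        if ram < REQUIRED_GIB:
            print("Physical RAM is below the README figure.")
```

The script reports RAM and swap separately because the README does not say whether swap counts toward the requirement.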

gongqf commented 4 months ago

Doesn't virtual memory count? Does it have to be 4 GB of physical memory?

yzy613 commented 4 months ago

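> Doesn't virtual memory count? Does it have to be 4 GB of physical memory?

My suggestion: run a virtual machine on your own PC and test the minimum amount of physical memory needed for the models to load properly.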
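One way to run such a test is to record the process's peak resident memory around the loading step. This is only a sketch: `measure_load` is a hypothetical helper, and `fake_load` stands in for the real model loading, which is not reproduced here.

```python
import resource
import time


def measure_load(load_fn):
    """Run load_fn and return (result, seconds, peak RSS in MiB).

    ru_maxrss is reported in kilobytes on Linux (bytes on macOS), and it
    is the peak for the whole process so far, not only for load_fn.
    """
    start = time.perf_counter()
    result = load_fn()
    elapsed = time.perf_counter() - start
    peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return result, elapsed, peak_kib / 1024


if __name__ == "__main__":
    # Stand-in for the real model loading: allocate roughly 50 MiB.
    def fake_load():
        return bytearray(50 * 1024 * 1024)

    _, seconds, peak_mib = measure_load(fake_load)
    print(f"load took {seconds:.2f}s, peak RSS about {peak_mib:.0f} MiB")
```

Peak RSS counts only pages resident in physical memory, so on a machine that is already swapping it understates the real demand. Measuring in a VM with ample RAM first and then lowering the limit gives a more reliable number.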

gongqf commented 4 months ago

After adding 4 GB of swap, the models finally loaded. However, punctuation does not actually work, which is the same problem reported in #94.

Speech model loaded
Punctuation model loaded
Model loading took 121.20s

top - 16:38:17 up 692 days, 17:58,  2 users,  load average: 0.38, 0.75, 0.38
Tasks:  95 total,   1 running,  94 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  0.0 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   1987.4 total,    293.8 free,   1467.4 used,    226.2 buff/cache
MiB Swap:   4096.0 total,   2603.0 free,   1493.0 used.    365.7 avail Mem

gongqf commented 4 months ago

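How embarrassing. It turns out punctuation was missing because the server config file shipped in the zip has the punctuation option set to False, whereas the Windows version has it set to True:

format_punc = False # Whether to enable the punctuation engine on output (the punctuation engine seems to have problems on macOS, where it should be set to False)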
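Since the quoted comment ties the `False` value to macOS, a config could pick the default per platform instead of shipping one fixed value. The sketch below is purely illustrative: `default_format_punc` is a hypothetical helper, not the project's code, and it takes the comment about macOS at face value.

```python
import platform


def default_format_punc(system_name):
    """Enable the punctuation engine everywhere except macOS ('Darwin')."""
    return system_name != "Darwin"


if __name__ == "__main__":
    # platform.system() returns 'Linux', 'Windows' or 'Darwin' on common systems.
    format_punc = default_format_punc(platform.system())
    print(f"format_punc = {format_punc}")
```

On the Ubuntu server in this issue the sketch would yield `True`, which is the value the poster ended up needing.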