gpt-omni / mini-omni

open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming audio output conversational capabilities.
https://arxiv.org/abs/2408.16725
MIT License
2.41k stars 240 forks

It seems it doesn't work on Windows... #28

Open SmerchProsto opened 1 week ago

SmerchProsto commented 1 week ago

My environment: NVIDIA GeForce GTX 1650 Windows 11 Conda

I followed all the steps in the instructions on GitHub and installed all the dependencies from the requirements file, but it doesn't seem to work...

mini-omni commented 1 week ago

Hi, what is the problem? Please give more details so that someone with a Windows environment can help.

SmerchProsto commented 1 week ago

Look, I activated a conda environment and installed all the dependencies from the requirements file, but I have a problem with PyTorch. I installed it from the official site with CUDA (I tried CUDA 11.8 and 12.1) using the NVIDIA GPU settings (conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=12.1 -c pytorch -c nvidia). To sum up, after many attempts I get an error like OSError: [WinError 127]. Error loading "{path}\envs\omni\lib\site-packages\torch\lib\nvfuser_codegen.dll" or one of its dependencies.
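Since the maintainer asked for more details, a small script like the one below could collect them in one go. It is only a sketch: the function name collect_env_report is made up for illustration, and it assumes that a failed DLL load surfaces as an OSError at import time, as in the WinError 127 message above.

```python
import platform
import sys


def collect_env_report() -> dict:
    """Gather basic facts worth pasting into a bug report (hypothetical helper)."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    try:
        # May raise ImportError if torch is missing, or OSError if a DLL
        # such as nvfuser_codegen.dll cannot be loaded.
        import torch
    except (ImportError, OSError) as exc:
        report["torch"] = f"import failed: {type(exc).__name__}: {exc}"
        return report
    report["torch"] = torch.__version__
    report["torch_cuda_build"] = torch.version.cuda  # None on CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```

Running it inside the activated environment and pasting the output into the issue shows at a glance which Python, OS and torch build are involved.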

SmerchProsto commented 1 week ago

It seems like the project can be easily launched on Linux according to the instructions from GIT?

kunci115 commented 1 week ago

use virtualenv instead of conda

mini-omni commented 1 week ago

It seems like the project can be easily launched on Linux according to the instructions from GIT?

Yes. Since I do not have a Windows testing environment, I have only tested on Linux. Hopefully other Windows users can help.

808cn commented 5 days ago

It does run on Windows. Here is what I did.

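I did not use conda, but the installation still succeeded. I managed the environment with venv, as described below. First switch the current Python to Python 3.10 (the author uses Python 3.10, so just match that).

-------------------------------------- venv usage ----------------------------------

1. Create the virtual environment. In a terminal, run the following command to create a virtual environment named venv: python -m venv venv

2. Activate the virtual environment. On Windows: venv\Scripts\activate.bat

On Linux: source venv/bin/activate

Once the environment is activated, the terminal prompt shows its name, which means you are now working inside it. Then upgrade pip: python.exe -m pip install --upgrade pip

3. Install third-party libraries in the virtual environment. With the environment activated, use pip to install what you need, for example: pip install autogenstudio. Any other required libraries can be installed in the virtual environment the same way.

4. Exit the virtual environment: deactivate

---------------------------------------- venv usage end ----------------------------------------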
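The creation step above can also be done from Python with the standard-library venv module. The following is a minimal sketch, not part of the original post: the helper names create_env and activate_hint are made up for illustration, and activation itself still has to happen in the shell.

```python
import venv
from pathlib import Path


def create_env(env_dir="venv", with_pip=True):
    """Create a virtual environment, roughly what `python -m venv venv` does."""
    venv.create(env_dir, with_pip=with_pip)
    return Path(env_dir)


def activate_hint(env_dir="venv", windows=True):
    """Return the activation command quoted in the steps above."""
    if windows:
        return f"{env_dir}\\Scripts\\activate.bat"
    return f"source {env_dir}/bin/activate"
```

Calling create_env() in the project folder and then running the command returned by activate_hint() in the terminal reproduces steps 1 and 2.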

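A few things to watch out for. These are the problems I ran into:

1. Switch the current Python to Python 3.10; torch does not necessarily support the newer Python 3.11.

First install the project's dependencies: pip install -r requirements.txt (I did not use conda, and this still installed successfully.)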

2. PyTorch raised an error: AssertionError: Torch not compiled with CUDA enabled. I resolved it by following this page: https://pytorch.org/get-started/previous-versions/

In the end, the Wheel -> Linux and Windows -> CUDA 12.1 option fixed it: pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121

Downloading from within China is too slow, so I downloaded the wheel first and then installed it with: pip install "torch-2.3.1+cu121-cp310-cp310-win_amd64.whl"

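3. I installed CUDA 11.8 on drive D. Add these two paths to the environment variables: D:\CUDA_11.8\bin and D:\CUDA_11.8\lib\x64

4. If you do not have a GPU, the code needs a small change: in inference.py and some other files, change 'cuda:0' to 'cpu'.

5. Open a Python interpreter and paste the code below line by line. If it runs without problems, torch is basically working.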

C:\Users\asus>python

import torch  # import torch
print(torch.cuda.is_available())  # whether CUDA is available
print(torch.cuda.get_device_name(0))  # name of device 0
print(torch.cuda.device_count())  # number of GPUs
print(torch.cuda.current_device())  # index of the current device
print(torch.rand(3, 3).cuda())
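Notes 1 to 4 above could be turned into a few small checks, sketched below. All helper names are hypothetical. The sketch also assumes that the "environment variables" in note 3 means PATH, which the post does not state explicitly, and the pinned versions and index URL are copied from note 2.

```python
import ntpath
import os
import sys

# Versions pinned in note 2 of the post.
TORCH_PINS = ["torch==2.3.1", "torchvision==0.18.1", "torchaudio==2.3.1"]


def python_matches(version_info=sys.version_info, wanted=(3, 10)):
    """Note 1: True when the interpreter belongs to the Python 3.10 series."""
    return tuple(version_info[:2]) == wanted


def torch_install_cmd(cuda_tag="cu121"):
    """Note 2: pip command for the CUDA wheels, bound to this interpreter."""
    index_url = f"https://download.pytorch.org/whl/{cuda_tag}"
    return [sys.executable, "-m", "pip", "install", *TORCH_PINS,
            "--index-url", index_url]


def missing_from_path(required_dirs, path_value=None):
    """Note 3: list the required directories absent from a Windows PATH string.

    Assumption: the directories are meant to be on PATH. ntpath is used so
    the comparison follows Windows rules (case-insensitive, trailing
    backslashes ignored) on any platform.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")

    def norm(path):
        return ntpath.normcase(ntpath.normpath(path))

    present = {norm(entry) for entry in path_value.split(";") if entry.strip()}
    return [d for d in required_dirs if norm(d) not in present]


def pick_device(cuda_available, preferred="cuda:0"):
    """Note 4: fall back to 'cpu' when CUDA is unavailable."""
    return preferred if cuda_available else "cpu"
```

For example, subprocess.run(torch_install_cmd(), check=True) would run the install from note 2, and pick_device(torch.cuda.is_available()) gives a device string that could replace a hard-coded 'cuda:0', so machines without a GPU need no manual edit.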

Finally, I hope the author can support Chinese speech output soon.

SmerchProsto commented 5 days ago

Thank you, I will try it.

mini-omni commented 3 days ago

similar issue: https://github.com/gpt-omni/mini-omni/issues/35

lowpair commented 2 days ago

I hope the author can support Chinese speech output soon.