Open lai-serena opened 4 months ago
Hi @lai-serena, thanks for your feedback.
It looks like the build was unsuccessful. You can try installing it using the following method:
pip uninstall minference -y
MINFERENCE_FORCE_BUILD=TRUE pip install minference --no-cache-dir
Or build from source:
git clone https://github.com/microsoft/MInference/
cd MInference
pip install -e .
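After reinstalling, it can help to confirm that the compiled module actually landed in the installed package. The sketch below is a hypothetical helper, not part of MInference: it assumes `minference.cuda` is shipped as a compiled extension file (e.g. a `.so` or `.pyd`) directly inside the package directory, and it locates the package without importing it, so the check still runs when `import minference` fails.

```python
import importlib.machinery
import importlib.util
from pathlib import Path


def find_compiled_extension(package: str, name: str) -> list[Path]:
    """Return compiled-extension files for `package.name` without importing `package`.

    Hypothetical helper for illustration; assumes the extension file sits
    directly inside the package directory.
    """
    # Looking up a top-level name does not execute the package's __init__.py.
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return []
    hits = []
    for location in spec.submodule_search_locations:
        # EXTENSION_SUFFIXES lists the suffixes this interpreter can load,
        # e.g. ".cpython-310-x86_64-linux-gnu.so" or ".pyd".
        for suffix in importlib.machinery.EXTENSION_SUFFIXES:
            candidate = Path(location) / f"{name}{suffix}"
            if candidate.is_file():
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    found = find_compiled_extension("minference", "cuda")
    if found:
        print("compiled extension found:", *found, sep="\n  ")
    else:
        print("no compiled 'cuda' extension found; the build step was likely skipped or failed")
```

If nothing is found, the extension was probably never built for this interpreter, which would match the `ModuleNotFoundError` reported in this issue.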
pip install -e . works for me. Thanks!
Describe the issue
I encountered some issues when using minference in Python.
import minference
The problem is:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/workspace/MInference/minference/__init__.py", line 8, in <module>
    from .models_patch import MInference
  File "/workspace/MInference/minference/models_patch.py", line 7, in <module>
    from .patch import minference_patch, minference_patch_vllm, patch_hf
  File "/workspace/MInference/minference/patch.py", line 12, in <module>
    from .modules.minference_forward import (
  File "/workspace/MInference/minference/modules/minference_forward.py", line 20, in <module>
    from ..ops.pit_sparse_flash_attention_v2 import vertical_slash_sparse_attention
  File "/workspace/MInference/minference/ops/pit_sparse_flash_attention_v2.py", line 10, in <module>
    from ..cuda import convert_vertical_slash_indexes
ModuleNotFoundError: No module named 'minference.cuda'
Environment:
Python 3.10.14
minference 0.1.4.post3
triton 2.1.0
torch 2.3.0
CUDA 11.8
vllm 0.4.2+cu118
flash-attn 2.5.8