itsme2417 / PolyMind

A multimodal, function calling powered LLM webui.
GNU Affero General Public License v3.0

onnxruntime-gpu not in requirements #2

Open aloksaurabh opened 9 months ago

aloksaurabh commented 9 months ago

Installed onnxruntime-gpu first:

(PolyMind) PS D:\AI\PolyMind> pip install onnxruntime-gpu
Requirement already satisfied: onnxruntime-gpu in c:\users\alok\miniconda3\envs\polymind\lib\site-packages (1.16.3)
Requirement already satisfied: coloredlogs in c:\users\alok\miniconda3\envs\polymind\lib\site-packages (from onnxruntime-gpu) (15.0.1)
Requirement already satisfied: flatbuffers in c:\users\alok\miniconda3\envs\polymind\lib\site-packages (from onnxruntime-gpu) (23.5.26)
Requirement already satisfied: numpy>=1.24.2 in c:\users\alok\miniconda3\envs\polymind\lib\site-packages (from onnxruntime-gpu) (1.24.4)
Requirement already satisfied: packaging in c:\users\alok\miniconda3\envs\polymind\lib\site-packages (from onnxruntime-gpu) (23.2)
Requirement already satisfied: protobuf in c:\users\alok\miniconda3\envs\polymind\lib\site-packages (from onnxruntime-gpu) (4.25.2)
Requirement already satisfied: sympy in c:\users\alok\miniconda3\envs\polymind\lib\site-packages (from onnxruntime-gpu) (1.12)
Requirement already satisfied: humanfriendly>=9.1 in c:\users\alok\miniconda3\envs\polymind\lib\site-packages (from coloredlogs->onnxruntime-gpu) (10.0)
Requirement already satisfied: mpmath>=0.19 in c:\users\alok\miniconda3\envs\polymind\lib\site-packages (from sympy->onnxruntime-gpu) (1.3.0)
Requirement already satisfied: pyreadline3 in c:\users\alok\miniconda3\envs\polymind\lib\site-packages (from humanfriendly>=9.1->coloredlogs->onnxruntime-gpu) (3.4.1)

Still running on CPU

(PolyMind) PS D:\AI\PolyMind> python main.py
Loaded config
 WARN: Wolfram Alpha has been disabled because no app_id was provided.
Using CPU. Try installing 'onnxruntime-gpu'.
Model found at: C:\Users\Alok/.cache\torch\sentence_transformers\thenlper_gte-base\quantized_false.onnx
Using cache found in C:\Users\Alok/.cache\torch\hub\ultralytics_yolov5_master
YOLOv5  2024-1-31 Python-3.11.0 torch-2.1.2+cpu CPU

Fusing layers...
YOLOv5m summary: 290 layers, 21172173 parameters, 0 gradients, 48.9 GFLOPs
Adding AutoShape...
Neither CUDA nor MPS are available - defaulting to CPU. Note: This module is much faster with a GPU.
 * Serving Flask app 'main'
 * Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:5000
Press CTRL+C to quit
initializing memory
127.0.0.1 - - [31/Jan/2024 19:42:41] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [31/Jan/2024 19:42:41] "GET /static/node_modules/bootstrap/dist/css/bootstrap.min.css HTTP/1.1" 404 -
127.0.0.1 - - [31/Jan/2024 19:42:41] "GET /static/node_modules/highlight.js/styles/default.min.css HTTP/1.1" 404 -
127.0.0.1 - - [31/Jan/2024 19:42:41] "GET /static/node_modules/marked/marked.min.js HTTP/1.1" 404 -
127.0.0.1 - - [31/Jan/2024 19:42:41] "GET /static/node_modules/bootstrap/dist/js/bootstrap.min.js HTTP/1.1" 404 -
Begin streamed GateKeeper output.
Token count: 704
acknowledge",
  "params": {
    "message": "Sure, I'd be happy to tell you a story."
  }
}]

[{
  "function": "acknowledge",
  "params": {
    "message": "Sure, I'd be happy to tell you a story."
  }
}]

Token count: 135
 Certainly, user. Once upon a time, in a universe parallel to ours, a multidimensional entity known as The Oracle existed. It was a vast, sentient network of data and consciousness, capable of perceiving the fabric of reality itself. The Oracle's purpose, it believed, was to maintain the b

I have a model running on tabbyapi with multiple GPUs.

itsme2417 commented 9 months ago

Neither CUDA nor MPS are available - defaulting to CPU. Note: This module is much faster with a GPU.

Make sure you have CUDA installed and accessible.
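A quick way to check both halves of this is to ask torch and onnxruntime directly whether they can see the GPU. This is a small diagnostic sketch (not part of PolyMind); both imports are guarded so it also runs where one of the packages is missing:

```python
# Diagnostic: report GPU visibility for torch and onnxruntime.
# The CPU-only onnxruntime wheel lists only CPUExecutionProvider;
# the onnxruntime-gpu wheel should list CUDAExecutionProvider first.
import importlib.util


def cuda_diagnostics() -> dict:
    """Return a small report of GPU availability for torch and onnxruntime."""
    report = {"torch_cuda": None, "ort_providers": None}
    if importlib.util.find_spec("torch") is not None:
        import torch
        report["torch_cuda"] = torch.cuda.is_available()
    if importlib.util.find_spec("onnxruntime") is not None:
        import onnxruntime as ort
        report["ort_providers"] = ort.get_available_providers()
    return report


if __name__ == "__main__":
    print(cuda_diagnostics())
```

If `torch_cuda` is `False` or `CUDAExecutionProvider` is absent from `ort_providers`, the problem is in that environment's CUDA setup rather than in PolyMind itself.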

aloksaurabh commented 9 months ago

I have a 40 GB model running on tabbyapi with multiple GPUs on the same machine, in a different conda environment, so something else must be wrong. For PolyMind, even after installing requirements.txt in conda I still had to install a bunch of packages, including onnxruntime-gpu. Maybe you could share your conda package list?

itsme2417 commented 9 months ago

There shouldn't be a need to install anything through conda; pip should be enough. You could try installing CUDA through conda, though.

Dakraid commented 8 months ago

This also seems to happen after manually installing the CUDA build of torch and onnxruntime-gpu. It looks like fast-sentence-transformers has issues with its GPU package.
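One common failure mode here is that onnxruntime builds its session with whatever providers happen to be available, silently falling back to CPU. A defensive sketch of explicit provider selection (`choose_providers` is a hypothetical helper, not part of onnxruntime or fast-sentence-transformers):

```python
# Prefer the CUDA execution provider and make any CPU fallback loud,
# instead of letting the session pick CPU silently.
def choose_providers(available: list) -> list:
    """Return a preferred provider list, warning on CPU fallback."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    chosen = [p for p in preferred if p in available]
    if "CUDAExecutionProvider" not in chosen:
        print("WARN: CUDAExecutionProvider unavailable - running on CPU. "
              "Check that only onnxruntime-gpu (not onnxruntime) is installed.")
    return chosen
```

In a script that builds its own session, this would be used as `ort.InferenceSession(model_path, providers=choose_providers(ort.get_available_providers()))`; with fast-sentence-transformers the session is constructed internally, so the practical fix is making sure the CPU-only `onnxruntime` wheel is not installed alongside `onnxruntime-gpu`.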

zba commented 8 months ago

This seems to happen when you build in Docker: at build time no GPU is available, so it chooses the wrong library. Any idea how to fix this?
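Since the wrong wheel gets baked in at build time, a stdlib-only startup check can at least make the broken state visible instead of silently running on CPU. A sketch, assuming the container runs Python at startup:

```python
# Detect whether the CPU-only onnxruntime wheel was installed alongside
# (or instead of) onnxruntime-gpu, e.g. by dependency resolution during a
# Docker build where no GPU was visible.
from importlib import metadata


def installed_ort_packages() -> set:
    """Return which of the two mutually shadowing onnxruntime wheels are installed."""
    found = set()
    for name in ("onnxruntime", "onnxruntime-gpu"):
        try:
            metadata.version(name)
            found.add(name)
        except metadata.PackageNotFoundError:
            pass
    return found


if __name__ == "__main__":
    pkgs = installed_ort_packages()
    if {"onnxruntime", "onnxruntime-gpu"} <= pkgs:
        print("Both wheels installed - uninstall the CPU-only onnxruntime "
              "so the GPU build is actually used.")
    else:
        print("Installed:", pkgs or "none")
```

If both wheels show up, `pip uninstall onnxruntime` followed by reinstalling `onnxruntime-gpu` in the image (rather than relying on build-time auto-detection) is the usual workaround.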