Closed loeeeee closed 2 weeks ago
I confirm all of those commands work in Debian 12.
Just some notes (not sure if related to Debian):
Just read that it depends on https://github.com/arter97/immich-native, but it says CUDA is not supported. Why does your README say it is?
> Just read that it depends on https://github.com/arter97/immich-native, but it says CUDA is not supported. Why does your README say it is?
So, in the `README.md`, I wrote:

> This guide is heavily inspired by another guide, Immich Native, and the install script & service files are modified from the ones in that repo. Kudos to its author, arter97!
By "modified from the ones in that repo," I mean I changed something in that script to make it work; you can find the changes in the script. It is just a build flag.
Also, the script does not depend on Immich Native -- using this repo does not require downloading immich-native.
I hope this answers your confusion. 😄
> I confirm all of those commands work in Debian 12.
Thanks a lot for testing this out.
> The password in runtime.env (`DB_PASSWORD`) does not seem to be used, as it gives a password error when connecting to Postgres if I change the DB password.
This is weird. Please open a new issue with a brief description. `DB_PASSWORD` should be passed to Immich without any modification by the install or execution scripts. In other words, it is not really controlled by this script.
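To rule the script out, one way to verify the password in `runtime.env` is to connect to Postgres with it directly. A minimal sketch, assuming the standard Immich variable names (`DB_PASSWORD`, `DB_USERNAME`, `DB_DATABASE_NAME`, `DB_HOSTNAME`) and that `psql` is installed; defaults shown are assumptions:

```shell
#!/bin/sh
# Load the same environment file Immich reads (path is an assumption).
. ./runtime.env

# Try a trivial query with exactly the credentials from runtime.env.
# A "password authentication failed" error here means the env file and
# the actual Postgres password are out of sync.
PGPASSWORD="$DB_PASSWORD" psql \
    -h "${DB_HOSTNAME:-localhost}" \
    -U "${DB_USERNAME:-immich}" \
    -d "${DB_DATABASE_NAME:-immich}" \
    -c 'SELECT 1;'
```

If this succeeds while Immich still fails to connect, the problem is in how the service picks up the env file, not in Postgres.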
> The GPU does not seem to be used by Immich for machine learning, as nvidia-smi shows no usage.
There is another, very similar issue (https://github.com/loeeeee/immich-in-lxc/issues/7) reporting this. Can you check out the solution in that one? 😄
Thanks for your reply @loeeeee
I will open a new issue for that.
Regarding the GPU issue, it does not seem to be related. I have followed all the steps in your Immich config, and the log does not show any errors about missing libs. I have installed cuDNN and the CUDA toolkit following the official NVIDIA install instructions, because those packages (nvidia-cudnn, libcublaslt12, libcublas12) are not in the Debian repo.
Here are the ml.log logs: https://0x0.st/XyRp.log
> Thanks for your reply @loeeeee
> I will open a new issue for that.
> Regarding the GPU issue, it does not seem to be related. I have followed all the steps in your Immich config, and the log does not show any errors about missing libs. I have installed cuDNN and the CUDA toolkit following the official NVIDIA install instructions, because those packages (nvidia-cudnn, libcublaslt12, libcublas12) are not in the Debian repo.
> Here are the ml.log logs: https://0x0.st/XyRp.log
Sorry, this is a busy week for me. I will look into this over the weekend.
> Thanks for your reply @loeeeee
> I will open a new issue for that.
> Regarding the GPU issue, it does not seem to be related. I have followed all the steps in your Immich config, and the log does not show any errors about missing libs. I have installed cuDNN and the CUDA toolkit following the official NVIDIA install instructions, because those packages (nvidia-cudnn, libcublaslt12, libcublas12) are not in the Debian repo.
> Here are the ml.log logs: https://0x0.st/XyRp.log
In the `error.log`, there is one line that caught my eye. It says CUDA is out of memory, which is a bit odd, since the default model used by Immich took only about 1100 MB of GPU memory during my test. Maybe some other process is starving this one.
```
[E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_113' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:123 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char, const char, ERRTYPE, const char, const char, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:116 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char, const char, ERRTYPE, const char, const char, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, common::Status> = void] CUDA failure 2: out of memory ; GPU=0 ; hostname=immich ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_allocator.cc ; line=47 ; expr=cudaMalloc((void**)&p, size);
```
Another log line also suggests some kind of out-of-memory issue.
```
[E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_5' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 195084288
```
Just a random guess: are there any huge image files in the library? Or does your GPU have only very limited memory available?
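A quick way to test the memory-pressure guess is to watch free GPU memory while the ML job runs. A sketch using `nvidia-smi` query flags (run it on whichever system can see the GPU):

```shell
# Print GPU memory usage once per second while the ML job is running.
# If memory.free collapses right before the onnxruntime error appears,
# another process (or a very large input) is the likely culprit.
nvidia-smi --query-gpu=timestamp,memory.used,memory.free,memory.total \
           --format=csv -l 1
```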
I found someone with a similar issue on the Immich GitHub issue page.
> I have installed cuDNN and the CUDA toolkit following the official NVIDIA install instructions, because those packages (nvidia-cudnn, libcublaslt12, libcublas12) are not in the Debian repo.
Good choice!
@loeeeee I have seen that, but that's an old message; when I restart, it does not repeat. My GPU has 12 GB of memory; it's an RTX 3060. Btw, I am not totally sure that it doesn't use the GPU... I just guess that, since when I keep printing nvidia-smi, even when playing a video, the memory usage does not change. I'm not sure if there is a better way to be sure it's being used.
```
root@immich:/home/immich/immich-in-lxc# nvidia-smi
Tue Sep 3 12:05:38 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.78              Driver Version: 550.78         CUDA Version: 12.4        |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name            Persistence-M      | Bus-Id        Disp.A   | Volatile Uncorr. ECC |
| Fan  Temp  Perf      Pwr:Usage/Cap      |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060     Off    | 00000000:21:00.0   Off |                  N/A |
|  0%  57C  P2        37W / 170W          |  1380MiB / 12288MiB    |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                                    Usage |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
root@immich:/home/immich/immich-in-lxc#
```
> @loeeeee I have seen that, but that's an old message; when I restart, it does not repeat. My GPU has 12 GB of memory; it's an RTX 3060. Btw, I am not totally sure that it doesn't use the GPU... I just guess that, since when I keep printing nvidia-smi, even when playing a video, the memory usage does not change. I'm not sure if there is a better way to be sure it's being used.
HAHAHAHA. You fell into the same pitfall as I did!!!!!! 🤣
You can see the GPU usage from the console of the Proxmox host, but not inside the LXC. I have this issue as well. It is using the GPU, because otherwise it would print something saying there is no process.
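To double-check from the host side, `nvidia-smi` can list just the compute processes that are holding GPU memory. A sketch (run on the Proxmox host; the PIDs shown are host-namespace PIDs, so they will not match PIDs seen inside the LXC, and `1234` below is only a placeholder):

```shell
# List every compute process currently holding GPU memory.
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv

# To map a host PID back to a container, inspect its cgroup: the cgroup
# path names the LXC the process belongs to (replace 1234 with a real PID).
cat /proc/1234/cgroup
```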
@loeeeee I just checked from the host and saw 2 processes running, but they are definitely not related to Immich, because when I stop the Immich services they are still there. They are probably related to other containers.
```
root@monster:~# nvidia-smi
Tue Sep 3 14:55:43 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.78              Driver Version: 550.78         CUDA Version: 12.4        |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name            Persistence-M      | Bus-Id        Disp.A   | Volatile Uncorr. ECC |
| Fan  Temp  Perf      Pwr:Usage/Cap      |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060     Off    | 00000000:21:00.0   Off |                  N/A |
|  0%  52C  P2        37W / 170W          |  1380MiB / 12288MiB    |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                                    Usage |
|=========================================================================================|
|    0   N/A  N/A      7946     C   /usr/local/bin/python3                        206MiB |
|    0   N/A  N/A    170797     C   python3.8                                    1166MiB |
+-----------------------------------------------------------------------------------------+
```
Do you see any process on the host related to Immich? Maybe it just doesn't show.
> Do you see any process on the host related to Immich? Maybe it just doesn't show.
Yes, I do see the process. @makovez
```
Tue Sep 3 21:37:40 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14           Driver Version: 550.54.14      CUDA Version: 12.4        |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name            Persistence-M      | Bus-Id        Disp.A   | Volatile Uncorr. ECC |
| Fan  Temp  Perf      Pwr:Usage/Cap      |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX A2000 12GB       On     | 00000000:81:00.0   Off |                  Off |
| 30%  51C  P2        26W / 70W           |   566MiB / 12282MiB    |      6%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                                    Usage |
|=========================================================================================|
|    0   N/A  N/A   1380055     C   ...pp/machine-learning/venv/bin/python        560MiB |
+-----------------------------------------------------------------------------------------+
```
This is after I rerun the smart search on the Debian test machine.
Currently, I am looking into using TensorRT instead of CUDA as the ONNX runtime backend to see if that would help.
Ok, so I just saw that when transcoding the process was not running; then it started to show up when I was using the smart search. But when transcoding I don't see changes in memory usage.
> Ok, so I just saw that when transcoding the process was not running; then it started to show up when I was using the smart search. But when transcoding I don't see changes in memory usage.
You can manually force a redo of machine learning in Jobs > Smart Search > All. Then it should run for a while. However, I assume the out-of-memory error still exists, which will crash the program very fast. @makovez
Wdym? Smart search is working, and I can see it uses the GPU. What I am saying is that transcoding when playing a video does not seem to use the GPU. @loeeeee
> Wdym? Smart search is working, and I can see it uses the GPU. What I am saying is that transcoding when playing a video does not seem to use the GPU. @loeeeee
ohhhh. I did not get you. @makovez
Transcoded video is cached; no on-the-fly transcoding happens in Immich as far as I know. In Immich, if a video is not transcoded, it does not seem to be playable.
Also, just in case you missed it: besides installing the Jellyfin ffmpeg, there is also a setting that needs to be changed to enable HW-accelerated transcoding.
Additionally, for LXC with CUDA support enabled, one needs to go to Administration > Settings > Video Transcoding Settings > Hardware Acceleration > Acceleration API and select NVENC to explicitly use the GPU to do the transcoding.
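Independent of the UI setting, it may be worth confirming that the ffmpeg binary being invoked was actually built with NVENC support. A sketch, assuming the usual Jellyfin ffmpeg install path on Debian (adjust the path if yours differs):

```shell
# List the NVENC encoders compiled into the Jellyfin ffmpeg build.
# Expect lines such as h264_nvenc / hevc_nvenc if NVENC is available.
/usr/lib/jellyfin-ffmpeg/ffmpeg -hide_banner -encoders | grep nvenc
```

If nothing is printed, the build has no NVENC and the Acceleration API setting cannot take effect.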
Are you sure it's cached? In /admin/jobs-status there is a job to transcode all videos for compatibility with more devices, but I haven't run it. Btw, they just released a new version, 1.112.1, lol so fast.
> Are you sure it's cached? In /admin/jobs-status there is a job to transcode all videos for compatibility with more devices, but I haven't run it. Btw, they just released a new version, 1.112.1, lol so fast.
Yea, pretty sure. Those jobs are timed and automated; no need to run them manually. It starts itself when a new video is uploaded, or one could run it manually to redo the transcoding.
I am testing out the new version. Immich devs are working really hard.
Seems like most Debian issues are sorted out; closing the issue.
Currently, the `README.md` is mostly targeted at Ubuntu. Thus, some instructions may not be applicable to Debian. Though no breaking issues seem to be present, more tests on Debian need to be done to iron out some frustrating details.