[Closed] CarbonTrinity closed this issue 9 months ago
Thanks for reporting this. What command are you using to run compose? It looks like an nvidia runtime issue, so it's not seeing the cards within the docker containers. I updated the install steps on the Wiki; did you reinstall docker/compose/nvidia runtime?
While the containers are up, you can run the following to check the crackq container is able to see the GPU cards:
sudo docker exec -it crackq nvidia-smi
sudo docker exec -it crackq /opt/crackq/build/pyhashcat/pyhashcat/hashcat/hashcat -b -m 1000
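If those both fail, it's also worth confirming the nvidia runtime is actually registered with the Docker daemon. A minimal check, assuming the standard nvidia-container-runtime setup:
sudo docker info | grep -i runtimes
# "nvidia" should appear in the list; it's normally registered via
# /etc/docker/daemon.json, so check that too and restart the daemon
# after any change:
cat /etc/docker/daemon.json
sudo systemctl restart docker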
Yes, I went through all the steps again just to double-check I didn't miss something obvious. I've rebuilt and rerun it, but it still seems to be failing. I'm still new to Docker, so the issue is probably my own fault. I've added the steps I tested today [error message is at step 7]:
1) Reinstall Docker following Install Docker CE on Ubuntu
https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository
2) Reinstall the Nvidia runtime, following the Ubuntu Nvidia Runtime Install steps
-> No errors here
3) Run the install script for the Docker/Nvidia/Ubuntu setup:
sudo ./install.sh docker/nvidia/ubuntu
4) Change secret key
python3 -c 'import secrets; print(secrets.token_urlsafe())'
5) Relaunch Compose
sudo docker compose -f docker-compose.nvidia.yml up --build
Note: [previously used sudo docker-compose -f docker-compose.nvidia.yml up --build as per wiki]
6) Check Nvidia-smi
sudo docker exec -it crackq nvidia-smi
Wed Jan 17 23:21:08 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.223.02 Driver Version: 470.223.02 CUDA Version: 12.3 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 Off | N/A |
| 50% 33C P8 14W / 250W | 14MiB / 4043MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... Off | 00000000:02:00.0 Off | N/A |
| 51% 30C P8 14W / 250W | 6MiB / 4043MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 NVIDIA GeForce ... Off | 00000000:0B:00.0 Off | N/A |
| 50% 26C P8 13W / 250W | 6MiB / 4043MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
root@localhost-new:/opt/crackq-0.1.2# sudo docker exec -it crackq /opt/crackq/build/pyhashcat/pyhashcat/hashcat/hashcat -b -m 1000
hashcat (v6.2.1) starting in benchmark mode...
Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.
cuInit(): forward compatibility was attempted on non supported HW
clGetPlatformIDs(): CL_PLATFORM_NOT_FOUND_KHR
ATTENTION! No OpenCL-compatible or CUDA-compatible platform found.
You are probably missing the OpenCL or CUDA runtime installation.
Started: Wed Jan 17 23:22:40 2024
Stopped: Wed Jan 17 23:22:40 2024
All the containers are still up though, and from the previous command I can see both CUDA and the Nvidia cards are detected.
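Also, regarding the note at step 5: in case the compose variant matters, the two can be told apart like this (v2 plugin vs the old standalone binary):
docker compose version     # Compose v2 plugin
docker-compose --version   # legacy standalone Compose, if still installed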
I also tried updating the drivers:
root@localhost-new:/opt/crackq-0.1.2# sudo docker exec -it crackq nvidia-smi
Thu Jan 18 00:16:31 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05 Driver Version: 525.147.05 CUDA Version: 12.3 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 Off | N/A |
| 50% 32C P8 14W / 250W | 15MiB / 4096MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... Off | 00000000:02:00.0 Off | N/A |
| 51% 29C P8 14W / 250W | 6MiB / 4096MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 NVIDIA GeForce ... Off | 00000000:0B:00.0 Off | N/A |
| 50% 26C P8 14W / 250W | 6MiB / 4096MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
Same error.
At this point, I tried installing the OpenCL drivers per the install guide, as well as the https://github.com/f0cker/crackq/wiki/Install-on-Ubuntu workaround for Nvidia. However, the error remains.
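For what it's worth, clinfo appears to be installed in the crackq image (it's in the stock Dockerfile), so the missing OpenCL platform can be checked directly:
sudo docker exec -it crackq clinfo -l
# empty output here lines up with the CL_PLATFORM_NOT_FOUND_KHR error
# that hashcat reports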
Did you remove the old docker containers/images? I have the following confirmed working on a test AWS box:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03 Driver Version: 535.129.03 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A10G Off | 00000000:00:1E.0 Off | 0 |
| 0% 27C P0 53W / 300W | 4MiB / 23028MiB | 4% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
Maybe try reinstalling now that the drivers are updated?
Run this beforehand:
sudo docker system prune --all
**This will wipe all containers & images, so remove them individually if you have other containers on your server ;)
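If you do have other containers to keep, removing just the crackq pieces would look something like this (names taken from a default install, so double-check with docker ps -a and docker images first):
sudo docker rm -f crackq nginx    # plus the redis container, by ID if unnamed
sudo docker rmi nvidia-crackq nginx-crackq redis:latest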
Yeah, I removed all containers and even tried a few reboots along the way. I've added my steps below in case something obvious stands out. Hopefully it's still fixable, otherwise I might just reinstall Ubuntu.
1) Removing everything
root@localhost:/opt/crackq-0.1.2# sudo docker system prune --all
root@localhost:/opt/crackq-0.1.2# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
root@localhost:/opt/crackq-0.1.2# reboot
2) After cleaning up and rebooting:
root@localhost:/opt# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
root@localhost:/opt/crackq-0.1.2# nvidia-smi
Fri Jan 19 09:25:13 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.146.02 Driver Version: 535.146.02 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce GTX 970 Off | 00000000:01:00.0 Off | N/A |
| 50% 32C P8 14W / 250W | 15MiB / 4096MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce GTX 970 Off | 00000000:02:00.0 Off | N/A |
| 51% 30C P8 14W / 250W | 6MiB / 4096MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 2 NVIDIA GeForce GTX 970 Off | 00000000:0B:00.0 Off | N/A |
| 50% 26C P8 14W / 250W | 6MiB / 4096MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 1099 G /usr/lib/xorg/Xorg 7MiB |
| 0 N/A N/A 1711 G /usr/bin/gnome-shell 3MiB |
| 1 N/A N/A 1099 G /usr/lib/xorg/Xorg 3MiB |
| 2 N/A N/A 1099 G /usr/lib/xorg/Xorg 3MiB |
+---------------------------------------------------------------------------------------+
root@localhost:/opt# git clone https://github.com/f0cker/crackq.git
3) Reinstalling:
root@localhost:/opt/crackq-0.1.2# sudo ./install.sh docker/nvidia/ubuntu
[+] Building 584.5s (13/13) FINISHED docker:default
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 1.07kB 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/nvidia/cuda:12.3.1-devel-ubuntu20.04 2.7s
=> [1/8] FROM docker.io/nvidia/cuda:12.3.1-devel-ubuntu20.04@sha256:befbdfddbb52727f9ce8d0c574cac0f631c606b1e6f0e523f3a0777fe2720c99 401.6s
=> => resolve docker.io/nvidia/cuda:12.3.1-devel-ubuntu20.04@sha256:befbdfddbb52727f9ce8d0c574cac0f631c606b1e6f0e523f3a0777fe2720c99 0.0s
=> => sha256:dc1dc0eb4de2ccea0627484435c928be534061f8f9352cc9b120329377144c43 2.63kB / 2.63kB 0.0s
=> => sha256:25ad149ed3cff49ddb57ceb4418377f63c897198de1f9de7a24506397822de3e 27.51MB / 27.51MB 14.7s
=> => sha256:befbdfddbb52727f9ce8d0c574cac0f631c606b1e6f0e523f3a0777fe2720c99 743B / 743B 0.0s
=> => sha256:ba7b66a9df40b8a1c1a41d58d7c3beaf33a50dc842190cd6a2b66e6f44c3b57b 7.94MB / 7.94MB 2.7s
=> => sha256:9ef37be4ff597ee885111c1dcd64d3a327ffb36b4e6c34d00bf0f81261c6affe 18.35kB / 18.35kB 0.0s
=> => sha256:520797292d9250932259d95f471bef1f97712030c1d364f3f297260e5fee1de8 57.07MB / 57.07MB 11.0s
=> => sha256:c5f2ffd06d8b1667c198d4f9a780b55c86065341328ab4f59d60dc996ccd5817 185B / 185B 3.2s
=> => sha256:1698c67699a3eee2a8fc185093664034bb69ab67c545ab6d976399d5500b2f44 6.88kB / 6.88kB 3.4s
=> => sha256:16dd7c0d35aac769895e18ff9096f7b348125aa75713fdf8cb6ba13e64421c42 1.28GB / 1.28GB 264.1s
=> => sha256:568cac1e538c782a9d9a813e11a1a42cc8e0bdaf9fa0d0671b84bd465e030418 62.53kB / 62.53kB 11.6s
=> => sha256:6252d19a7f1df1a9962d3a9efe147d15716f1312b1d5cea4bdf62bad52875c47 1.68kB / 1.68kB 12.1s
=> => sha256:f573e2686be4e101596edd9b23b30b629e99c184d588c4d1b4976d5d87c031a4 1.52kB / 1.52kB 12.7s
=> => sha256:0074e75104ac1156e1ee8adaba4f241074bad55133c49c148baa528b16dca39a 2.54GB / 2.54GB 340.2s
=> => sha256:df35fae9e247886347e01c7d57f33cc053bb58989ef8b42147191d2659d18276 86.30kB / 86.30kB 15.2s
=> => extracting sha256:25ad149ed3cff49ddb57ceb4418377f63c897198de1f9de7a24506397822de3e 1.6s
=> => extracting sha256:ba7b66a9df40b8a1c1a41d58d7c3beaf33a50dc842190cd6a2b66e6f44c3b57b 0.4s
=> => extracting sha256:520797292d9250932259d95f471bef1f97712030c1d364f3f297260e5fee1de8 1.9s
=> => extracting sha256:c5f2ffd06d8b1667c198d4f9a780b55c86065341328ab4f59d60dc996ccd5817 0.0s
=> => extracting sha256:1698c67699a3eee2a8fc185093664034bb69ab67c545ab6d976399d5500b2f44 0.0s
=> => extracting sha256:16dd7c0d35aac769895e18ff9096f7b348125aa75713fdf8cb6ba13e64421c42 34.4s
=> => extracting sha256:568cac1e538c782a9d9a813e11a1a42cc8e0bdaf9fa0d0671b84bd465e030418 0.1s
=> => extracting sha256:6252d19a7f1df1a9962d3a9efe147d15716f1312b1d5cea4bdf62bad52875c47 0.0s
=> => extracting sha256:f573e2686be4e101596edd9b23b30b629e99c184d588c4d1b4976d5d87c031a4 0.0s
=> => extracting sha256:0074e75104ac1156e1ee8adaba4f241074bad55133c49c148baa528b16dca39a 60.9s
=> => extracting sha256:df35fae9e247886347e01c7d57f33cc053bb58989ef8b42147191d2659d18276 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 5.45kB 0.0s
=> [2/8] RUN apt-get update -q && apt-get install --no-install-recommends -yq wget unzip clinfo libminizip-dev && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* 14.7s
=> [3/8] RUN apt-get update && apt-get install -y wget p7zip gcc g++ make build-essential git libcurl4-openssl-dev libssl-dev zlib1g-dev python3.8 python3.8-dev python3-pip 29.0s
=> [4/8] COPY . /opt/crackq/build 0.1s
=> [5/8] WORKDIR /opt/crackq/build 0.1s
=> [6/8] RUN "/opt/crackq/build"/setup.sh 122.2s
=> [7/8] RUN chown -R 1111:1111 "/opt/crackq/build"/ 9.9s
=> [8/8] WORKDIR /opt/crackq/build/ 0.1s
=> exporting to image 4.0s
=> => exporting layers 4.0s
=> => writing image sha256:502b0a321e0dc6543af7719d380d3f15f684c99e6947e7d54c434470a1e92e4c 0.0s
=> => naming to docker.io/library/nvidia-crackq 0.0s
4) Made the Python secrets key change and launched the instance:
root@localhost:/opt/crackq# sudo docker compose -f docker-compose.nvidia.yml up --build
WARN[0000] The "MAIL_USERNAME" variable is not set. Defaulting to a blank string.
WARN[0000] The "MAIL_PASSWORD" variable is not set. Defaulting to a blank string.
[+] Running 3/9
⠼ redis 8 layers [⣄⣿⣿⣿⡀⠀⠀⠀] 14.09MB/49.96MB Pulling 5.4s
⠙ 2f44b7a888fa Downloading [===================> ] 11.32MB/29.13M
root@localhost:/opt/crackq-0.1.2# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4ca7de38aff8 nginx-crackq "/docker-entrypoint.…" About a minute ago Up About a minute 80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp nginx
e7f37f89c7ba nvidia-ubuntu "/opt/nvidia/nvidia_…" About a minute ago Up About a minute 6379/tcp, 8081/tcp, 127.0.0.1:8080->8080/tcp crackq
18ba32ad8c6c redis:latest "docker-entrypoint.s…" About a minute ago Up About a minute 127.0.0.1:6379->6379/tcp
5) Checking Nvidia-SMI for the container:
root@localhost:/opt/crackq-0.1.2# sudo docker exec -it crackq nvidia-smi
Thu Jan 18 22:41:24 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.146.02 Driver Version: 535.146.02 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce GTX 970 Off | 00000000:01:00.0 Off | N/A |
| 50% 32C P8 14W / 250W | 15MiB / 4096MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce GTX 970 Off | 00000000:02:00.0 Off | N/A |
| 51% 29C P8 14W / 250W | 6MiB / 4096MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 2 NVIDIA GeForce GTX 970 Off | 00000000:0B:00.0 Off | N/A |
| 51% 25C P8 14W / 250W | 6MiB / 4096MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
+---------------------------------------------------------------------------------------+
6) When doing the benchmark, it fails again in the same way as before:
root@localhost:/opt/crackq-0.1.2# sudo docker exec -it crackq /opt/crackq/build/pyhashcat/pyhashcat/hashcat/hashcat -b -m 1000
hashcat (v6.2.1) starting in benchmark mode...
Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.
cuInit(): forward compatibility was attempted on non supported HW
clGetPlatformIDs(): CL_PLATFORM_NOT_FOUND_KHR
ATTENTION! No OpenCL-compatible or CUDA-compatible platform found.
You are probably missing the OpenCL or CUDA runtime installation.
* AMD GPUs on Linux require this driver:
"RadeonOpenCompute (ROCm)" Software Platform (3.1 or later)
* Intel CPUs require this runtime:
"OpenCL Runtime for Intel Core and Intel Xeon Processors" (16.1.1 or later)
* NVIDIA GPUs require this runtime and/or driver (both):
"NVIDIA Driver" (440.64 or later)
"CUDA Toolkit" (9.0 or later)
Started: Thu Jan 18 22:41:49 2024
Stopped: Thu Jan 18 22:41:49 2024
7) Checking CUDA and Nvidia for the container:
root@localhost:/tmp# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
root@localhost:/tmp# sudo docker exec -it crackq nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Nov__3_17:16:49_PDT_2023
Cuda compilation tools, release 12.3, V12.3.103
Build cuda_12.3.r12.3/compiler.33492891_0
I really can't figure out what I'm missing. Any idea what else I can try before going down the OS reinstall route?
I don't think you need to reinstall the OS. It's something around this error:
cuInit(): forward compatibility was attempted on non supported HW
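One thing that error usually points to is the CUDA forward-compatibility path: when the image's CUDA version is newer than what the host driver supports, the forward-compat libraries get loaded, and Nvidia only supports forward compatibility on datacenter-class GPUs, not GeForce cards. A quick way to see whether the compat libs are present in the container (path as used in the standard nvidia/cuda images, so treat it as an assumption):
sudo docker exec -it crackq ls /usr/local/cuda/compat/
# a libcuda.so.* here, combined with a host driver older than the image's
# CUDA version, suggests the forward-compat path is being hit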
I'll do some digging when I have time, but in the meantime here's all the version info from my test box:
sudo docker --version
Docker version 24.0.7, build afdd53b
Host driver version:
nvidia-smi
Mon Jan 22 10:20:52 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03 Driver Version: 535.129.03 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
Container driver version:
sudo docker exec -it crackq nvidia-smi
Mon Jan 22 10:19:13 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03 Driver Version: 535.129.03 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
nvidia-container-runtime --version
NVIDIA Container Runtime version 1.13.5
commit: 6b8589dcb4dead72ab64f14a5912886e6165c079
spec: 1.1.0-rc.2
runc version 1.1.10
commit: v1.1.10-0-g18a0cb0
spec: 1.0.2-dev
go: go1.20.12
libseccomp: 2.5.3
nvidia-container-toolkit -version
NVIDIA Container Runtime Hook version 1.13.5
commit: 6b8589dcb4dead72ab64f14a5912886e6165c079
Just checking, have you rebooted since updating the drivers? :D
Where did you install the Nvidia driver from, direct download or from a repo? Latest version should be 535.154.05 for RTX I believe
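apt can tell you where it came from if it was a repo install:
apt policy nvidia-driver-535    # package name assumed, adjust to your installed driver
# the repo URLs in the version table show whether it came from the Ubuntu
# archive, Nvidia's repo, or was installed outside apt entirely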
Too many times :D. I rebooted again just now to check, but it's still the same.
Looking at the installs, everything except my Nvidia driver is from the GitHub repo. Hashcat is working locally using that driver though.
For nvidia-container-toolkit:
root@localhost:/opt# apt info nvidia-container-toolkit
Package: nvidia-container-toolkit
Version: 1.13.5-1
Priority: optional
Section: utils
Maintainer: NVIDIA CORPORATION <cudatools@nvidia.com>
Installed-Size: 2,425 kB
Depends: nvidia-container-toolkit-base (= 1.13.5-1), libnvidia-container-tools (>= 1.13.5-1), libnvidia-container-tools (<< 2.0.0), libseccomp2
Breaks: nvidia-container-runtime (<= 3.5.0-1), nvidia-container-runtime-hook
Replaces: nvidia-container-runtime (<= 3.5.0-1), nvidia-container-runtime-hook
Homepage: https://github.com/NVIDIA/nvidia-container-toolkit
Download-Size: 853 kB
APT-Manual-Installed: yes
APT-Sources: https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 Packages
Description: NVIDIA Container toolkit
Provides tools and utilities to enable GPU support in containers.
and for Nvidia-Container-Runtime:
root@localhost:/opt# apt info nvidia-container-runtime
Package: nvidia-container-runtime
Version: 3.13.0-1
Priority: optional
Section: utils
Maintainer: NVIDIA CORPORATION <cudatools@nvidia.com>
Installed-Size: 21.5 kB
Depends: nvidia-container-toolkit (>= 1.13.0-1), nvidia-container-toolkit (<< 2.0.0)
Homepage: https://github.com/NVIDIA/nvidia-container-runtime/wiki
Download-Size: 4,988 B
APT-Manual-Installed: yes
APT-Sources: https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 Packages
Description: NVIDIA container runtime
Provides a modified version of runc allowing users to run GPU enabled
containers.
The only driver which seems to come from the Ubuntu repo is my actual Nvidia driver:
root@localhost:/opt/# apt info nvidia-driver-535
Package: nvidia-driver-535
Version: 535.154.05-0ubuntu0.20.04.1
Priority: optional
Section: restricted/libs
Source: nvidia-graphics-drivers-535
Origin: Ubuntu
Maintainer: Ubuntu Core Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Support: PB
Phased-Update-Percentage: 30
Download-Size: 483 kB
APT-Manual-Installed: yes
APT-Sources: http://au.archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages
Description: NVIDIA driver metapackage
This metapackage depends on the NVIDIA binary driver and on all of its libraries,
to provide hardware acceleration for OpenGL/GLX/EGL/GLES/Vulkan
applications on either X11 or on Wayland.
I can try the direct download and see if that makes a difference.
Yeah try removing the driver and installing the latest from Nvidia: https://www.nvidia.com/Download/index.aspx
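The usual sequence is something like this (the .run filename is just an example, use whatever the download page gives you):
sudo apt purge 'nvidia-*' 'libnvidia-*'
sudo reboot
# then run the installer downloaded from the Nvidia site:
sudo sh ./NVIDIA-Linux-x86_64-535.154.05.run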
Really hoped it would fix it.
I've purged the Nvidia drivers, stopped and removed the Docker containers, then downloaded and installed version 535.129.03 successfully from the official Nvidia site.
Rebooted to confirm it's still seen in Ubuntu. Did a local hashcat benchmark, which worked perfectly. Reinstalled the containers and composed them afterwards.
The correct driver is showing:
Thu Jan 25 06:04:35 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03 Driver Version: 535.129.03 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce GTX 970 Off | 00000000:01:00.0 Off | N/A |
| 50% 32C P8 14W / 250W | 15MiB / 4096MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce GTX 970 Off | 00000000:02:00.0 Off | N/A |
| 51% 29C P8 14W / 250W | 6MiB / 4096MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 2 NVIDIA GeForce GTX 970 Off | 00000000:0B:00.0 Off | N/A |
| 50% 26C P8 14W / 250W | 6MiB / 4096MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
+---------------------------------------------------------------------------------------+
But when i try the benchmark:
root@localhost:/opt# sudo docker exec -it crackq /opt/crackq/build/pyhashcat/pyhashcat/hashcat/hashcat -b -m 1000
hashcat (v6.2.1) starting in benchmark mode...
Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.
cuInit(): forward compatibility was attempted on non supported HW
clGetPlatformIDs(): CL_PLATFORM_NOT_FOUND_KHR
ATTENTION! No OpenCL-compatible or CUDA-compatible platform found.
You are probably missing the OpenCL or CUDA runtime installation.
* AMD GPUs on Linux require this driver:
"RadeonOpenCompute (ROCm)" Software Platform (3.1 or later)
* Intel CPUs require this runtime:
"OpenCL Runtime for Intel Core and Intel Xeon Processors" (16.1.1 or later)
* NVIDIA GPUs require this runtime and/or driver (both):
"NVIDIA Driver" (440.64 or later)
"CUDA Toolkit" (9.0 or later)
Started: Thu Jan 25 06:04:44 2024
Stopped: Thu Jan 25 06:04:44 2024
When run locally, it works:
root@localhost:/opt# hashcat -b -m 100
hashcat (v5.1.0) starting in benchmark mode...
Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.
* Device #1: WARNING! Kernel exec timeout is not disabled.
This may cause "CL_OUT_OF_RESOURCES" or related errors.
To disable the timeout, see: https://hashcat.net/q/timeoutpatch
* Device #2: WARNING! Kernel exec timeout is not disabled.
This may cause "CL_OUT_OF_RESOURCES" or related errors.
To disable the timeout, see: https://hashcat.net/q/timeoutpatch
* Device #3: WARNING! Kernel exec timeout is not disabled.
This may cause "CL_OUT_OF_RESOURCES" or related errors.
To disable the timeout, see: https://hashcat.net/q/timeoutpatch
OpenCL Platform #1: NVIDIA Corporation
======================================
* Device #1: NVIDIA GeForce GTX 970, 1009/4036 MB allocatable, 13MCU
* Device #2: NVIDIA GeForce GTX 970, 1009/4036 MB allocatable, 13MCU
* Device #3: NVIDIA GeForce GTX 970, 1009/4036 MB allocatable, 13MCU
Checking drivers in container:
root@localhost:/opt# sudo docker exec -it crackq nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Nov__3_17:16:49_PDT_2023
Cuda compilation tools, release 12.3, V12.3.103
Build cuda_12.3.r12.3/compiler.33492891_0
It's a shame I can't reproduce this to debug myself; I've tried 3 different Nvidia GPU models in AWS, which all work fine. So I think it's specific to GTX/RTX models, but it would be good to know if anyone else has been able to get these models working with the latest version of crackq. If you're willing to try, since it's an issue on the Nvidia/docker side, my next step would be trying docker images different to the one I've set up.
So, within this file: ./docker/nvidia/ubuntu/Dockerfile
On line 1, replace the image name with one of the following, rebuild, and test; try the others if no good (a rebuild sketch follows the list):
FROM nvidia/cuda:12.2.2-devel-ubuntu20.04
FROM nvidia/cuda:12.3.1-devel-ubuntu22.04
FROM nvidia/cuda:12.3.1-runtime-ubuntu22.04
FROM nvidia/cuda:12.2.2-runtime-ubuntu20.04
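For each candidate, the rebuild would be roughly (using the first image as the example):
sed -i '1s|^FROM .*|FROM nvidia/cuda:12.2.2-devel-ubuntu20.04|' ./docker/nvidia/ubuntu/Dockerfile
sudo ./install.sh docker/nvidia/ubuntu
sudo docker compose -f docker-compose.nvidia.yml up --build
sudo docker exec -it crackq /opt/crackq/build/pyhashcat/pyhashcat/hashcat/hashcat -b -m 1000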
Hey, so I had some time to debug today following your recommendations. I'm happy to say that you were spot on 🥇. Once I swapped over to nvidia/cuda:12.2.2-devel-ubuntu20.04, everything worked.
root@localhost:/opt/# sudo docker exec -it crackq /opt/crackq/build/pyhashcat/pyhashcat/hashcat/hashcat -b -m 1000
hashcat (v6.2.1) starting in benchmark mode...
Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.
clGetPlatformIDs(): CL_PLATFORM_NOT_FOUND_KHR
* Device #1: WARNING! Kernel exec timeout is not disabled.
This may cause "CL_OUT_OF_RESOURCES" or related errors.
To disable the timeout, see: https://hashcat.net/q/timeoutpatch
* Device #2: WARNING! Kernel exec timeout is not disabled.
This may cause "CL_OUT_OF_RESOURCES" or related errors.
To disable the timeout, see: https://hashcat.net/q/timeoutpatch
* Device #3: WARNING! Kernel exec timeout is not disabled.
This may cause "CL_OUT_OF_RESOURCES" or related errors.
To disable the timeout, see: https://hashcat.net/q/timeoutpatch
CUDA API (CUDA 12.2)
====================
* Device #1: NVIDIA GeForce GTX 970, 3961/4036 MB, 13MCU
* Device #2: NVIDIA GeForce GTX 970, 3969/4036 MB, 13MCU
* Device #3: NVIDIA GeForce GTX 970, 3969/4036 MB, 13MCU
Benchmark relevant options:
===========================
* --optimized-kernel-enable
Hashmode: 1000 - NTLM
Checked it in the GUI: [screenshot attached]
Thank you so much, I'm glad to be able to mark this as closed.
That's awesome, thanks for taking the time to help me debug this. I'll roll back the image version and push an update soon.
Describe the bug: Hey, when creating a new job using 0.1.2 and Ubuntu + Nvidia, the job is scheduled but fails immediately afterwards. A look at the console error message lists the "Speed check failed" error entry. This happens both with and without brain enabled. I've tried launching new and previously cracked jobs. The issue doesn't seem to affect OpenCL, which works fine, so I'm thinking it's potentially an issue related to hashcat/Nvidia. The issue mostly seems to come from run_hashcat.py, I think.
[Screenshots attached: Version, Running instances, Nvidia SMI, Crash Error - Brain Enabled, Crash Error - Brain Disabled]
To Reproduce: Steps to reproduce the behavior:
Expected behavior: I would really like it to complete the job successfully, even with a temporary solution.
Screenshots: The error message in the console is listed as: [screenshot attached]
Additional context: This was an upgrade from 0.1.0 -> 0.1.1 -> 0.1.2. 0.1.1 never worked for me due to the library issues.