f0cker / crackq

CrackQ: A Python Hashcat cracking queue system
MIT License
922 stars 101 forks

Error with Created Jobs Nvidia #43

Closed CarbonTrinity closed 9 months ago

CarbonTrinity commented 9 months ago

Describe the bug
Hey, when creating a new job using 0.1.2 on Ubuntu with NVIDIA, the job is scheduled but fails immediately afterwards. A look at the console lists a "Speed check failed" error entry. This happens both with and without brain enabled. I've tried launching new and previously cracked jobs. The issue doesn't seem to affect OpenCL, which works fine, so I'm thinking it's potentially an issue related to hashcat/NVIDIA. The error mostly seems to come from run_hashcat.py, I think.

Version

root@localhost:/opt/crackq-0.1.2# uname -a
Linux localhost 5.15.0-91-generic #101~20.04.1-Ubuntu SMP Thu Nov 16 14:22:28 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Running instances

root@localhost:/opt/crackq-0.1.2/build# docker ps
CONTAINER ID   IMAGE           COMMAND                  CREATED          STATUS          PORTS                                           NAMES
50c58e6fa8af   nginx-crackq    "/docker-entrypoint.…"   28 seconds ago   Up 27 seconds   80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp   nginx
34c0de1d67a7   nvidia-ubuntu   "/opt/nvidia/nvidia_…"   29 seconds ago   Up 28 seconds   6379/tcp, 8081/tcp, 127.0.0.1:8080->8080/tcp    crackq
e734a0eb4655   redis:latest    "docker-entrypoint.s…"   29 seconds ago   Up 28 seconds   127.0.0.1:6379->6379/tcp

Nvidia SMI


root@localhost:/opt/crackq-0.1.2/build# nvidia-smi
Wed Jan 17 11:27:05 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.223.02   Driver Version: 470.223.02   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 50%   31C    P8    14W / 250W |     14MiB /  4043MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:02:00.0 Off |                  N/A |
| 51%   28C    P8    14W / 250W |      5MiB /  4043MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  Off  | 00000000:0B:00.0 Off |                  N/A |
| 50%   25C    P8    13W / 250W |      5MiB /  4043MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1083      G   /usr/lib/xorg/Xorg                  7MiB |
|    0   N/A  N/A      1693      G   /usr/bin/gnome-shell                2MiB |
|    1   N/A  N/A      1083      G   /usr/lib/xorg/Xorg                  3MiB |
|    2   N/A  N/A      1083      G   /usr/lib/xorg/Xorg                  3MiB |
+-----------------------------------------------------------------------------+

Crash Error - Brain Enabled

root@localhost:/opt/crackq-0.1.2/build# screen -r PW
crackq    | 00:30:20 speed_check: crackq.run_hashcat.show_speed(attack_mode=0, brain=True, hash_file='/var/crackq/logs/4108b8eab7ef475093bd9f8e4190a9a4.hashes', hash_mode=100, mask='?a?a?a?a?a?a', name='Example4', pot_path='/var/crackq/logs/crackq.pot', session='4108b8eab7ef475093bd9f8e4190a9a4', speed_session='4108b8eab7ef475093bd9f8e4190a9a4_speed', username=False, wordlist2=None, wordlist='/var/crackq/files/rockyou.txt.gz') (4108b8eab7ef475093bd9f8e4190a9a4_speed)
crackq    | INFO     conf.py:18 hc_conf 2024-01-17 00:30:21,096 Reading from config file /var/crackq/files/crackq.conf
crackq    | [nltk_data] Downloading package wordnet to /home/crackq/nltk_data...
crackq    | [nltk_data]   Package wordnet is already up-to-date!
crackq    | INFO     conf.py:18 hc_conf 2024-01-17 00:30:21,272 Reading from config file /var/crackq/files/crackq.conf
crackq    | INFO     conf.py:18 hc_conf 2024-01-17 00:30:21,307 Reading from config file /var/crackq/files/crackq.conf
crackq    | INFO     schemas.py:1260 include_schema 2024-01-17 00:30:21,557 Resource 'XMLSchema.xsd' is already loaded
crackq    | INFO     conf.py:18 hc_conf 2024-01-17 00:30:23,147 Reading from config file /var/crackq/files/crackq.conf
crackq    | INFO     run_hashcat.py:116 runner 2024-01-17 00:30:23,168 Running hashcat
crackq    | INFO     crackqueue.py:47 q_add 2024-01-17 00:30:23,513 Adding task to job queue: 4108b8eab7ef475093bd9f8e4190a9a4
crackq    | 00:30:23 default: crackq.run_hashcat.hc_worker(attack_mode=0, brain=True, hash_file='/var/crackq/logs/4108b8eab7ef475093bd9f8e4190a9a4.hashes', hash_mode=100, increment=False, increment_max=None, increment_min=None, mask=None, mask_file=False, name='Example4', outfile='/var/crackq/logs/4108b8eab7ef475093bd9f8e4190a9a4.cracked', pot_path='/var/crackq/logs/crackq.pot', potcheck=False, restore=0, rules=[], session='4108b8eab7ef475093bd9f8e4190a9a4', username=False, wordlist2=None, wordlist='/var/crackq/files/rockyou.txt.gz') (4108b8eab7ef475093bd9f8e4190a9a4)
redis     | 1:M 17 Jan 2024 00:30:23.528 * 100 changes in 300 seconds. Saving...
redis     | 1:M 17 Jan 2024 00:30:23.529 * Background saving started by pid 21
nginx     | 10.255.255.10 - - [17/Jan/2024:00:30:23 +0000] "POST /api/add HTTP/1.1" 202 32 "https://192.168.1.1/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0" "-"
redis     | 21:C 17 Jan 2024 00:30:23.541 * DB saved on disk
redis     | 21:C 17 Jan 2024 00:30:23.543 * Fork CoW for RDB: current 0 MB, peak 0 MB, average 0 MB
redis     | 1:M 17 Jan 2024 00:30:23.630 * Background saving terminated with success
nginx     | 10.255.255.10 - - [17/Jan/2024:00:30:23 +0000] "GET /api/queuing/all HTTP/1.1" 200 375 "https://192.168.1.1/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0" "-"
crackq    | INFO     conf.py:18 hc_conf 2024-01-17 00:30:24,141 Reading from config file /var/crackq/files/crackq.conf
crackq    | INFO     run_hashcat.py:116 runner 2024-01-17 00:30:24,171 Running hashcat
crackq    | [nltk_data] Downloading package wordnet to /home/crackq/nltk_data...
crackq    | [nltk_data]   Package wordnet is already up-to-date!
crackq    | INFO     conf.py:18 hc_conf 2024-01-17 00:30:24,325 Reading from config file /var/crackq/files/crackq.conf
crackq    | INFO     conf.py:18 hc_conf 2024-01-17 00:30:24,359 Reading from config file /var/crackq/files/crackq.conf
crackq    | INFO     schemas.py:1260 include_schema 2024-01-17 00:30:24,610 Resource 'XMLSchema.xsd' is already loaded
crackq    | INFO     conf.py:18 hc_conf 2024-01-17 00:30:26,198 Reading from config file /var/crackq/files/crackq.conf
crackq    | INFO     run_hashcat.py:116 runner 2024-01-17 00:30:26,215 Running hashcat
crackq    | ERROR    run_hashcat.py:188 runner 2024-01-17 00:30:26,253 Speed check failed: RuntimeError
crackq    |
crackq    | The above exception was the direct cause of the following exception:
crackq    |
crackq    | Traceback (most recent call last):
crackq    |   File "/usr/local/lib/python3.8/dist-packages/rq/worker.py", line 1418, in perform_job
crackq    |     rv = job.perform()
crackq    |   File "/usr/local/lib/python3.8/dist-packages/rq/job.py", line 1222, in perform
crackq    |     self._result = self._execute()
crackq    |   File "/usr/local/lib/python3.8/dist-packages/rq/job.py", line 1259, in _execute
crackq    |     result = self.func(*self.args, **self.kwargs)
crackq    |   File "/opt/crackq/build/crackq/run_hashcat.py", line 988, in show_speed
crackq    |     hcat = runner(hash_file=hash_file, mask=mask,
crackq    |   File "/opt/crackq/build/crackq/run_hashcat.py", line 159, in runner
crackq    |     hc.hashcat_session_execute()
crackq    | SystemError: <method 'hashcat_session_execute' of 'pyhashcat.hashcat' objects> returned a result with an error set

Crash Error - Brain Disabled

crackq    | INFO     conf.py:18 hc_conf 2024-01-17 00:32:00,424 Reading from config file /var/crackq/files/crackq.conf
crackq    | [nltk_data] Downloading package wordnet to /home/crackq/nltk_data...
crackq    | INFO     run_hashcat.py:116 runner 2024-01-17 00:32:00,468 Running hashcat
crackq    | [nltk_data]   Package wordnet is already up-to-date!
crackq    | INFO     conf.py:18 hc_conf 2024-01-17 00:32:00,610 Reading from config file /var/crackq/files/crackq.conf
crackq    | INFO     conf.py:18 hc_conf 2024-01-17 00:32:00,646 Reading from config file /var/crackq/files/crackq.conf
crackq    | INFO     schemas.py:1260 include_schema 2024-01-17 00:32:00,903 Resource 'XMLSchema.xsd' is already loaded
crackq    | INFO     conf.py:18 hc_conf 2024-01-17 00:32:02,478 Reading from config file /var/crackq/files/crackq.conf
crackq    | INFO     run_hashcat.py:116 runner 2024-01-17 00:32:02,498 Running hashcat
crackq    | ERROR    run_hashcat.py:188 runner 2024-01-17 00:32:02,537 Speed check failed: RuntimeError
crackq    |
crackq    | The above exception was the direct cause of the following exception:
crackq    |
crackq    | Traceback (most recent call last):
crackq    |   File "/usr/local/lib/python3.8/dist-packages/rq/worker.py", line 1418, in perform_job
crackq    |     rv = job.perform()
crackq    |   File "/usr/local/lib/python3.8/dist-packages/rq/job.py", line 1222, in perform
crackq    |     self._result = self._execute()
crackq    |   File "/usr/local/lib/python3.8/dist-packages/rq/job.py", line 1259, in _execute
crackq    |     result = self.func(*self.args, **self.kwargs)
crackq    |   File "/opt/crackq/build/crackq/run_hashcat.py", line 988, in show_speed
crackq    |     hcat = runner(hash_file=hash_file, mask=mask,
crackq    |   File "/opt/crackq/build/crackq/run_hashcat.py", line 159, in runner
crackq    |     hc.hashcat_session_execute()
crackq    | SystemError: <method 'hashcat_session_execute' of 'pyhashcat.hashcat' objects> returned a result with an error set

To Reproduce
Steps to reproduce the behavior:

  1. Log in to the web UI and queue a new job. [Note: I've tried both new and previously cracked jobs]
  2. Enable/disable brain [the error message is the same]
  3. Submit the request
  4. Job is scheduled
  5. See the error on the back end / in the failed jobs list.

Expected behavior
I would really like the job to complete successfully, even with a temporary workaround.

Screenshots
The console error message is shown in the logs above.

Additional context
This was an upgrade from 0.1.0 -> 0.1.1 -> 0.1.2; 0.1.1 never worked for me due to library issues.

f0cker commented 9 months ago

Thanks for reporting this. What command are you using to run Compose? It looks like an NVIDIA runtime issue, so it's not seeing the cards within the Docker containers. I updated the install steps on the wiki; did you reinstall Docker/Compose/the NVIDIA runtime?
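
For reference, a quick way to confirm Docker has the NVIDIA runtime registered on the host is sketched below (the daemon.json path assumes a default Docker install, and the image tag used for the test is just an example):

# Check that Docker lists the nvidia runtime
sudo docker info | grep -i runtime

# Inspect the daemon config that registers the runtime (path assumes a default install)
cat /etc/docker/daemon.json

# Quick GPU visibility test with a plain CUDA image
sudo docker run --rm --gpus all nvidia/cuda:12.3.1-devel-ubuntu20.04 nvidia-smi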

f0cker commented 9 months ago

While the containers are up, you can run the following to check that the crackq container can see the GPU cards:

sudo docker exec -it crackq nvidia-smi

sudo docker exec -it crackq /opt/crackq/build/pyhashcat/pyhashcat/hashcat/hashcat -b -m 1000

CarbonTrinity commented 9 months ago

Yes, I went through all the steps again, just to double-check I didn't miss something obvious. I've rebuilt and rerun, but it still seems to be failing. I'm still new to Docker, so the issue is probably my own fault. I've added the steps I tested today (the error message is at step 7):

Steps

1) Reinstall Docker following "Install Docker CE on Ubuntu": https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository
2) Reinstall the Ubuntu NVIDIA runtime -> no errors here
3) Run the install script: sudo ./install.sh docker/nvidia/ubuntu
4) Change the secret key: python3 -c 'import secrets; print(secrets.token_urlsafe())'
5) Relaunch Compose: sudo docker compose -f docker-compose.nvidia.yml up --build
   Note: previously I used sudo docker-compose -f docker-compose.nvidia.yml up --build as per the wiki

6) Check Nvidia-smi

sudo docker exec -it crackq nvidia-smi

Wed Jan 17 23:21:08 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.223.02   Driver Version: 470.223.02   CUDA Version: 12.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 50%   33C    P8    14W / 250W |     14MiB /  4043MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:02:00.0 Off |                  N/A |
| 51%   30C    P8    14W / 250W |      6MiB /  4043MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  Off  | 00000000:0B:00.0 Off |                  N/A |
| 50%   26C    P8    13W / 250W |      6MiB /  4043MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
7) Got an error when trying to do the benchmark:

root@localhost-new:/opt/crackq-0.1.2# sudo docker exec -it crackq /opt/crackq/build/pyhashcat/pyhashcat/hashcat/hashcat -b -m 1000
hashcat (v6.2.1) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

cuInit(): forward compatibility was attempted on non supported HW

clGetPlatformIDs(): CL_PLATFORM_NOT_FOUND_KHR

ATTENTION! No OpenCL-compatible or CUDA-compatible platform found.

You are probably missing the OpenCL or CUDA runtime installation.

Started: Wed Jan 17 23:22:40 2024
Stopped: Wed Jan 17 23:22:40 2024


All the containers are still up though, and in the previous command I can see both CUDA and the NVIDIA driver detected.
I also tried updating the drivers:

root@localhost-new:/opt/crackq-0.1.2# sudo docker exec -it crackq nvidia-smi
Thu Jan 18 00:16:31 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 50%   32C    P8    14W / 250W |     15MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:02:00.0 Off |                  N/A |
| 51%   29C    P8    14W / 250W |      6MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  Off  | 00000000:0B:00.0 Off |                  N/A |
| 50%   26C    P8    14W / 250W |      6MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+


Same error.

At this point, I tried installing the OpenCL drivers per the installation instructions, as well as the NVIDIA workaround from https://github.com/f0cker/crackq/wiki/Install-on-Ubuntu. However, the error remains.
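
For reference, since the container image installs clinfo, the OpenCL side can also be sanity checked from the host; a minimal sketch (output varies depending on which ICDs are present in the image):

# List OpenCL platforms/devices visible inside the crackq container
sudo docker exec -it crackq clinfo

# And the NVIDIA libraries the runtime injected
sudo docker exec -it crackq sh -c 'ldconfig -p | grep -i nvidia'
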
f0cker commented 9 months ago

Did you remove the old docker containers/images? I have the following confirmed working on a test AWS box:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A10G                    Off | 00000000:00:1E.0 Off |                    0 |
|  0%   27C    P0              53W / 300W |      4MiB / 23028MiB |      4%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

Maybe try reinstalling now that the drivers are updated?

Run this beforehand: sudo docker system prune --all

Note: this will wipe all containers and images, so remove them individually if you have other containers on your server ;)
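
If you'd rather not wipe everything, a per-item sketch of the same cleanup (container/image names are taken from the docker ps output above; the redis container can be removed by its ID):

# List everything, including stopped containers
sudo docker ps -a
sudo docker images

# Remove just the CrackQ-related containers and images
sudo docker rm -f crackq nginx
sudo docker rmi nvidia-ubuntu nginx-crackq redis:latest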

CarbonTrinity commented 9 months ago

Yeah, I removed all containers and even tried rebooting along the way. I've added my steps in case something obvious stands out. Hopefully it's still fixable, otherwise I might just reinstall Ubuntu.

1) Removing everything

root@localhost:/opt/crackq-0.1.2# sudo docker system prune --all

root@localhost:/opt/crackq-0.1.2# docker ps -a
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

root@localhost:/opt/crackq-0.1.2# reboot

2) After cleaning up and rebooting:

root@localhost:/opt# docker ps -a
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

root@localhost:/opt/crackq-0.1.2# nvidia-smi
Fri Jan 19 09:25:13 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.146.02             Driver Version: 535.146.02   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 970         Off | 00000000:01:00.0 Off |                  N/A |
| 50%   32C    P8              14W / 250W |     15MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce GTX 970         Off | 00000000:02:00.0 Off |                  N/A |
| 51%   30C    P8              14W / 250W |      6MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce GTX 970         Off | 00000000:0B:00.0 Off |                  N/A |
| 50%   26C    P8              14W / 250W |      6MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1099      G   /usr/lib/xorg/Xorg                            7MiB |
|    0   N/A  N/A      1711      G   /usr/bin/gnome-shell                          3MiB |
|    1   N/A  N/A      1099      G   /usr/lib/xorg/Xorg                            3MiB |
|    2   N/A  N/A      1099      G   /usr/lib/xorg/Xorg                            3MiB |
+---------------------------------------------------------------------------------------+

root@localhost:/opt# git clone https://github.com/f0cker/crackq.git

3) Reinstalling:

root@localhost:/opt/crackq-0.1.2# sudo ./install.sh docker/nvidia/ubuntu

[+] Building 584.5s (13/13) FINISHED                                                                                                                                           docker:default
 => [internal] load build definition from Dockerfile                                                                                                                                     0.0s
 => => transferring dockerfile: 1.07kB                                                                                                                                                   0.0s
 => [internal] load .dockerignore                                                                                                                                                        0.0s
 => => transferring context: 2B                                                                                                                                                          0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:12.3.1-devel-ubuntu20.04                                                                                                          2.7s
 => [1/8] FROM docker.io/nvidia/cuda:12.3.1-devel-ubuntu20.04@sha256:befbdfddbb52727f9ce8d0c574cac0f631c606b1e6f0e523f3a0777fe2720c99                                                  401.6s
 => => resolve docker.io/nvidia/cuda:12.3.1-devel-ubuntu20.04@sha256:befbdfddbb52727f9ce8d0c574cac0f631c606b1e6f0e523f3a0777fe2720c99                                                    0.0s
 => => sha256:dc1dc0eb4de2ccea0627484435c928be534061f8f9352cc9b120329377144c43 2.63kB / 2.63kB                                                                                           0.0s
 => => sha256:25ad149ed3cff49ddb57ceb4418377f63c897198de1f9de7a24506397822de3e 27.51MB / 27.51MB                                                                                        14.7s
 => => sha256:befbdfddbb52727f9ce8d0c574cac0f631c606b1e6f0e523f3a0777fe2720c99 743B / 743B                                                                                               0.0s
 => => sha256:ba7b66a9df40b8a1c1a41d58d7c3beaf33a50dc842190cd6a2b66e6f44c3b57b 7.94MB / 7.94MB                                                                                           2.7s
 => => sha256:9ef37be4ff597ee885111c1dcd64d3a327ffb36b4e6c34d00bf0f81261c6affe 18.35kB / 18.35kB                                                                                         0.0s
 => => sha256:520797292d9250932259d95f471bef1f97712030c1d364f3f297260e5fee1de8 57.07MB / 57.07MB                                                                                        11.0s
 => => sha256:c5f2ffd06d8b1667c198d4f9a780b55c86065341328ab4f59d60dc996ccd5817 185B / 185B                                                                                               3.2s
 => => sha256:1698c67699a3eee2a8fc185093664034bb69ab67c545ab6d976399d5500b2f44 6.88kB / 6.88kB                                                                                           3.4s
 => => sha256:16dd7c0d35aac769895e18ff9096f7b348125aa75713fdf8cb6ba13e64421c42 1.28GB / 1.28GB                                                                                         264.1s
 => => sha256:568cac1e538c782a9d9a813e11a1a42cc8e0bdaf9fa0d0671b84bd465e030418 62.53kB / 62.53kB                                                                                        11.6s
 => => sha256:6252d19a7f1df1a9962d3a9efe147d15716f1312b1d5cea4bdf62bad52875c47 1.68kB / 1.68kB                                                                                          12.1s
 => => sha256:f573e2686be4e101596edd9b23b30b629e99c184d588c4d1b4976d5d87c031a4 1.52kB / 1.52kB                                                                                          12.7s
 => => sha256:0074e75104ac1156e1ee8adaba4f241074bad55133c49c148baa528b16dca39a 2.54GB / 2.54GB                                                                                         340.2s
 => => sha256:df35fae9e247886347e01c7d57f33cc053bb58989ef8b42147191d2659d18276 86.30kB / 86.30kB                                                                                        15.2s
 => => extracting sha256:25ad149ed3cff49ddb57ceb4418377f63c897198de1f9de7a24506397822de3e                                                                                                1.6s
 => => extracting sha256:ba7b66a9df40b8a1c1a41d58d7c3beaf33a50dc842190cd6a2b66e6f44c3b57b                                                                                                0.4s
 => => extracting sha256:520797292d9250932259d95f471bef1f97712030c1d364f3f297260e5fee1de8                                                                                                1.9s
 => => extracting sha256:c5f2ffd06d8b1667c198d4f9a780b55c86065341328ab4f59d60dc996ccd5817                                                                                                0.0s
 => => extracting sha256:1698c67699a3eee2a8fc185093664034bb69ab67c545ab6d976399d5500b2f44                                                                                                0.0s
 => => extracting sha256:16dd7c0d35aac769895e18ff9096f7b348125aa75713fdf8cb6ba13e64421c42                                                                                               34.4s
 => => extracting sha256:568cac1e538c782a9d9a813e11a1a42cc8e0bdaf9fa0d0671b84bd465e030418                                                                                                0.1s
 => => extracting sha256:6252d19a7f1df1a9962d3a9efe147d15716f1312b1d5cea4bdf62bad52875c47                                                                                                0.0s
 => => extracting sha256:f573e2686be4e101596edd9b23b30b629e99c184d588c4d1b4976d5d87c031a4                                                                                                0.0s
 => => extracting sha256:0074e75104ac1156e1ee8adaba4f241074bad55133c49c148baa528b16dca39a                                                                                               60.9s
 => => extracting sha256:df35fae9e247886347e01c7d57f33cc053bb58989ef8b42147191d2659d18276                                                                                                0.0s
 => [internal] load build context                                                                                                                                                        0.0s
 => => transferring context: 5.45kB                                                                                                                                                      0.0s
 => [2/8] RUN apt-get update -q && apt-get install --no-install-recommends -yq wget unzip clinfo libminizip-dev     && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*   14.7s
 => [3/8] RUN apt-get update &&     apt-get install -y wget p7zip gcc g++ make build-essential git libcurl4-openssl-dev libssl-dev zlib1g-dev python3.8     python3.8-dev python3-pip   29.0s
 => [4/8] COPY . /opt/crackq/build                                                                                                                                                       0.1s
 => [5/8] WORKDIR /opt/crackq/build                                                                                                                                                      0.1s
 => [6/8] RUN "/opt/crackq/build"/setup.sh                                                                                                                                             122.2s
 => [7/8] RUN chown -R 1111:1111 "/opt/crackq/build"/                                                                                                                                    9.9s
 => [8/8] WORKDIR /opt/crackq/build/                                                                                                                                                     0.1s
 => exporting to image                                                                                                                                                                   4.0s
 => => exporting layers                                                                                                                                                                  4.0s
 => => writing image sha256:502b0a321e0dc6543af7719d380d3f15f684c99e6947e7d54c434470a1e92e4c                                                                                             0.0s
 => => naming to docker.io/library/nvidia-crackq                                                                                                                                         0.0s

4) Made the CrackQ secret key change and launched the instance:

root@localhost:/opt/crackq# sudo docker compose -f docker-compose.nvidia.yml up --build
WARN[0000] The "MAIL_USERNAME" variable is not set. Defaulting to a blank string.
WARN[0000] The "MAIL_PASSWORD" variable is not set. Defaulting to a blank string.
[+] Running 3/9
 ⠼ redis 8 layers [⣄⣿⣿⣿⡀⠀⠀⠀] 14.09MB/49.96MB Pulling                                                                                                                                     5.4s
   ⠙ 2f44b7a888fa Downloading [===================>                               ]  11.32MB/29.13M

root@localhost:/opt/crackq-0.1.2# docker ps
CONTAINER ID   IMAGE           COMMAND                  CREATED              STATUS              PORTS                                           NAMES
4ca7de38aff8   nginx-crackq    "/docker-entrypoint.…"   About a minute ago   Up About a minute   80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp   nginx
e7f37f89c7ba   nvidia-ubuntu   "/opt/nvidia/nvidia_…"   About a minute ago   Up About a minute   6379/tcp, 8081/tcp, 127.0.0.1:8080->8080/tcp    crackq
18ba32ad8c6c   redis:latest    "docker-entrypoint.s…"   About a minute ago   Up About a minute   127.0.0.1:6379->6379/tcp  

5) Checking Nvidia-SMI for the container:

root@localhost:/opt/crackq-0.1.2# sudo docker exec -it crackq nvidia-smi
Thu Jan 18 22:41:24 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.146.02             Driver Version: 535.146.02   CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 970         Off | 00000000:01:00.0 Off |                  N/A |
| 50%   32C    P8              14W / 250W |     15MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce GTX 970         Off | 00000000:02:00.0 Off |                  N/A |
| 51%   29C    P8              14W / 250W |      6MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce GTX 970         Off | 00000000:0B:00.0 Off |                  N/A |
| 51%   25C    P8              14W / 250W |      6MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

6) When doing the benchmark, it fails again, similar to before:

root@localhost:/opt/crackq-0.1.2# sudo docker exec -it crackq /opt/crackq/build/pyhashcat/pyhashcat/hashcat/hashcat -b -m 1000
hashcat (v6.2.1) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

cuInit(): forward compatibility was attempted on non supported HW

clGetPlatformIDs(): CL_PLATFORM_NOT_FOUND_KHR

ATTENTION! No OpenCL-compatible or CUDA-compatible platform found.

You are probably missing the OpenCL or CUDA runtime installation.

* AMD GPUs on Linux require this driver:
  "RadeonOpenCompute (ROCm)" Software Platform (3.1 or later)
* Intel CPUs require this runtime:
  "OpenCL Runtime for Intel Core and Intel Xeon Processors" (16.1.1 or later)
* NVIDIA GPUs require this runtime and/or driver (both):
  "NVIDIA Driver" (440.64 or later)
  "CUDA Toolkit" (9.0 or later)

Started: Thu Jan 18 22:41:49 2024
Stopped: Thu Jan 18 22:41:49 2024

7) Checking the CUDA compiler version on the host and in the container:

root@localhost:/tmp# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
root@localhost:/tmp# sudo docker exec -it crackq nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Nov__3_17:16:49_PDT_2023
Cuda compilation tools, release 12.3, V12.3.103
Build cuda_12.3.r12.3/compiler.33492891_0

I really can't figure out what I'm missing. Any idea what else I can try before going down the OS reinstall route?

f0cker commented 9 months ago

I don't think you need to reinstall the OS. It's something around this error: cuInit(): forward compatibility was attempted on non supported HW
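
That message typically means the CUDA forward-compatibility libraries inside the container are being picked up on a GPU/driver combination that doesn't support them. A couple of hedged checks from the host (the compat path is typical for nvidia/cuda devel images but may not exist in every tag):

# See which libcuda the container ends up resolving
sudo docker exec -it crackq sh -c 'ldconfig -p | grep libcuda'

# Check whether the image ships CUDA forward-compat libraries
sudo docker exec -it crackq sh -c 'ls /usr/local/cuda/compat/ 2>/dev/null || echo "no compat dir"'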

I'll do some digging when I have time, but in the meantime here's all the version info from my test box:

sudo docker --version
Docker version 24.0.7, build afdd53b

Host driver version:

nvidia-smi
Mon Jan 22 10:20:52 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+

Container driver version:

sudo docker exec -it crackq nvidia-smi
Mon Jan 22 10:19:13 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+

nvidia-container-runtime --version
NVIDIA Container Runtime version 1.13.5
commit: 6b8589dcb4dead72ab64f14a5912886e6165c079
spec: 1.1.0-rc.2

runc version 1.1.10
commit: v1.1.10-0-g18a0cb0
spec: 1.0.2-dev
go: go1.20.12
libseccomp: 2.5.3

nvidia-container-toolkit -version
NVIDIA Container Runtime Hook version 1.13.5
commit: 6b8589dcb4dead72ab64f14a5912886e6165c079
f0cker commented 9 months ago

Just checking, have you rebooted since updating the drivers? :D

f0cker commented 9 months ago

Where did you install the NVIDIA driver from, direct download or from a repo? The latest version should be 535.154.05 for RTX, I believe.

CarbonTrinity commented 9 months ago

Too many times :D. I rebooted again just now to check, but it is still the same.

Looking at the installs, everything except my NVIDIA driver comes from the NVIDIA GitHub repo. Hashcat is working locally using that driver, though.

nvidia-container-toolkit

root@localhost:/opt# apt info nvidia-container-toolkit
Package: nvidia-container-toolkit
Version: 1.13.5-1
Priority: optional
Section: utils
Maintainer: NVIDIA CORPORATION <cudatools@nvidia.com>
Installed-Size: 2,425 kB
Depends: nvidia-container-toolkit-base (= 1.13.5-1), libnvidia-container-tools (>= 1.13.5-1), libnvidia-container-tools (<< 2.0.0), libseccomp2
Breaks: nvidia-container-runtime (<= 3.5.0-1), nvidia-container-runtime-hook
Replaces: nvidia-container-runtime (<= 3.5.0-1), nvidia-container-runtime-hook
Homepage: https://github.com/NVIDIA/nvidia-container-toolkit
Download-Size: 853 kB
APT-Manual-Installed: yes
APT-Sources: https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64  Packages
Description: NVIDIA Container toolkit
 Provides tools and utilities to enable GPU support in containers.

and for Nvidia-Container-Runtime:

root@localhost:/opt# apt info nvidia-container-runtime
Package: nvidia-container-runtime
Version: 3.13.0-1
Priority: optional
Section: utils
Maintainer: NVIDIA CORPORATION <cudatools@nvidia.com>
Installed-Size: 21.5 kB
Depends: nvidia-container-toolkit (>= 1.13.0-1), nvidia-container-toolkit (<< 2.0.0)
Homepage: https://github.com/NVIDIA/nvidia-container-runtime/wiki
Download-Size: 4,988 B
APT-Manual-Installed: yes
APT-Sources: https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64  Packages
Description: NVIDIA container runtime
 Provides a modified version of runc allowing users to run GPU enabled
 containers.

The only package that seems to come from the Ubuntu repo is my actual NVIDIA driver:

root@localhost:/opt/# apt info nvidia-driver-535
Package: nvidia-driver-535
Version: 535.154.05-0ubuntu0.20.04.1
Priority: optional
Section: restricted/libs
Source: nvidia-graphics-drivers-535
Origin: Ubuntu
Maintainer: Ubuntu Core Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Support: PB
Phased-Update-Percentage: 30
Download-Size: 483 kB
APT-Manual-Installed: yes
APT-Sources: http://au.archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages
Description: NVIDIA driver metapackage
 This metapackage depends on the NVIDIA binary driver and on all of its libraries,
 to provide hardware acceleration for OpenGL/GLX/EGL/GLES/Vulkan
 applications on either X11 or on Wayland.

I can try a direct download and see if that makes a difference.

f0cker commented 9 months ago

Yeah try removing the driver and installing the latest from Nvidia: https://www.nvidia.com/Download/index.aspx
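
Roughly, something like this (a hedged sketch; the package name comes from the apt output above and the .run filename is just the version mentioned earlier, so adjust both to what you actually have):

# Remove the repo-packaged driver
sudo apt-get remove --purge nvidia-driver-535
sudo apt-get autoremove

# Stop the graphical session, then run the installer downloaded from nvidia.com
sudo systemctl isolate multi-user.target
sudo sh ./NVIDIA-Linux-x86_64-535.154.05.run
sudo reboot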

CarbonTrinity commented 9 months ago

I really hoped that would fix it.

I've purged the NVIDIA drivers and stopped and removed the Docker containers, then downloaded and installed the 535.129.03 version successfully from the official NVIDIA site.

Rebooted to confirm it's still seen in Ubuntu, and a local hashcat benchmark worked perfectly. Then I reinstalled the containers and composed them afterwards.

The correct driver is showing:

Thu Jan 25 06:04:35 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 970         Off | 00000000:01:00.0 Off |                  N/A |
| 50%   32C    P8              14W / 250W |     15MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce GTX 970         Off | 00000000:02:00.0 Off |                  N/A |
| 51%   29C    P8              14W / 250W |      6MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce GTX 970         Off | 00000000:0B:00.0 Off |                  N/A |
| 50%   26C    P8              14W / 250W |      6MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

But when I try the benchmark:

root@localhost:/opt# sudo docker exec -it crackq /opt/crackq/build/pyhashcat/pyhashcat/hashcat/hashcat -b -m 1000
hashcat (v6.2.1) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

cuInit(): forward compatibility was attempted on non supported HW

clGetPlatformIDs(): CL_PLATFORM_NOT_FOUND_KHR

ATTENTION! No OpenCL-compatible or CUDA-compatible platform found.

You are probably missing the OpenCL or CUDA runtime installation.

* AMD GPUs on Linux require this driver:
  "RadeonOpenCompute (ROCm)" Software Platform (3.1 or later)
* Intel CPUs require this runtime:
  "OpenCL Runtime for Intel Core and Intel Xeon Processors" (16.1.1 or later)
* NVIDIA GPUs require this runtime and/or driver (both):
  "NVIDIA Driver" (440.64 or later)
  "CUDA Toolkit" (9.0 or later)

Started: Thu Jan 25 06:04:44 2024
Stopped: Thu Jan 25 06:04:44 2024

When run locally, it works:

root@localhost:/opt#  hashcat -b -m 100
hashcat (v5.1.0) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

* Device #1: WARNING! Kernel exec timeout is not disabled.
             This may cause "CL_OUT_OF_RESOURCES" or related errors.
             To disable the timeout, see: https://hashcat.net/q/timeoutpatch
* Device #2: WARNING! Kernel exec timeout is not disabled.
             This may cause "CL_OUT_OF_RESOURCES" or related errors.
             To disable the timeout, see: https://hashcat.net/q/timeoutpatch
* Device #3: WARNING! Kernel exec timeout is not disabled.
             This may cause "CL_OUT_OF_RESOURCES" or related errors.
             To disable the timeout, see: https://hashcat.net/q/timeoutpatch

OpenCL Platform #1: NVIDIA Corporation
======================================
* Device #1: NVIDIA GeForce GTX 970, 1009/4036 MB allocatable, 13MCU
* Device #2: NVIDIA GeForce GTX 970, 1009/4036 MB allocatable, 13MCU
* Device #3: NVIDIA GeForce GTX 970, 1009/4036 MB allocatable, 13MCU

Checking the CUDA compiler in the container:

root@localhost:/opt# sudo docker exec -it crackq nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Nov__3_17:16:49_PDT_2023
Cuda compilation tools, release 12.3, V12.3.103
Build cuda_12.3.r12.3/compiler.33492891_0
f0cker commented 9 months ago

It's a shame I can't reproduce this to debug myself; I've tried 3 different NVIDIA GPU models in AWS, which all work fine. So I think it's specific to GTX/RTX models, but it would be good to know if anyone else has been able to get these models working with the latest version of CrackQ. If you're willing to try, since it's an issue on the NVIDIA/Docker side, my next step would be trying different Docker images from the one I've set up.

So, within this file: ./docker/nvidia/ubuntu/Dockerfile

On line 1, replace the image name with one of the following, rebuild, and test; try the others if that's no good:

FROM nvidia/cuda:12.2.2-devel-ubuntu20.04
FROM nvidia/cuda:12.3.1-devel-ubuntu22.04
FROM nvidia/cuda:12.3.1-runtime-ubuntu22.04
FROM nvidia/cuda:12.2.2-runtime-ubuntu20.04
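
For example, to swap in the first candidate and rebuild (a minimal sketch using sed; editing line 1 of the Dockerfile by hand works just as well):

# Swap the base image on line 1 of the NVIDIA Dockerfile
sed -i '1s|^FROM .*|FROM nvidia/cuda:12.2.2-devel-ubuntu20.04|' ./docker/nvidia/ubuntu/Dockerfile

# Rebuild and relaunch the stack
sudo docker compose -f docker-compose.nvidia.yml up --build

If that tag doesn't help, repeat with the next one from the list.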

CarbonTrinity commented 9 months ago

Hey, so I had some time to debug today following your recommendations. I'm happy to say that you were spot on 🥇. Once I swapped over to nvidia/cuda:12.2.2-devel-ubuntu20.04, everything worked.

root@localhost:/opt/# sudo docker exec -it crackq /opt/crackq/build/pyhashcat/pyhashcat/hashcat/hashcat -b -m 1000
hashcat (v6.2.1) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

clGetPlatformIDs(): CL_PLATFORM_NOT_FOUND_KHR

* Device #1: WARNING! Kernel exec timeout is not disabled.
             This may cause "CL_OUT_OF_RESOURCES" or related errors.
             To disable the timeout, see: https://hashcat.net/q/timeoutpatch
* Device #2: WARNING! Kernel exec timeout is not disabled.
             This may cause "CL_OUT_OF_RESOURCES" or related errors.
             To disable the timeout, see: https://hashcat.net/q/timeoutpatch
* Device #3: WARNING! Kernel exec timeout is not disabled.
             This may cause "CL_OUT_OF_RESOURCES" or related errors.
             To disable the timeout, see: https://hashcat.net/q/timeoutpatch
CUDA API (CUDA 12.2)
====================
* Device #1: NVIDIA GeForce GTX 970, 3961/4036 MB, 13MCU
* Device #2: NVIDIA GeForce GTX 970, 3969/4036 MB, 13MCU
* Device #3: NVIDIA GeForce GTX 970, 3969/4036 MB, 13MCU

Benchmark relevant options:
===========================
* --optimized-kernel-enable

Hashmode: 1000 - NTLM

Checked it in the GUI: nice.

Thank you so much, I'm glad to be able to mark this as closed.

f0cker commented 9 months ago

That's awesome, thanks for taking the time to help me debug this. I'll roll back the image version and push an update soon.