trexminer / T-Rex

T-Rex NVIDIA GPU miner with web control monitoring page

nVidia 1060 GTX 3GB ETC mining issue - HiveOS #993

Open 34GL3s opened 2 years ago

34GL3s commented 2 years ago

Hi,

I have an issue mining ETC on some of my 1060 3GB rigs. One rig with the same cards works, but the other has stopped with the error "Can't find nonce with device. Not enough memory". The rig configurations, including software versions (OS version, NVIDIA driver, T-Rex version, ...), are identical. Is anybody else experiencing a similar issue?

Here is miner output:

20220112 19:38:25 GPU #2: intensity 22
20220112 19:38:25 etchash epoch: 238, diff: 5.00 G
20220112 19:38:25 GPU #1: intensity 22
20220112 19:38:26 GPU #5: intensity 22
20220112 19:38:26 GPU #6: intensity 22
20220112 19:38:27 GPU #7: intensity 22
20220112 19:38:27 GPU #4: intensity 22
20220112 19:38:27 GPU #3: intensity 22
20220112 19:38:28 GPU #0: intensity 22
20220112 19:38:32 GPU #6: generating DAG 2.86 GB for epoch 238(476) ...
20220112 19:38:32 GPU #3: generating DAG 2.86 GB for epoch 238(476) ...
20220112 19:38:32 GPU #5: generating DAG 2.86 GB for epoch 238(476) ...
20220112 19:38:32 GPU #7: generating DAG 2.86 GB for epoch 238(476) ...
20220112 19:38:32 GPU #1: generating DAG 2.86 GB for epoch 238(476) ...
20220112 19:38:32 GPU #4: generating DAG 2.86 GB for epoch 238(476) ...
20220112 19:38:32 GPU #0: generating DAG 2.86 GB for epoch 238(476) ...
20220112 19:38:32 GPU #2: generating DAG 2.86 GB for epoch 238(476) ...
20220112 19:38:41 GPU #0: DAG generated [crc: b62dcf9d, time: 9002 ms], memory left: 17.19 MB
20220112 19:38:41 GPU #7: DAG generated [crc: b62dcf9d, time: 9008 ms], memory left: 17.19 MB
20220112 19:38:41 GPU #6: DAG generated [crc: b62dcf9d, time: 9023 ms], memory left: 17.19 MB
20220112 19:38:42 GPU #5: DAG generated [crc: b62dcf9d, time: 10689 ms], memory left: 17.19 MB
20220112 19:38:42 GPU #3: DAG generated [crc: b62dcf9d, time: 10781 ms], memory left: 17.19 MB
20220112 19:38:42 GPU #1: DAG generated [crc: b62dcf9d, time: 10795 ms], memory left: 17.19 MB
20220112 19:38:43 GPU #2: DAG generated [crc: b62dcf9d, time: 10843 ms], memory left: 17.19 MB
20220112 19:38:43 GPU #4: DAG generated [crc: b62dcf9d, time: 10881 ms], memory left: 17.19 MB
20220112 19:38:58 GPU #0: using kernel #2
20220112 19:38:58 TREX: Can't find nonce with device [ID=0, GPU #0], cuda exception: CUDA_ERROR_OUT_OF_MEMORY
20220112 19:38:58 WARN: Miner is going to shutdown...
20220112 19:38:58 Main loop finished. Cleaning up resources...
20220112 19:38:58 ApiServer: stopped listening on 127.0.0.1:4059

trexminer commented 2 years ago

Probably Xorg is consuming more VRAM on your first card than on the rest. You can check that by running nvidia-smi when the GPUs are idle.
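
As a rough sketch of that check, the per-GPU memory figures can be pulled from nvidia-smi in CSV form and compared across cards. The helper names `parse_memory_csv` and `query_gpus` are made up for illustration; the only external piece is the `nvidia-smi --query-gpu=... --format=csv,noheader,nounits` call, which reports memory in MiB.

```python
import subprocess

# nvidia-smi query that prints one CSV line per GPU: index, used MiB, total MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text):
    """Parse 'index, used, total' CSV lines into a list of dicts (values in MiB)."""
    rows = []
    for line in text.strip().splitlines():
        index, used, total = (int(field.strip()) for field in line.split(","))
        rows.append(
            {"gpu": index, "used_mib": used, "total_mib": total, "free_mib": total - used}
        )
    return rows


def query_gpus():
    """Run nvidia-smi (hypothetical helper; needs the NVIDIA driver installed)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)


def outliers(rows):
    """Return GPUs whose idle usage is above the lowest usage seen on the rig."""
    baseline = min(row["used_mib"] for row in rows)
    return [row for row in rows if row["used_mib"] > baseline]
```

Calling `outliers(query_gpus())` with the miner stopped would list any card where Xorg or another process holds more VRAM than on the others; an empty list means idle usage is uniform.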

34GL3s commented 2 years ago

@trexminer could you please check the output of nvidia-smi? It looks fine to me.

Wed Jan 12 20:02:03 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.86       Driver Version: 470.86       CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| 30%   46C    P0    29W /  70W |      9MiB /  3019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  On   | 00000000:04:00.0 Off |                  N/A |
| 30%   41C    P0    30W /  70W |      9MiB /  3019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  On   | 00000000:05:00.0 Off |                  N/A |
| 30%   40C    P0    31W /  70W |      9MiB /  3019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce ...  On   | 00000000:06:00.0 Off |                  N/A |
| 30%   46C    P0    35W /  70W |      9MiB /  3019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  NVIDIA GeForce ...  On   | 00000000:07:00.0 Off |                  N/A |
| 30%   45C    P0    29W /  70W |      9MiB /  3019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  NVIDIA GeForce ...  On   | 00000000:08:00.0 Off |                  N/A |
| 30%   40C    P0    29W /  70W |      9MiB /  3019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   6  NVIDIA GeForce ...  On   | 00000000:09:00.0 Off |                  N/A |
| 30%   44C    P0    38W /  70W |      9MiB /  3019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   7  NVIDIA GeForce ...  On   | 00000000:0A:00.0 Off |                  N/A |
| 30%   50C    P0    37W /  70W |      9MiB /  3019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2129      G   /usr/lib/xorg/Xorg                  6MiB |
|    1   N/A  N/A      2129      G   /usr/lib/xorg/Xorg                  6MiB |
|    2   N/A  N/A      2129      G   /usr/lib/xorg/Xorg                  6MiB |
|    3   N/A  N/A      2129      G   /usr/lib/xorg/Xorg                  6MiB |
|    4   N/A  N/A      2129      G   /usr/lib/xorg/Xorg                  6MiB |
|    5   N/A  N/A      2129      G   /usr/lib/xorg/Xorg                  6MiB |
|    6   N/A  N/A      2129      G   /usr/lib/xorg/Xorg                  6MiB |
|    7   N/A  N/A      2129      G   /usr/lib/xorg/Xorg                  6MiB |
+-----------------------------------------------------------------------------+

trexminer commented 2 years ago

Yes, it looks fine. Perhaps there is enough space for the DAG but not for the CUDA kernels, and GPU#0 was the first to hit this limit. I believe it's time to switch to another coin, as the DAG keeps growing.
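
To put numbers on how tight this is, here is a minimal sketch that computes the DAG size for an epoch using the dataset-size rule from the Ethash specification (1 GiB initial size, 8 MiB growth per epoch, trimmed to a prime number of 128-byte rows) and subtracts it from the VRAM reported by nvidia-smi. It assumes etchash uses the same size formula with its own epoch number (238 here), and the function names are illustrative only.

```python
# Constants from the Ethash specification.
DATASET_BYTES_INIT = 2**30    # 1 GiB at epoch 0
DATASET_BYTES_GROWTH = 2**23  # 8 MiB added per epoch
MIX_BYTES = 128               # row width used for the prime adjustment
MIB = 2**20


def is_prime(n):
    """Trial-division primality test; fast enough for values around 2.4e7."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def dag_size_bytes(epoch):
    """Dataset size for an epoch: largest size below the linear bound
    whose row count (size / MIX_BYTES) is prime."""
    size = DATASET_BYTES_INIT + DATASET_BYTES_GROWTH * epoch - MIX_BYTES
    while not is_prime(size // MIX_BYTES):
        size -= 2 * MIX_BYTES
    return size


def headroom_mib(epoch, total_mib, used_mib):
    """VRAM left after the DAG alone, ignoring everything else the miner allocates."""
    return total_mib - used_mib - dag_size_bytes(epoch) / MIB


if __name__ == "__main__":
    size = dag_size_bytes(238)
    print(f"DAG at epoch 238: {size / 2**30:.2f} GiB")
    print(f"Headroom on a 3019 MiB card with 9 MiB in use: {headroom_mib(238, 3019, 9):.0f} MiB")
```

The DAG comes out at about 2.86 GiB, matching the log if the miner reports binary units, and leaves roughly 82 MiB on a 3019 MiB card. The miner reports only 17.19 MB left after generation; the difference is presumably taken by other allocations such as the light cache and the CUDA context, which this sketch does not model.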

34GL3s commented 2 years ago

Do you think changing GPU#0<->GPU#3 would make any difference?

trexminer commented 2 years ago

It shouldn't, but I wouldn't bet my house on it :) Even if it did, it's only a matter of 1-2 weeks before you run out of VRAM because of DAG growth.
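
A back-of-the-envelope sketch of that time frame: the DAG grows by 8 MiB per epoch, and an ETC epoch is 60,000 blocks since ECIP-1099. The 13-second average block time below is an assumption, not a figure from this thread, and the function names are illustrative.

```python
EPOCH_LENGTH_BLOCKS = 60_000  # ETC epoch length after ECIP-1099
ASSUMED_BLOCK_TIME_S = 13.0   # assumption: approximate average ETC block time
GROWTH_PER_EPOCH_MIB = 8      # DAG growth per epoch (2**23 bytes)
SECONDS_PER_DAY = 86_400


def days_per_epoch(block_time_s=ASSUMED_BLOCK_TIME_S):
    """Approximate wall-clock length of one epoch in days."""
    return EPOCH_LENGTH_BLOCKS * block_time_s / SECONDS_PER_DAY


def epochs_that_fit(free_mib):
    """How many more 8 MiB DAG increments fit into the remaining VRAM."""
    return int(free_mib // GROWTH_PER_EPOCH_MIB)


if __name__ == "__main__":
    print(f"One epoch lasts about {days_per_epoch():.1f} days")
    print(f"Epochs that fit in 17.19 MB: {epochs_that_fit(17.19)}")
```

Under these assumptions an epoch lasts about nine days, so the 17.19 MB the miner reported as left would be consumed by DAG growth alone within two epochs. Since the kernels already fail to allocate at that level, any gain from reshuffling cards would be short-lived.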

34GL3s commented 2 years ago

Alright, thank you for your support :) I will try to find another profitable coin.