ergoplatform / Autolykos-GPU-miner

CUDA-based GPU miner for Ergo (Autolykos algorithm)

My NVIDIA GPU is not recognized - deviceCount is 0 #60

Open bjenkinsgit opened 4 years ago

bjenkinsgit commented 4 years ago

A month or so ago, the Autolykos miner compiled and ran. Now it suddenly doesn't work because the .cu files don't recognize any installed NVIDIA GPU. But when I build and install all of the CUDA examples, THOSE run just fine. I'm running Ubuntu 18.04 with a GeForce GTX TITAN X. For example, the "deviceQuery" utility returns the following:

./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX TITAN X"
  CUDA Driver Version / Runtime Version          10.1 / 10.1
  CUDA Capability Major/Minor version number:    5.2
  Total amount of global memory:                 12210 MBytes (12802785280 bytes)
  (24) Multiprocessors, (128) CUDA Cores/MP:     3072 CUDA Cores
  GPU Max Clock rate:                            1076 MHz (1.08 GHz)
  Memory Clock rate:                             3505 Mhz
  Memory Bus Width:                              384-bit
  L2 Cache Size:                                 3145728 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z):  (1024, 1024, 64)
  Max dimension size of a grid size (x,y,z):     (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            No
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 10.1, NumDevs = 1 Result = PASS

rsmmnt commented 4 years ago

Hey, what error exactly does the miner give you? Are you using the latest version?

rsmmnt commented 4 years ago

Also, do you have the CUDA_VISIBLE_DEVICES env variable set?
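For anyone checking this on their own machine, here is a minimal sketch of how the variable is usually interpreted: unset means every GPU is visible, an empty value hides all GPUs from CUDA programs, and a comma-separated list restricts CUDA to the listed devices. The helper name `describe_cuda_visible_devices` is made up for illustration and is not part of the miner.

```python
import os


def describe_cuda_visible_devices(environ=None):
    """Describe how CUDA_VISIBLE_DEVICES affects GPU enumeration.

    `environ` defaults to the real process environment; a plain dict can be
    passed instead (hypothetical helper, for illustration only).
    """
    env = os.environ if environ is None else environ
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return "unset: all GPUs known to the driver are visible"
    if value.strip() == "":
        # An empty value is a common way to hide every GPU from CUDA programs.
        return "set but empty: no GPUs are visible to CUDA programs"
    entries = [entry.strip() for entry in value.split(",")]
    return "set: only these devices are visible: " + ", ".join(entries)
```

Calling `print(describe_cuda_visible_devices())` in the same shell that launches the miner shows what the miner process would inherit.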

bjenkinsgit commented 4 years ago

Ok. I’m a bit puzzled. It is now working. The miner binary now recognizes that I have 1 GPU and starts running. I did remake the CUDA toolkit and reboot a few times before trying again. But when I last left it, the miner was not running. It has now been a few days and it seems to be working again. I thought maybe it had something to do with the fact that some areas of the documentation say the config.json for the miner has to list the mnemonic phrase under the key “mnemonic”, while in other docs I’ve seen it as “seed”. But either entry allows the miner to run (which is the correct key, “mnemonic” or “seed”?)

Now, from the miner, after a minute or so of processing, I am getting a lot of “ERROR: 500, REASON: INTERNAL ERROR” errors and the console for my ergo client (ergo-3.10.jar) is throwing lots of errors in the form of:

WARN [ctor.default-dispatcher-xxx] o.e.local.ErgoMiner - Failed to produce candidate block with a java error of: java.lang.Error: Trying to generate proof for empty transaction sequence….

Any ideas on what is happening now?

Thanks

Bart


bjenkinsgit commented 4 years ago

I do not. The only CUDA related env variables I have set must be ones from making and installing the cuda libs and examples, specifically:

CUDA_BIN=/usr/local/cuda-10.1/bin
CUDA_HOME=/usr/local/cuda-10.1
CUDA_NSIGHT=/usr/local/cuda-10.1/NsightCompute-2019.1

But, the miner binary is recognizing my GPU now. I’m getting errors from mining, but that is a different problem...


bjenkinsgit commented 4 years ago

I forgot to add that, in the miner error output, the DETAIL info on the 500 error says “requirement failed: Incorrect points”


bjenkinsgit commented 4 years ago

I’ll bet what happened is this:

I play games on this linux box and I might have left a game up and running,

…OR...

When exiting the video game I was playing, it did not release the GPU for some reason

I’ll try to reproduce this by running a game, leaving it running and then trying to start the miner…I’ll report back if that is the problem.

Is there some CUDA command (for the .cu file) to check for a GPU being “in-use” rather than just saying NO DEVICES FOUND? That would be a more meaningful error, no?

Thanks
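On the question of a more meaningful error: the CUDA runtime call `cudaGetDeviceCount` returns a status code, and `cudaGetErrorString` turns that code into text, so a .cu file can print the reason instead of only a count of 0. A status code will not necessarily say "in use", but it can separate cases such as a missing or mismatched driver from a genuine absence of devices. The same idea can be tried outside the miner with the sketch below, which calls the driver API (`cuInit`, `cuDeviceGetCount`, `cuGetErrorString`) through Python's `ctypes`. It assumes the driver library is installed under the name `libcuda.so.1`, as is usual on Linux; `probe_cuda_driver` is a hypothetical helper and not the miner's code.

```python
import ctypes


def _error_text(cuda, status):
    """Turn a CUresult code into readable text via cuGetErrorString."""
    message = ctypes.c_char_p()
    # cuGetErrorString(CUresult, const char **) points `message` at a static string.
    if cuda.cuGetErrorString(status, ctypes.byref(message)) == 0 and message.value:
        return f"{message.value.decode()} (code {status})"
    return f"unrecognized error (code {status})"


def probe_cuda_driver(lib_name="libcuda.so.1"):
    """Return (device_count or None, status text) from the CUDA driver API."""
    try:
        cuda = ctypes.CDLL(lib_name)
    except OSError as exc:
        # The driver library itself is missing or cannot be loaded.
        return None, f"could not load {lib_name}: {exc}"

    status = cuda.cuInit(0)  # must succeed before any other driver call
    if status != 0:
        return None, f"cuInit failed: {_error_text(cuda, status)}"

    count = ctypes.c_int(0)
    status = cuda.cuDeviceGetCount(ctypes.byref(count))
    if status != 0:
        return None, f"cuDeviceGetCount failed: {_error_text(cuda, status)}"
    return count.value, "ok"
```

Running `print(probe_cuda_driver())` at the moment the miner reports no devices shows whether the driver fails to initialize or really enumerates zero GPUs.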


bjenkinsgit commented 4 years ago

Another update:

In the miner’s ‘config.json’ file, I changed the “keepPrehash” value from “true” to “false” and the miner seems to be stable now and no more errors in the ergo client console window.

So, although I technically have a GPU that has 12 Gbytes of memory, and therefore SHOULD be able to set “keepPrehash” to “true”, I think that either:

a. that feature requires a more modern GPU, or

b. my little Xeon 3 CPU can’t keep up with the data coming from a GPU that is working with 12 Gbytes?? (Total guess here. That is, imagine I could suddenly drop in a more powerful CPU, keeping all else the same: would the miner and the ergo client be able to stay in sync?)

thanks for responding….I had given up hope there...
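For reference, the config change described above can be scripted. This is a minimal sketch that assumes config.json is a flat JSON object and that “keepPrehash” is stored as a JSON boolean; the thread does not confirm the value type, so check your own file first. `set_keep_prehash` is a hypothetical helper name.

```python
import json


def set_keep_prehash(config_path, enabled):
    """Rewrite the miner config with keepPrehash set to `enabled`.

    Assumes the file holds a single JSON object and that keepPrehash is a
    JSON boolean (an assumption, not confirmed in this thread).
    """
    with open(config_path, "r", encoding="utf-8") as handle:
        config = json.load(handle)

    config["keepPrehash"] = bool(enabled)

    # Every other key (node address, seed/mnemonic, ...) is written back unchanged.
    with open(config_path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=4)
        handle.write("\n")
    return config
```

For example, `set_keep_prehash("config.json", False)` applies the setting that made the miner stable here.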


rsmmnt commented 4 years ago

1) The miner log is not your node log; it is written separately. Please copy it here.

2)

the DETAIL info on the 500 error says “requirement failed: Incorrect points”

This means that the miner found a solution for the wrong block data. Check whether your node is in sync (compare the node's HTTP info with a block explorer).

WARN [ctor.default-dispatcher-xxx] o.e.local.ErgoMiner - Failed to produce candidate block with a java error of: java.lang.Error: Trying to generate proof for empty transaction sequence….

This also signals a node problem, not a miner one.

3) You can check GPU memory load with the nvidia-smi tool.
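To make point 3 concrete, here is a small sketch that reads per-GPU memory use from nvidia-smi's CSV query mode (`--query-gpu=... --format=csv,noheader,nounits`), which reports memory in MiB. The helper names `parse_gpu_memory` and `query_gpu_memory` are made up for illustration, and the parser assumes GPU names contain no commas.

```python
import subprocess

# nvidia-smi CSV query: one line per GPU, no header, numbers without units (MiB).
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse 'index, name, used, total' lines into a list of dicts."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, name, used, total = [field.strip() for field in line.split(",")]
        rows.append({
            "index": int(index),
            "name": name,
            "used_mib": int(used),
            "total_mib": int(total),
        })
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return the parsed rows; raises if the tool is missing or fails."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)
```

Calling `query_gpu_memory()` before starting the miner shows how much memory other programs (a game left running, for instance) already hold on each GPU.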