cbuchner1 / CudaMiner

a CUDA accelerated litecoin mining application based on pooler's CPU miner

new rig, cudaminer runs fine for 1 min then explodes #99

Closed tunage closed 10 years ago

tunage commented 10 years ago

I have twin Tesla K10s on an ASUS Z87-PLUS board. No obvious errors. I start cudaminer with:

cudaminer -H 2 -i 0 -l auto -C 1 -o stratum+tcp://mining.updamoon.com:9008 -O tunae.tunae:1

and after a few seconds I get:

[2014-02-16 12:31:25] GPU #0: Tesla K10.G1.8GB, 164.70 khash/s
[2014-02-16 12:31:25] GPU #2: Tesla K10.G1.8GB, 121.44 khash/s
[2014-02-16 12:31:25] GPU #1: Tesla K10.G1.8GB, 135.55 khash/s
[2014-02-16 12:31:25] GPU #3: Tesla K10.G1.8GB, 180.60 khash/s
[2014-02-16 12:31:26] accepted: 1/1 (100.00%), 602.29 khash/s (yay!!!)
[2014-02-16 12:31:28] GPU #1: Tesla K10.G1.8GB, 244.07 khash/s
[2014-02-16 12:31:28] accepted: 2/2 (100.00%), 710.81 khash/s (yay!!!)
[2014-02-16 12:31:28] GPU #3: Tesla K10.G1.8GB, 218.36 khash/s
[2014-02-16 12:31:28] accepted: 3/3 (100.00%), 748.57 khash/s (yay!!!)

And we think we are rolling!

Then, after about a minute, I get:

[2014-02-16 12:33:16] GPU #1: cudaError 30 (unknown error) calling 'cudaMemcpyAsync(context_idata[stream][thr_id], X, mem_size, cudaMemcpyHostToDevice, context_streams[stream][thr_id])' (salsa_kernel.cu line 923)

[2014-02-16 12:33:16] GPU #1: cudaError 30 (unknown error) calling 'cudaStreamWaitEvent(context_streams[stream][thr_id], context_serialize[(stream+1)&1][thr_id], 0)' (salsa_kernel.cu line 931)

[2014-02-16 12:33:16] GPU #1: cudaError 30 (unknown error) calling 'cudaEventRecord(context_serialize[stream][thr_id], context_streams[stream][thr_id])' (salsa_kernel.cu line 937)
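When a miner log gets long, it can help to pull out just the cudaError lines and group them by GPU. The following is a minimal Python sketch, not part of cudaminer: the regular expression is modelled only on the error lines quoted in this report, and the names `parse_cuda_errors` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Modelled only on the error lines quoted in this report; the exact log
# format of other cudaminer versions is an assumption.
CUDA_ERROR_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\] GPU #(?P<gpu>\d+): cudaError (?P<code>\d+) "
    r"\((?P<msg>[^)]*)\) calling '(?P<call>\w+)\(.*\)' "
    r"\((?P<file>\S+) line (?P<line>\d+)\)"
)


def parse_cuda_errors(log_text):
    """Return one dict per cudaError line found in log_text."""
    errors = []
    for raw in log_text.splitlines():
        match = CUDA_ERROR_RE.search(raw)
        if match is None:
            continue
        rec = match.groupdict()
        for key in ("gpu", "code", "line"):
            rec[key] = int(rec[key])
        errors.append(rec)
    return errors


def summarize(errors):
    """Count errors per (GPU index, CUDA error code)."""
    return Counter((e["gpu"], e["code"]) for e in errors)


# Two of the lines from the report above, one log entry per line.
SAMPLE_LOG = """\
[2014-02-16 12:33:16] GPU #1: cudaError 30 (unknown error) calling 'cudaMemcpyAsync(context_idata[stream][thr_id], X, mem_size, cudaMemcpyHostToDevice, context_streams[stream][thr_id])' (salsa_kernel.cu line 923)
[2014-02-16 12:33:16] GPU #1: cudaError 30 (unknown error) calling 'cudaStreamWaitEvent(context_streams[stream][thr_id], context_serialize[(stream+1)&1][thr_id], 0)' (salsa_kernel.cu line 931)
"""

if __name__ == "__main__":
    for (gpu, code), count in summarize(parse_cuda_errors(SAMPLE_LOG)).items():
        print(f"GPU #{gpu}: cudaError {code} seen {count} time(s)")
```

Several different CUDA calls failing on the same GPU within the same second may suggest that the device as a whole became unreachable, rather than one particular call being at fault, but that is only a hint.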

And I have to reboot to recover.

Any thoughts?

cbuchner1 commented 10 years ago

"unknown error", hmm... any overclocking going on? Are the cards otherwise stable when under permanent load? cudaminer puts a lot of stress on them, you know...

Christian

tunage commented 10 years ago

No sir,

Straight out the box.

Using only GNU.

I pulled 2 cards out and now down to only one and watching.

ASUS motherboard z87-plus with plenty of cooling and 1500watt power supply.

No clue. Fresh everything.

Brad

tunage commented 10 years ago

Same with one card.

:(

tunage commented 10 years ago

this is a brand new install of Ubuntu 12.04 built just for this card.

tunage commented 10 years ago

The card is good to go by every measure I know;

root@coined:~# nvidia-smi -a

==============NVSMI LOG==============

Timestamp : Sun Feb 16 16:26:18 2014
Driver Version : 331.38

Attached GPUs : 2
GPU 0000:03:00.0
    Product Name : Tesla K10.G1.8GB
    Display Mode : Disabled
    Display Active : Disabled
    Persistence Mode : Disabled
    Accounting Mode : Disabled
    Accounting Mode Buffer Size : 128
    Driver Model
        Current : N/A
        Pending : N/A
    Serial Number : 0321613047527
    GPU UUID : GPU-10eaa126-fafa-3c05-13f9-f371c9a0c232
    Minor Number : 0
    VBIOS Version : 80.04.59.00.2B
    Inforom Version
        Image Version : 2055.0200.01.06
        OEM Object : 1.1
        ECC Object : 2.0
        Power Management Object : N/A
    GPU Operation Mode
        Current : N/A
        Pending : N/A
    PCI
        Bus : 0x03
        Device : 0x00
        Domain : 0x0000
        Device Id : 0x118F10DE
        Bus Id : 0000:03:00.0
        Sub System Id : 0x097010DE
        GPU Link Info
            PCIe Generation
                Max : 3
                Current : 3
            Link Width
                Max : 16x
                Current : 16x
        Bridge Chip
            Type : PLX
            Firmware : 0
    Fan Speed : N/A
    Performance State : P0
    Clocks Throttle Reasons
        Idle : Not Active
        Applications Clocks Setting : Not Active
        SW Power Cap : Not Active
        HW Slowdown : Not Active
        Unknown : Not Active
    FB Memory Usage
        Total : 3583 MiB
        Used : 9 MiB
        Free : 3574 MiB
    BAR1 Memory Usage
        Total : 256 MiB
        Used : 2 MiB
        Free : 254 MiB
    Compute Mode : Default
    Utilization
        Gpu : 0 %
        Memory : 0 %
    Ecc Mode
        Current : Enabled
        Pending : Enabled
    ECC Errors
        Volatile
            Single Bit
                Device Memory : 0
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Texture Memory : N/A
                Total : 0
            Double Bit
                Device Memory : 0
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Texture Memory : N/A
                Total : 0
        Aggregate
            Single Bit
                Device Memory : 0
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Texture Memory : N/A
                Total : 0
            Double Bit
                Device Memory : 0
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Texture Memory : N/A
                Total : 0
    Retired Pages
        Single Bit ECC : N/A
        Double Bit ECC : N/A
        Pending : N/A
    Temperature
        Gpu : 54 C
    Power Readings
        Power Management : Supported
        Power Draw : 42.60 W
        Power Limit : 117.50 W
        Default Power Limit : 117.50 W
        Enforced Power Limit : 117.50 W
        Min Power Limit : 85.00 W
        Max Power Limit : 125.00 W
    Clocks
        Graphics : 745 MHz
        SM : 745 MHz
        Memory : 2500 MHz
    Applications Clocks
        Graphics : 745 MHz
        Memory : 2500 MHz
    Default Applications Clocks
        Graphics : 745 MHz
        Memory : 2500 MHz
    Max Clocks
        Graphics : 745 MHz
        SM : 745 MHz
        Memory : 2500 MHz
    Compute Processes : None

GPU 0000:04:00.0
    Product Name : Tesla K10.G1.8GB
    Display Mode : Disabled
    Display Active : Disabled
    Persistence Mode : Disabled
    Accounting Mode : Disabled
    Accounting Mode Buffer Size : 128
    Driver Model
        Current : N/A
        Pending : N/A
    Serial Number : 0321613047527
    GPU UUID : GPU-6db4f28c-7ddf-baf6-3d29-0d9c338dac7c
    Minor Number : 1
    VBIOS Version : 80.04.59.00.2C
    Inforom Version
        Image Version : 2055.0200.01.06
        OEM Object : 1.1
        ECC Object : 2.0
        Power Management Object : N/A
    GPU Operation Mode
        Current : N/A
        Pending : N/A
    PCI
        Bus : 0x04
        Device : 0x00
        Domain : 0x0000
        Device Id : 0x118F10DE
        Bus Id : 0000:04:00.0
        Sub System Id : 0x097010DE
        GPU Link Info
            PCIe Generation
                Max : 3
                Current : 3
            Link Width
                Max : 16x
                Current : 16x
        Bridge Chip
            Type : PLX
            Firmware : 0
    Fan Speed : N/A
    Performance State : P0
    Clocks Throttle Reasons
        Idle : Not Active
        Applications Clocks Setting : Not Active
        SW Power Cap : Not Active
        HW Slowdown : Not Active
        Unknown : Not Active
    FB Memory Usage
        Total : 3583 MiB
        Used : 9 MiB
        Free : 3574 MiB
    BAR1 Memory Usage
        Total : 256 MiB
        Used : 2 MiB
        Free : 254 MiB
    Compute Mode : Default
    Utilization
        Gpu : 0 %
        Memory : 0 %
    Ecc Mode
        Current : Enabled
        Pending : Enabled
    ECC Errors
        Volatile
            Single Bit
                Device Memory : 0
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Texture Memory : N/A
                Total : 0
            Double Bit
                Device Memory : 0
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Texture Memory : N/A
                Total : 0
        Aggregate
            Single Bit
                Device Memory : 0
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Texture Memory : N/A
                Total : 0
            Double Bit
                Device Memory : 0
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Texture Memory : N/A
                Total : 0
    Retired Pages
        Single Bit ECC : N/A
        Double Bit ECC : N/A
        Pending : N/A
    Temperature
        Gpu : 45 C
    Power Readings
        Power Management : Supported
        Power Draw : 36.72 W
        Power Limit : 117.50 W
        Default Power Limit : 117.50 W
        Enforced Power Limit : 117.50 W
        Min Power Limit : 85.00 W
        Max Power Limit : 125.00 W
    Clocks
        Graphics : 745 MHz
        SM : 745 MHz
        Memory : 2500 MHz
    Applications Clocks
        Graphics : 745 MHz
        Memory : 2500 MHz
    Default Applications Clocks
        Graphics : 745 MHz
        Memory : 2500 MHz
    Max Clocks
        Graphics : 745 MHz
        SM : 745 MHz
        Memory : 2500 MHz
    Compute Processes : None

root@coined:~# cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 331.38 Wed Jan 8 19:32:30 PST 2014
GCC version: gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)

tunage commented 10 years ago

Other diags:

root@coined:~# cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 331.38 Wed Jan 8 19:32:30 PST 2014
GCC version: gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)

root@coined:~# dmesg |grep NVRM
[ 6.523205] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 331.38 Wed Jan 8 19:32:30 PST 2014

Everything looks solid on the device end.

I don't know why it borks midway. Seems like a driver issue. Everything is current/updated on my end.

tunage commented 10 years ago

I have 1500watts power too. no need to go there.

kristianfreeman commented 10 years ago

I think this is really similar to the issues I was experiencing (#82) – I've just ended up running with the one card I have that apparently works. Not sure what the solution is.

tunage commented 10 years ago

I have 8GB of horse power, I shouldn't be failing like this http://i.imgur.com/nDuHH41.png

cbuchner1 commented 10 years ago

as a workaround, set a --time-limit of 60 seconds and run this in a loop from a shell script file.

You'll lose some performance due to overhead of reconnecting to the stratum pool.
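A minimal Python sketch of such a restart loop follows (the comment suggests a shell script; Python is used here only for illustration). It assumes the miner exits on its own when the time limit expires and can be started again without a reboot. The flags other than `--time-limit` are the ones from the original report; the `--time-limit=60` spelling, the helper names `build_command` and `run_in_loop`, and the pause between runs are assumptions.

```python
import subprocess
import time


def build_command(pool_url, userpass, time_limit_s=60):
    """Build the cudaminer argument list.

    -H, -i, -l and -C are copied from the command line in the original
    report; the exact --time-limit syntax is an assumption.
    """
    return [
        "cudaminer", "-H", "2", "-i", "0", "-l", "auto", "-C", "1",
        "-o", pool_url,
        "-O", userpass,
        f"--time-limit={time_limit_s}",
    ]


def run_in_loop(runs, pool_url, userpass, time_limit_s=60, pause_s=5,
                runner=subprocess.call, sleep=time.sleep):
    """Start the miner `runs` times in a row and return the exit codes.

    `runner` and `sleep` are parameters so the loop can be exercised
    without actually launching the miner.
    """
    exit_codes = []
    for _ in range(runs):
        exit_codes.append(runner(build_command(pool_url, userpass, time_limit_s)))
        sleep(pause_s)  # short pause before reconnecting to the pool
    return exit_codes
```

Calling `run_in_loop(1000, "stratum+tcp://mining.updamoon.com:9008", "tunae.tunae:1")` would restart the miner roughly once a minute. Each restart pays the stratum reconnect overhead mentioned above.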

Christian

cbuchner1 commented 10 years ago

oh, the reboot requirement to recover makes my suggestion moot...

try a different distro, maybe?

Christian

tunage commented 10 years ago

@cbuchner1 I am a n00b, but not new. Where?

tunage commented 10 years ago

Gentoo borked on a keyboard error. I love Gentoo, but it got pounded quick. I went to Ubuntu, got almost there, it borked; went to SUSE, it borked. Back at Ubuntu. :)

tunage commented 10 years ago

Gentoo, SUSE and Ubuntu ain't enough distros for you?

tunage commented 10 years ago

I guess we could set up a merry go round. Not sure what help it would provide.

tunage commented 10 years ago

I just loaded Ubuntu 13.10. Same exact error. Same exact scenario.

root@coined:~# cat /etc/*-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=13.10
DISTRIB_CODENAME=saucy
DISTRIB_DESCRIPTION="Ubuntu 13.10"
NAME="Ubuntu"
VERSION="13.10, Saucy Salamander"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 13.10"
VERSION_ID="13.10"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"

[2014-02-16 22:44:36] GPU #0: cudaError 33 (invalid resource handle) calling 'cudaStreamWaitEvent(context_streams[stream][thr_id], context_serialize[(stream+1)&1][thr_id], 0)' (salsa_kernel.cu line 931)

[2014-02-16 22:44:36] GPU #0: cudaError 30 (unknown error) calling 'cudaMemcpyAsync(hash, context_hash[stream][thr_id], mem_size, cudaMemcpyDeviceToHost, context_streams[stream][thr_id])' (sha256.cu line 446)

[2014-02-16 22:44:36] GPU #0: cudaError 33 (invalid resource handle) calling 'cudaEventRecord(context_serialize[stream][thr_id], context_streams[stream][thr_id])' (salsa_kernel.cu line 937)

cbuchner1 commented 10 years ago

Do you get some kernel debug or syslog messages that indicate "GPU has fallen off the bus", as described here?

http://www.cyberciti.biz/faq/debian-ubuntu-rhel-fedora-linux-nvidia-nvrm-gpu-fallen-off-bus/
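One way to check for this is to scan the kernel log text for the phrase. The sketch below is a hedged illustration in Python: "fallen off the bus" is the phrase quoted above, while also flagging any NVRM "Xid" line and the function name `find_gpu_bus_faults` are assumptions made for this example.

```python
import re

# "fallen off the bus" is the phrase quoted in the comment above. Treating
# any NVRM "Xid" line as worth a look is an assumption of this sketch.
FAULT_PATTERNS = (
    re.compile(r"fallen off the bus", re.IGNORECASE),
    re.compile(r"NVRM: Xid"),
)


def find_gpu_bus_faults(kernel_log):
    """Return the kernel log lines that match any of FAULT_PATTERNS."""
    return [
        line.strip()
        for line in kernel_log.splitlines()
        if any(pattern.search(line) for pattern in FAULT_PATTERNS)
    ]


# The only NVRM line posted in this thread so far; it is a normal driver
# load message, so no fault is reported for it.
POSTED_DMESG = (
    "[ 6.523205] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 331.38 "
    "Wed Jan 8 19:32:30 PST 2014"
)

if __name__ == "__main__":
    print(find_gpu_bus_faults(POSTED_DMESG) or "no bus faults found")
```

To run it against a live system, the text could come from `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`, captured after the miner has failed and before rebooting.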

Christian

cbuchner1 commented 10 years ago

"I guess we could set up a merry go round. Not sure what help it would provide."

Not sure how to take this.

I have a day job, I have a family and kid. I do not see why I should be required to provide any more help than a merry go round would.

If I was charging for cudaminer or if we had agreed to enter a support contract, then I might actually be more helpful. But even then: I don't have access to any Tesla cards.

Take this directly to nVidia. They have support channels for their commercial Tesla products. All they usually require is a repro case - just point them to the cudaminer github repository and describe your hardware setup and the fault you get.

Christian

cbuchner1 commented 10 years ago

I've got one more idea: if the BIOS allows you to override the PCI Express generation setting, e.g. downgrading from PCI Express 3.0 to PCI Express 2.0 (Gen3 to Gen2), then try that. The impact of bus speed on mining is low (even more so with the -H 2 flag). Also, getting the latest BIOS from the mainboard manufacturer might not hurt.
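To confirm that a BIOS change actually took effect, the negotiated link generation can be read back from the `nvidia-smi -a` report posted earlier in this thread, which includes a "PCIe Generation" block with "Max" and "Current" values. Below is a minimal Python sketch that extracts those values. It assumes the usual layout with one field per line, and `pcie_generations` is a hypothetical helper name.

```python
def pcie_generations(smi_text):
    """Return a list of (max_gen, current_gen) tuples, one per GPU.

    Assumes each "PCIe Generation" heading is directly followed by a
    "Max : N" line and a "Current : N" line, in either order.
    """
    results = []
    in_block = False
    block = {}
    for raw in smi_text.splitlines():
        line = raw.strip()
        if line == "PCIe Generation":
            in_block, block = True, {}
            continue
        if not in_block:
            continue
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if sep and key in ("Max", "Current") and value.isdigit():
            block[key] = int(value)
            if len(block) == 2:
                results.append((block["Max"], block["Current"]))
                in_block = False
        else:
            in_block = False  # unexpected line: leave the block
    return results


# Excerpt in the layout assumed above, with the values from the report.
SAMPLE = """\
        GPU Link Info
            PCIe Generation
                Max : 3
                Current : 3
            Link Width
                Max : 16x
                Current : 16x
"""

if __name__ == "__main__":
    for index, (max_gen, cur_gen) in enumerate(pcie_generations(SAMPLE)):
        print(f"GPU {index}: PCIe Gen{cur_gen} (max Gen{max_gen})")
```

With the settings shown in the report both values are 3. After forcing Gen2 in the BIOS, one would expect the "Current" value to read 2.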

Christian

danryan commented 10 years ago

I ran into the same issue as @tunage today, when connecting a card to a 16x gen3 slot via a 1x riser. The suggestion to downgrade from gen3 to gen2 resolved the error for me.

@cbuchner1, thank you so much for all of the effort you've put into building this project. Do you have a BTC address to where I can send a donation? I feel it necessary to support your work with more than just words :)

cbuchner1 commented 10 years ago

All my various donation addresses are given in the cudaminer README.txt file near the top.

Thanks!

Christian

tunage commented 10 years ago

I feel the same way

1FBmc3rbDyGHKPiQUP1dVDKGcd7PGHFBL9

Here is my address

The fact is I had to hack a cron job to get it to work.

Total misery. I have a new round of MS boards to test against.

Everything about this post is a bomb until further notice.

This issue is NOT resolved.

Brad Sumrall

tunage commented 10 years ago

README fails

cbuchner1 commented 10 years ago

"This issue is NOT resolved."

I am not a mainboard mechanic. I cannot do anything about it with software modifications.
