cbuchner1 / CudaMiner

a CUDA accelerated litecoin mining application based on pooler's CPU miner

M2090 (fermi) issues - result does not validate on CPU #72

Open choseh opened 10 years ago

choseh commented 10 years ago

I finally got the latest version compiled with CUDA 5.0, but unfortunately the Fermi cards fail to validate any results. What could be the issue here?

[2014-01-21 10:27:17] GPU #7: 223.42 khash/s with configuration F160x8
[2014-01-21 10:27:17] GPU #7: using launch configuration F160x8
[2014-01-21 10:27:17] GPU #6: 223.47 khash/s with configuration F160x8
[2014-01-21 10:27:17] GPU #6: using launch configuration F160x8
[2014-01-21 10:27:18] GPU #7: Tesla M2090, 217.14 khash/s
[2014-01-21 10:27:18] GPU #6: Tesla M2090, 217.19 khash/s
[2014-01-21 10:27:18] GPU #0: 223.54 khash/s with configuration F144x8
[2014-01-21 10:27:18] GPU #0: using launch configuration F144x8
[2014-01-21 10:27:18] GPU #4: 223.36 khash/s with configuration F160x8
[2014-01-21 10:27:18] GPU #4: using launch configuration F160x8
[2014-01-21 10:27:18] GPU #5: 223.33 khash/s with configuration F144x8
[2014-01-21 10:27:18] GPU #5: using launch configuration F144x8
[2014-01-21 10:27:18] GPU #0: Tesla M2090, 209.18 khash/s
[2014-01-21 10:27:18] GPU #4: Tesla M2090, 217.05 khash/s
[2014-01-21 10:27:18] GPU #2: 222.08 khash/s with configuration F159x8
[2014-01-21 10:27:18] GPU #2: using launch configuration F159x8
[2014-01-21 10:27:18] GPU #3: 222.27 khash/s with configuration F159x8
[2014-01-21 10:27:18] GPU #3: using launch configuration F159x8
[2014-01-21 10:27:19] GPU #5: Tesla M2090, 215.93 khash/s
[2014-01-21 10:27:19] GPU #2: Tesla M2090, 215.82 khash/s
[2014-01-21 10:27:19] GPU #3: Tesla M2090, 215.92 khash/s
[2014-01-21 10:27:19] GPU #1: 222.01 khash/s with configuration F159x8
[2014-01-21 10:27:19] GPU #1: using launch configuration F159x8
[2014-01-21 10:27:19] GPU #1: Tesla M2090, 215.90 khash/s
[2014-01-21 10:27:25] GPU #1: Tesla M2090 result does not validate on CPU (i=21245, s=0)!
[2014-01-21 10:27:26] GPU #1: Tesla M2090 result does not validate on CPU (i=9427, s=1)!
[2014-01-21 10:27:26] Stratum detected new block
[2014-01-21 10:27:26] GPU #1: Tesla M2090, 205.85 khash/s
[2014-01-21 10:27:26] GPU #0: Tesla M2090, 204.51 khash/s
[2014-01-21 10:27:26] GPU #7: Tesla M2090, 208.19 khash/s
[2014-01-21 10:27:26] GPU #5: Tesla M2090, 206.02 khash/s
[2014-01-21 10:27:26] GPU #4: Tesla M2090, 207.96 khash/s
[2014-01-21 10:27:26] GPU #2: Tesla M2090, 211.78 khash/s
[2014-01-21 10:27:27] GPU #6: Tesla M2090, 213.75 khash/s
[2014-01-21 10:27:27] GPU #3: Tesla M2090, 211.94 khash/s
[2014-01-21 10:27:28] Stratum detected new block
[2014-01-21 10:27:28] GPU #3: Tesla M2090, 195.38 khash/s
[2014-01-21 10:27:28] GPU #4: Tesla M2090, 191.33 khash/s
[2014-01-21 10:27:29] GPU #0: Tesla M2090, 192.52 khash/s
[2014-01-21 10:27:29] GPU #1: Tesla M2090, 192.31 khash/s
[2014-01-21 10:27:29] GPU #7: Tesla M2090, 193.30 khash/s
[2014-01-21 10:27:29] GPU #5: Tesla M2090, 192.45 khash/s
[2014-01-21 10:27:29] GPU #2: Tesla M2090, 197.24 khash/s
[2014-01-21 10:27:29] GPU #6: Tesla M2090, 198.62 khash/s
[2014-01-21 10:27:53] Stratum detected new block
[2014-01-21 10:27:53] GPU #6: Tesla M2090, 216.69 khash/s
[2014-01-21 10:27:53] GPU #3: Tesla M2090, 215.43 khash/s
[2014-01-21 10:27:53] GPU #1: Tesla M2090, 209.81 khash/s
[2014-01-21 10:27:53] GPU #5: Tesla M2090, 208.21 khash/s
[2014-01-21 10:27:53] GPU #7: Tesla M2090, 210.81 khash/s
[2014-01-21 10:27:53] GPU #4: Tesla M2090, 210.93 khash/s
[2014-01-21 10:27:53] GPU #0: Tesla M2090, 207.13 khash/s
[2014-01-21 10:27:53] GPU #2: Tesla M2090, 215.24 khash/s
[2014-01-21 10:27:59] GPU #1: Tesla M2090 result does not validate on CPU (i=27942, s=1)!
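For anyone comparing logs: in the excerpt above, all three validation failures are reported by GPU #1. With a longer log it can help to tally failures per GPU. Here is a rough Python sketch that does so. The regular expression is only inferred from the log lines quoted in this thread, and `count_validation_failures` is a made-up helper name, not part of cudaminer; other versions may format the message differently.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this thread (an assumption,
# not an official cudaminer format). The character class stops the match at
# the next "[timestamp]", so it also works if the log was pasted as one line.
FAIL_RE = re.compile(r"GPU #(\d+): [^\[\]]*? result does not validate on CPU")


def count_validation_failures(log_text):
    """Return a Counter mapping GPU index -> number of failed validations."""
    failures = Counter()
    for match in FAIL_RE.finditer(log_text):
        failures[int(match.group(1))] += 1
    return failures


# A few lines copied from the log above, used as sample input.
sample = """\
[2014-01-21 10:27:19] GPU #1: Tesla M2090, 215.90 khash/s
[2014-01-21 10:27:25] GPU #1: Tesla M2090 result does not validate on CPU (i=21245, s=0)!
[2014-01-21 10:27:26] GPU #1: Tesla M2090 result does not validate on CPU (i=9427, s=1)!
[2014-01-21 10:27:26] Stratum detected new block
[2014-01-21 10:27:59] GPU #1: Tesla M2090 result does not validate on CPU (i=27942, s=1)!
"""

if __name__ == "__main__":
    for gpu, n in sorted(count_validation_failures(sample).items()):
        print(f"GPU #{gpu}: {n} failed validations")
```

If the failures cluster on a single device, that points at something per-card (launch configuration, hardware) rather than the toolkit alone. It does not prove anything by itself.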

+------------------------------------------------------+
| NVIDIA-SMI 5.319.37   Driver Version: 319.37         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M2090         On   | 0000:0C:00.0     Off |                    0 |
| N/A   N/A    P0   210W /  N/A |     5192MB /  5375MB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla M2090         On   | 0000:0D:00.0     Off |                    0 |
| N/A   N/A    P0   211W /  N/A |     5192MB /  5375MB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla M2090         On   | 0000:11:00.0     Off |                    0 |
| N/A   N/A    P0   196W /  N/A |     5192MB /  5375MB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla M2090         On   | 0000:12:00.0     Off |                    0 |
| N/A   N/A    P0   153W /  N/A |     4710MB /  5375MB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   4  Tesla M2090         On   | 0000:83:00.0     Off |                    0 |
| N/A   N/A    P0   181W /  N/A |     5224MB /  5375MB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   5  Tesla M2090         On   | 0000:84:00.0     Off |                    0 |
| N/A   N/A    P0   216W /  N/A |     5224MB /  5375MB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   6  Tesla M2090         On   | 0000:87:00.0     Off |                    0 |
| N/A   N/A    P0   199W /  N/A |     4710MB /  5375MB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   7  Tesla M2090         On   | 0000:88:00.0     Off |                    0 |
| N/A   N/A    P0   203W /  N/A |     5224MB /  5375MB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0     16561  ./cudaminer                                        40552MB  |
|    1     16561  ./cudaminer                                        40552MB  |
|    2     16561  ./cudaminer                                        40552MB  |
|    3     16561  ./cudaminer                                        40552MB  |
|    4     16561  ./cudaminer                                        40552MB  |
|    5     16561  ./cudaminer                                        40552MB  |
|    6     16561  ./cudaminer                                        40552MB  |
|    7     16561  ./cudaminer                                        40552MB  |
+-----------------------------------------------------------------------------+
magnux commented 10 years ago

I'm having the same issue with my GTX 570. I think the problem is that we need to use CUDA 5.5. I use Ubuntu, and the latest drivers in the official repo (319) still only support CUDA 5.0, not 5.5. The weird thing is that cudaminer issues a warning that you need CUDA 5.5, but only with the older 304 driver; with the newer one it won't complain, but it won't validate results either. So, if you can, manually install CUDA 5.5 and tell us if that solves the issue for you. I think everybody with Fermi cards is affected by this.

choseh commented 10 years ago

Yeah, I already found out it works with 5.5. There's some stuff broken in the most recent versions though, which results in a lower hashrate. I talked to cbuchner yesterday and it seems we found a solution for that. But that's not related to this validation ticket :)

magnux commented 10 years ago

Good news is that after a manual install of 5.5, it does work. Here are some good tips for the upgrade: http://askubuntu.com/questions/380609/anyone-has-successfully-installed-cuda-5-5-on-ubuntu-13-10-64-bit Bad news is that the newer version doesn't seem to get me any more Kh/s :(. But at least I can update from now on.

choseh commented 10 years ago

Can you try building the latest version with the following modification to Makefile.am?

Remove or comment out the following two lines:

fermi_kernel.o: fermi_kernel.cu
	$(NVCC) @CFLAGS@ -Xptxas "-abi=no -v" -arch=sm_20 --maxrregcount=63 $(JANSSON_INCLUDES) -o $@ -c $<

This builds fermi_kernel for compute 1.0 instead of sm_20, which made my Fermi cards faster again.

magnux commented 10 years ago

I tried, but be aware that the line you're dropping is not from the latest version; cbuchner1 has changed it again to the compute_10 version. Anyway, no luck: whether I comment it out, use the old version, or use the new one, it makes no difference. I'm averaging approximately 250 kH/s, and no version makes a visible difference for better or worse. Thanks for the tip anyway, and keep me updated if you find any way to improve performance on Fermi cards. Mine is a GTX 570, launch config: -H 2 -d 0 -i 0 -l F15x16 -C 1, running on Ubuntu 13.10 with the 331.38 NVIDIA drivers.
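Since launch configs like F15x16 and F160x8 keep coming up in this thread, here is a small Python sketch for comparing them. It assumes the string is a kernel prefix letter followed by blocks x warps-per-block, which is my reading of the cudaminer README and not something confirmed in this thread. `parse_launch_config` and `LaunchConfig` are hypothetical names used only for illustration.

```python
import re
from typing import NamedTuple

WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


class LaunchConfig(NamedTuple):
    kernel: str           # kernel prefix letter, e.g. "F" (may be empty)
    blocks: int           # assumed: number of thread blocks
    warps_per_block: int  # assumed: warps in each block

    @property
    def threads(self):
        """Total threads launched under the assumed blocks x warps reading."""
        return self.blocks * self.warps_per_block * WARP_SIZE


def parse_launch_config(text):
    """Parse strings such as 'F15x16' into a LaunchConfig."""
    m = re.fullmatch(r"([A-Za-z]?)(\d+)x(\d+)", text.strip())
    if m is None:
        raise ValueError(f"not a launch configuration: {text!r}")
    return LaunchConfig(m.group(1).upper(), int(m.group(2)), int(m.group(3)))


if __name__ == "__main__":
    # Configurations quoted in this thread.
    for cfg in ("F15x16", "F16x16", "F144x8", "F159x8", "F160x8"):
        parsed = parse_launch_config(cfg)
        print(cfg, parsed.blocks, parsed.warps_per_block, parsed.threads)
```

Under that assumption the autotuned M2090 configs (F144x8 to F160x8) launch several times more threads than the hand-picked F15x16 or F16x16, which may be worth keeping in mind when comparing hashrates between setups.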

kristianfreeman commented 10 years ago

I'm having this same problem with the same card – I have two M2090s, and while running two instances of cudaminer, GPU #0 can't validate while the second runs fine. @choseh did you manage to get this solved? Worth mentioning that I had a previous issue (#82) that I resolved by checking out the 2012/12/18 build, so it might be that I need to check out a more recent commit?

choseh commented 10 years ago

hey imkmf,

I am using the current build compiled with CUDA 5.5 and it works fine. I had to edit Makefile.am though; the changes are in an earlier post in this thread. Maybe try it and report back whether it worked for you.

kristianfreeman commented 10 years ago

Interesting. Did you run into anything like #82 with the most recent build? As I mentioned there, I'm a little hesitant to give it a shot as it takes about seventy minutes to reconfigure every time something crashes – I lose my second GPU unless I do a complete reformat.

choseh commented 10 years ago

well, i'm running mine with ./cudaminer -C2 -lF16x16 -o stratum+tcp://whatever:3333 -O user:pass

and all the GPUs are being used. trying to catch up with the newer issues

choseh commented 10 years ago

http://nopaste.narf.at/show/2804/ latest build, no modifications, cuda 5.5 with BUNDLED driver - what driver version are you on? (nvidia-smi)

kristianfreeman commented 10 years ago

@choseh want to take this to email? It sounds like we have a really similar setup, would love to talk about this a bit more in detail – kristian@kristianfreeman.com

In the meantime, some more notes:

I haven't been using the bundled driver - that seems like it might be important. Starting a fresh install right now.

You mentioned the latest build with no modifications - does that assume the Makefile.am change or it's fresh to the point where I could clone it down and we'd be identical?

kristianfreeman commented 10 years ago

Alright, so I ran your cudaminer command on a fresh install, using the bundled driver included with cudaminer. Ran into #82 again, unfortunately. It was the fastest speed I had seen, as well, so that really sucks.

choseh commented 10 years ago

i have my cards set to persistence mode btw, maybe try this? http://www.microway.com/hpc-tech-tips/nvidia-smi_control-your-gpus/

kristianfreeman commented 10 years ago

@choseh, gave this a shot and didn't make any difference, unfortunately. #58 seems to have the same problem as well – looks like something specifically with the M2090s. I'm surprised yours is working because @cbuchner1 mentioned in #82 that there's a bug with CUDA 5.5 and running multiple GPUs under one instance of cudaminer. Tell me your secrets! haha

choseh commented 10 years ago

Well, first, I'm on Debian Wheezy with CUDA 5.5 + the bundled driver. I git clone the repo and run ./autogen.sh && ./configure && make, then I launch it inside a screen session with ./cudaminer -C2 -lF16x16 -o stratum+tcp://pool:3333 -O user:x

there's really nothing more to it :/

here's an nvidia-smi -q for the first card, the others are the same: http://nopaste.narf.at/show/2807/ and so on - currently not hashing, ECC disabled, but not restarted yet