Closed AndreaLanfranchi closed 5 years ago
I've tested your kernel and observed some invalid shares ("GPU 0 gave incorrect result. Lower overclocking values if it happens frequently."), but with my latest rebased work (based on ifdefelse's reference work; please see https://github.com/ifdefelse/ProgPOW/issues/17 or https://github.com/ethereum-mining/ethminer/pull/1766) I observed no invalid shares for several hours.
Thanks hackmod, but your implementation is ProgPoW-only. Mine is mixed ethash/ProgPoW.
I believe some symbol overlaps among the kernels.
Our reference implementation is also ProgPoW-only. I'd think an overlapping symbol would cause a more consistent failure.
If you log the block number, header, nonce, and result of the failure can you reproduce it? The fact that OpenCL mining works indicates it's not an algorithm or GPU issue.
Actually I can track down the block number, header, and nonce. For the result I have to change the kernel slightly, as it's not returned in the result structure; it'll take a little while.
The only relevant difference (I believe) in my implementation is that the DAG is generated using the "classic" ethash kernel.
@ifdefelse
I report here the results of a 1-hour batch on the test rig described above. The work provider is a local geth node in sync with the gangnam test net.
Block : 150,194
Header : 0x54d4acc043e26aef4d5058f2e17cfb7c6a61a8bbeb840fcae0a2c29236d0aecc
Boundary : 0x00000009608c69d5693877f008f57b9ca1c81d66c21ebe31419416af9b8faab5
Nonce : 0x3f9ba7b85ae38c75
From GPU : 0x0000000347baac2b
From CPU : 0xbbeaf6959ab793bd19fad208bda17d176eb752f67d481347d9a48d0de6fb96a7
Block : 150,213
Header : 0xbb6a55f7f6105e15fef782278ef5b38435d1fc93db50dec632efc9e6e73388d1
Boundary : 0x00000009573192d60cda98190aa305015379bf2ed6018da2972cbba7ea9fede8
Nonce : 0x4b1e57b661ab6e5d
From GPU : 0x0000000259c8fff3
From CPU : 0x9d10dd29f2131f78a430e6413f660714a3489abdc4ec39653d5dff5b79d24f16
Block : 150,222
Header : 0x26293c8a7ca191422a63d0482b100c64321c112432f51dccbb2fa78fd6890017
Boundary : 0x000000095e370ee5f520ba622b2e9fbb489aeeceb8bc9d7ae0ec565be074723f
Nonce : 0x86c0dc4ee43258db
From GPU : 0x000000053a66fef7
From CPU : 0x4d332efb748b2dfa0e26792f0fe05c98fd4b60794ac1a2db25ce0b400e251b95
Block : 150,392
Header : 0xbc490e700adc7a5eec52497716f03a3ca4bade4decdc25151449d103b5f543d4
Boundary : 0x000000094a7c496170a7a58e9c454834b55fa263d4d3fc5e2090e16d30afcf9b
Nonce : 0x4e7d9d8d1633eb76
From GPU : 0x000000019c72872c
From CPU : 0xb5dd24b1d526939c0e19643477d6cb4d3fd0df86f3e4069fa2b6cb9f7f852017
Block : 150,407
Header : 0x061c3e604372f069c49fe2f987d3b481fe850543d87f1da8bb71eebc15e955d7
Boundary : 0x000000094dfb90cac5f4c53238c3c09d81065dc302114ebe15f45191e65eb4cd
Nonce : 0x4fc210f2bb498bc0
From GPU : 0x0000000607380634
From CPU : 0x5fcf766c9ca760d2b598295e44c684802590eb807694246adc61bd41c02a9718
Block : 150,426
Header : 0x264c85189949edf5ddc04769617f81fe14d4c4db9de31a5097c1693bd59cd796
Boundary : 0x000000094cd4ba1bb253a4a91c57ed279c24b794a95c2419eec963639c95efb2
Nonce : 0x3742fcb348196e4c
From GPU : 0x0000000112857af3
From CPU : 0x6f329b5d6f115d661cd2f8a87eb803770a7333b2e4f2be563e88584023a595cf
The result from the GPU is only the upper 64 bits. Apparently the GPU is right in returning a result that is below the target; nevertheless, CPU re-evaluation of the same nonce produces a completely different result.
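The comparison described above can be sketched in a few lines. This is only an illustration using the upper 64 bits of the values listed in the report; the names Sample, kSamples, belowBoundary and countDisagreements are hypothetical, and the strict "less than" test on the upper bits is an assumption standing in for the miner's real 256-bit boundary check.

```cpp
#include <cstddef>
#include <cstdint>

// Upper 64 bits of the boundary, the GPU result and the CPU result for each
// failing share listed above (values copied from the report).
struct Sample
{
    std::uint64_t boundaryHi;
    std::uint64_t gpuHi;
    std::uint64_t cpuHi;
};

constexpr Sample kSamples[] = {
    {0x00000009608c69d5ULL, 0x0000000347baac2bULL, 0xbbeaf6959ab793bdULL},  // block 150,194
    {0x00000009573192d6ULL, 0x0000000259c8fff3ULL, 0x9d10dd29f2131f78ULL},  // block 150,213
    {0x000000095e370ee5ULL, 0x000000053a66fef7ULL, 0x4d332efb748b2dfaULL},  // block 150,222
    {0x000000094a7c4961ULL, 0x000000019c72872cULL, 0xb5dd24b1d526939cULL},  // block 150,392
    {0x000000094dfb90caULL, 0x0000000607380634ULL, 0x5fcf766c9ca760d2ULL},  // block 150,407
    {0x000000094cd4ba1bULL, 0x0000000112857af3ULL, 0x6f329b5d6f115d66ULL},  // block 150,426
};

// Assumption: with only the upper 64 bits available, a result is treated as
// meeting the target when those bits are strictly below the boundary's.
constexpr bool belowBoundary(std::uint64_t resultHi, std::uint64_t boundaryHi)
{
    return resultHi < boundaryHi;
}

// Counts the samples where the GPU value passes the check but the CPU value fails it.
std::size_t countDisagreements()
{
    std::size_t n = 0;
    for (const Sample& s : kSamples)
        if (belowBoundary(s.gpuHi, s.boundaryHi) && !belowBoundary(s.cpuHi, s.boundaryHi))
            ++n;
    return n;
}
```

For the six shares above, every GPU value is below its boundary while every CPU value is far above it, so countDisagreements() returns 6.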
What drives me mad is that those results are completely random. Note:
Just for info: after a 14-hour batch without interruptions, the ratio of invalid shares is stable at around 1.6%.
m 12:20:36 ethminer 14:12 A3205:F51 31.14 Mh { cu0 5.21 52C 70% | cu1 5.21 49C 70% | cu2 5.19 57C 70% | cu3 5.17 60C 70% | cu4 5.18 61C 70% | cu5 5.18 60C 70% }
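As a quick check of that figure, the ratio can be recomputed from the counters in the status line. This is a sketch under two assumptions: that A and F are the accepted and failed share counters, and that the ratio is failed shares over all found shares. The function name invalidSharePercent is hypothetical.

```cpp
#include <cstdint>

// Percentage of failed shares among all shares found (accepted + failed).
// Assumption: "A3205:F51" in the status line means 3205 accepted and 51 failed.
double invalidSharePercent(std::uint64_t accepted, std::uint64_t failed)
{
    const std::uint64_t total = accepted + failed;
    if (total == 0)
        return 0.0;  // nothing found yet, avoid dividing by zero
    return 100.0 * static_cast<double>(failed) / static_cast<double>(total);
}
```

invalidSharePercent(3205, 51) gives a value between 1.5 and 1.6, which is in line with the "around 1.6%" quoted above.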
I can't find any relevant data:
@ifdefelse that's interesting.
Running CUDA (same environment as above) on only 1 stream (so removing kernel concurrency) makes the problem automagically disappear. That is actually the most relevant difference between CUDA and OpenCL: the latter runs with only one queue, while the former uses 2 streams by default (the number can be increased).
You've restructured the order of some of the cuda calls, right? Sounds like there might be a race condition?
Actually no. I didn't change the order of any call.
The only relevant difference (I can think of) among the implementations is that streams in ethminer are created with the cudaStreamNonBlocking option, so they do not sync with stream 0 (the default stream). In your implementation they are instead created to sync with the default stream.
More feedback: I managed to have the streams created without the non-blocking option, but I still get the same results.
More than one stream creates invalids. With only one stream it is rock solid.
If you could test/review my implementation, it would be highly appreciated.
Will do, but it might take me a few days to get a chance. However I notice you have some general miner and ethash changes mixed along with the progpow changes. Is there a PR with just the progpow changes to review?
Unfortunately not. But I will post here all the commits involving ProgPoW. Thanks for your support.
The git bisect result indicates that invalid shares are observed after commit https://github.com/AndreaLanfranchi/ethminer/commit/a4e74a0518a
@ifdefelse I believe there might be an error in your implementation's logic for concurrent streams, which may explain why I experience invalids while you don't.
For instance: your logic appears to always have only 1 stream running, thus no kernel concurrency.
When you enter here for the first time
https://github.com/ifdefelse/ProgPOW/blob/master/libethash-cuda/CUDAMiner.cpp#L605-L609
you have m_current_index == 0, which is immediately incremented to 1; as a consequence, stream_index evaluates to 1.
Being in the first loop iteration, the test
if (m_current_index >= s_numStreams)
fails, so you skip https://github.com/ifdefelse/ProgPOW/blob/master/libethash-cuda/CUDAMiner.cpp#L616-L630 and jump to the kernel launch at https://github.com/ifdefelse/ProgPOW/blob/master/libethash-cuda/CUDAMiner.cpp#L631-L638
No other statement gets executed, and we jump back to the beginning of the loop, where m_current_index gets incremented again, now giving m_current_index == 2 and stream_index == 0.
As a consequence, the test at https://github.com/ifdefelse/ProgPOW/blob/master/libethash-cuda/CUDAMiner.cpp#L616 succeeds and the loop waits for the return from
CUDA_SAFE_CALL(cudaStreamSynchronize(stream));
And here is the problem:
Your streams are created without cudaStreamNonBlocking, so even if we're trying to sync a stream which doesn't have any running kernel, it nevertheless syncs with the default null stream, which in turn waits for the completion of all other streams that (according to the CUDA manual) do not have the cudaStreamNonBlocking option set: for those streams, the sync with the null stream is implicit.
In other words: syncing any "standard" stream is practically the same as calling cudaDeviceSynchronize.
Because of this, the 2nd kernel is not launched until the first has finished.
Unless I'm missing something huge, this loop never has two (or more) concurrent kernels, and the optional setting of more than 1 stream has no effect.
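To make the walkthrough above easier to follow, here is a small host-only model of the loop as described in this comment. It is not the code from CUDAMiner.cpp and makes no CUDA calls: SyncModel and maxConcurrentKernels are hypothetical names, and the WaitsForAllStreams mode simply encodes the hypothesis stated above about streams created without cudaStreamNonBlocking, which this sketch does not verify.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// How a stream synchronization is assumed to behave in this model.
enum class SyncModel
{
    WaitsForAllStreams,  // hypothesis above for streams created without cudaStreamNonBlocking
    WaitsForOwnStream    // behaviour expected for cudaStreamNonBlocking streams
};

// Replays `iterations` passes of the described loop and returns the highest
// number of kernels in flight at the same time. Requires numStreams >= 1.
std::size_t maxConcurrentKernels(std::size_t numStreams, std::size_t iterations, SyncModel model)
{
    std::vector<bool> busy(numStreams, false);  // busy[i]: stream i has a kernel in flight
    std::size_t currentIndex = 0;               // plays the role of m_current_index
    std::size_t maxBusy = 0;

    for (std::size_t it = 0; it < iterations; ++it)
    {
        ++currentIndex;
        const std::size_t streamIndex = currentIndex % numStreams;

        if (currentIndex >= numStreams)
        {
            // Stands in for cudaStreamSynchronize(stream).
            if (model == SyncModel::WaitsForAllStreams)
                std::fill(busy.begin(), busy.end(), false);
            else
                busy[streamIndex] = false;
        }

        busy[streamIndex] = true;  // stands in for the kernel launch on this stream
        const auto inFlight = static_cast<std::size_t>(std::count(busy.begin(), busy.end(), true));
        maxBusy = std::max(maxBusy, inFlight);
    }
    return maxBusy;
}
```

With 2 streams, the model never has more than 1 kernel in flight under WaitsForAllStreams, and reaches 2 under WaitsForOwnStream, which matches the difference in behaviour argued for above.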
In my implementation, at the very beginning of the search, I immediately launch all the stream kernels (without any device or stream sync among them), so I really have kernel concurrency. https://github.com/AndreaLanfranchi/ethminer/blob/master/libethash-cuda/CUDAMiner.cpp#L508-L524
All the following stream synchronizations (the streams being cudaStreamNonBlocking) wait only for the work of the specific stream to complete, without implicit synchronization with the null stream.
More feedback: I just cloned your repo and modified this line https://github.com/ifdefelse/ProgPOW/blob/824cd791634204c4cc7e31f84bb76c0c84895bd3/libethash-cuda/CUDAMiner.cpp#L406
From
CUDA_SAFE_CALL(cudaStreamCreate(&m_streams[i]));
To
CUDA_SAFE_CALL(cudaStreamCreateWithFlags(&m_streams[i], cudaStreamNonBlocking));
Built and run ... invalids appear immediately.
Here is the log of a test run on your implementation with the cudaStreamNonBlocking flag set.
100% invalids.
m 16:58:15|ethminer| ethminer version 0.15.0.dev0
m 16:58:15|ethminer| Build: linux / release +git. 824cd79
cu 16:58:15|ethminer| Using grid size 1024 , block size 512
m 16:58:15|ethminer| not-connected
ℹ 16:58:15|getwork | Connected to 10.0.0.113:8545
ℹ 16:58:15|getwork | Spinning up miners...
ℹ 16:58:15|cuda-0 | No work.
ℹ 16:58:15|cuda-0 | No work.
[ ... ]
ℹ 16:58:15|cuda-0 | No work.
ℹ 16:58:15|cuda-3 | No work.
ℹ 16:58:15|cuda-0 | Initialising miner 0
ℹ 16:58:15|cuda-3 | Initialising miner 3
ℹ 16:58:15|cuda-2 | No work.
ℹ 16:58:15|cuda-2 | Initialising miner 2
ℹ 16:58:15|cuda-5 | No work.
ℹ 16:58:15|cuda-1 | No work.
ℹ 16:58:15|cuda-1 | Initialising miner 1
ℹ 16:58:15|cuda-5 | Initialising miner 5
ℹ 16:58:15|getwork | New pool difficulty: 740.8907 megahashes
ℹ 16:58:15|cuda-4 | No work.
ℹ 16:58:15|getwork | Received new job #e0696191… from 10.0.0.113
ℹ 16:58:15|cuda-4 | Initialising miner 4
cu 16:58:16|cuda-0 | Using device: GeForce GTX 1050 Ti (Compute 6.1)
cu 16:58:16|cuda-0 | Set Device to current
cu 16:58:16|cuda-0 | Resetting device
cu 16:58:16|cuda-3 | Using device: GeForce GTX 1050 Ti (Compute 6.1)
cu 16:58:16|cuda-3 | Set Device to current
cu 16:58:16|cuda-3 | Resetting device
cu 16:58:16|cuda-1 | Using device: GeForce GTX 1050 Ti (Compute 6.1)
cu 16:58:16|cuda-1 | Set Device to current
cu 16:58:16|cuda-1 | Resetting device
cu 16:58:16|cuda-2 | Using device: GeForce GTX 1050 Ti (Compute 6.1)
cu 16:58:16|cuda-2 | Set Device to current
cu 16:58:16|cuda-2 | Resetting device
cu 16:58:16|cuda-5 | Using device: GeForce GTX 1050 Ti (Compute 6.1)
cu 16:58:16|cuda-5 | Set Device to current
cu 16:58:16|cuda-5 | Resetting device
cu 16:58:16|cuda-4 | Using device: GeForce GTX 1050 Ti (Compute 6.1)
cu 16:58:16|cuda-4 | Set Device to current
cu 16:58:16|cuda-4 | Resetting device
cu 16:58:18|cuda-3 | Allocating light with size: 17432512
cu 16:58:18|cuda-1 | Allocating light with size: 17432512
cu 16:58:18|cuda-2 | Allocating light with size: 17432512
cu 16:58:18|cuda-0 | Allocating light with size: 17432512
cu 16:58:18|cuda-4 | Allocating light with size: 17432512
cu 16:58:18|cuda-5 | Allocating light with size: 17432512
cu 16:58:18|cuda-0 | Generating mining buffers
cu 16:58:18|cuda-3 | Generating mining buffers
cu 16:58:18|cuda-4 | Generating mining buffers
cu 16:58:18|cuda-2 | Generating mining buffers
cu 16:58:18|cuda-1 | Generating mining buffers
cu 16:58:18|cuda-5 | Generating mining buffers
cu 16:58:18|cuda-0 | Generating DAG for GPU # 0 with dagBytes: 1115684224 gridSize: 1024
cu 16:58:18|cuda-4 | Generating DAG for GPU # 4 with dagBytes: 1115684224 gridSize: 1024
cu 16:58:18|cuda-3 | Generating DAG for GPU # 3 with dagBytes: 1115684224 gridSize: 1024
cu 16:58:18|cuda-2 | Generating DAG for GPU # 2 with dagBytes: 1115684224 gridSize: 1024
cu 16:58:18|cuda-5 | Generating DAG for GPU # 5 with dagBytes: 1115684224 gridSize: 1024
cu 16:58:18|cuda-1 | Generating DAG for GPU # 1 with dagBytes: 1115684224 gridSize: 1024
m 16:58:20|ethminer| Speed 0.00 Mh/s gpu/0 0.00 gpu/1 0.00 gpu/2 0.00 gpu/3 0.00 gpu/4 0.00 gpu/5 0.00 [A0+0:R0+0:F0] Time: 00:00
ℹ 16:58:21|getwork | Received new job #9aa1366c… from 10.0.0.113
cu 16:58:24|cuda-0 | Finished DAG
cu 16:58:24|cuda-2 | Finished DAG
cu 16:58:24|cuda-5 | Finished DAG
cu 16:58:24|cuda-1 | Finished DAG
cu 16:58:24|cuda-4 | Finished DAG
cu 16:58:24|cuda-3 | Finished DAG
cu 16:58:25|cuda-0 | Compile log:
cu 16:58:25|cuda-0 | JIT info:
ptxas info : 0 bytes gmem, 96 bytes cmem[3]
ptxas info : Compiling entry function '_Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb' for 'sm_61'
ptxas info : Function properties for _Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb
ptxas . 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 64 registers, 16384 bytes smem, 385 bytes cmem[0], 4 bytes cmem[2]
ptxas info : Function properties for _Z11keccak_f8008hash32_tyS_
ptxas . 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
cu 16:58:25|cuda-0 | JIT err:
cu 16:58:25|cuda-0 | Mangled name: _Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb
cu 16:58:25|cuda-0 | done compiling
m 16:58:25|ethminer| Speed 0.00 Mh/s gpu/0 0.00 gpu/1 0.00 gpu/2 0.00 gpu/3 0.00 gpu/4 0.00 gpu/5 0.00 [A0+0:R0+0:F0] Time: 00:00
cu 16:58:25|cuda-2 | Compile log:
cu 16:58:25|cuda-1 | Compile log:
cu 16:58:25|cuda-5 | Compile log:
cu 16:58:26|cuda-4 | Compile log:
cu 16:58:26|cuda-3 | Compile log:
cu 16:58:26|cuda-1 | JIT info:
ptxas info : 0 bytes gmem, 96 bytes cmem[3]
ptxas info : Compiling entry function '_Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb' for 'sm_61'
ptxas info : Function properties for _Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb
ptxas . 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 64 registers, 16384 bytes smem, 385 bytes cmem[0], 4 bytes cmem[2]
ptxas info : Function properties for _Z11keccak_f8008hash32_tyS_
ptxas . 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
cu 16:58:26|cuda-1 | JIT err:
cu 16:58:26|cuda-5 | JIT info:
ptxas info : 0 bytes gmem, 96 bytes cmem[3]
ptxas info : Compiling entry function '_Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb' for 'sm_61'
ptxas info : Function properties for _Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb
ptxas . 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 64 registers, 16384 bytes smem, 385 bytes cmem[0], 4 bytes cmem[2]
ptxas info : Function properties for _Z11keccak_f8008hash32_tyS_
ptxas . 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
cu 16:58:26|cuda-1 | Mangled name: _Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb
cu 16:58:26|cuda-5 | JIT err:
cu 16:58:26|cuda-5 | Mangled name: _Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb
cu 16:58:26|cuda-5 | done compiling
cu 16:58:26|cuda-4 | JIT info:
ptxas info : 0 bytes gmem, 96 bytes cmem[3]
ptxas info : Compiling entry function '_Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb' for 'sm_61'
ptxas info : Function properties for _Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb
ptxas . 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 64 registers, 16384 bytes smem, 385 bytes cmem[0], 4 bytes cmem[2]
ptxas info : Function properties for _Z11keccak_f8008hash32_tyS_
ptxas . 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
cu 16:58:26|cuda-1 | done compiling
cu 16:58:26|cuda-4 | JIT err:
cu 16:58:26|cuda-4 | Mangled name: _Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb
cu 16:58:26|cuda-2 | JIT info:
ptxas info : 0 bytes gmem, 96 bytes cmem[3]
ptxas info : Compiling entry function '_Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb' for 'sm_61'
ptxas info : Function properties for _Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb
ptxas . 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 64 registers, 16384 bytes smem, 385 bytes cmem[0], 4 bytes cmem[2]
ptxas info : Function properties for _Z11keccak_f8008hash32_tyS_
ptxas . 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
cu 16:58:26|cuda-4 | done compiling
cu 16:58:26|cuda-2 | JIT err:
cu 16:58:26|cuda-2 | Mangled name: _Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb
cu 16:58:26|cuda-2 | done compiling
cu 16:58:26|cuda-3 | JIT info:
ptxas info : 0 bytes gmem, 96 bytes cmem[3]
ptxas info : Compiling entry function '_Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb' for 'sm_61'
ptxas info : Function properties for _Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb
ptxas . 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 64 registers, 16384 bytes smem, 385 bytes cmem[0], 4 bytes cmem[2]
ptxas info : Function properties for _Z11keccak_f8008hash32_tyS_
ptxas . 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
cu 16:58:26|cuda-3 | JIT err:
cu 16:58:26|cuda-3 | Mangled name: _Z14progpow_searchy8hash32_tyPK5dag_tPV14search_resultsb
cu 16:58:26|cuda-3 | done compiling
✘ 16:58:30|cuda-1 | GPU gave incorrect result!
m 16:58:30|ethminer| Speed 24.38 Mh/s gpu/0 4.72 gpu/1 3.93 gpu/2 3.93 gpu/3 3.93 gpu/4 3.93 gpu/5 3.93 [A0+0:R0+0:F1] Time: 00:00
ℹ 16:58:33|getwork | Received new job #8de10aec… from 10.0.0.113
m 16:58:35|ethminer| Speed 28.49 Mh/s gpu/0 5.01 gpu/1 4.72 gpu/2 4.72 gpu/3 4.66 gpu/4 4.66 gpu/5 4.72 [A0+0:R0+0:F1] Time: 00:00
m 16:58:40|ethminer| Speed 32.09 Mh/s gpu/0 5.35 gpu/1 5.40 gpu/2 5.35 gpu/3 5.30 gpu/4 5.35 gpu/5 5.35 [A0+0:R0+0:F1] Time: 00:00
ℹ 16:58:42|getwork | Received new job #7b36845f… from 10.0.0.113
m 16:58:45|ethminer| Speed 32.03 Mh/s gpu/0 5.39 gpu/1 5.34 gpu/2 5.34 gpu/3 5.29 gpu/4 5.34 gpu/5 5.34 [A0+0:R0+0:F1] Time: 00:00
m 16:58:50|ethminer| Speed 31.98 Mh/s gpu/0 5.34 gpu/1 5.39 gpu/2 5.34 gpu/3 5.29 gpu/4 5.34 gpu/5 5.29 [A0+0:R0+0:F1] Time: 00:00
✘ 16:58:50|cuda-3 | GPU gave incorrect result!
✘ 16:58:51|cuda-3 | GPU gave incorrect result!
ℹ 16:58:54|getwork | Received new job #ec20a51d… from 10.0.0.113
m 16:58:55|ethminer| Speed 32.08 Mh/s gpu/0 5.39 gpu/1 5.39 gpu/2 5.34 gpu/3 5.29 gpu/4 5.34 gpu/5 5.34 [A0+0:R0+0:F3] Time: 00:00
m 16:59:00|ethminer| Speed 32.03 Mh/s gpu/0 5.39 gpu/1 5.34 gpu/2 5.34 gpu/3 5.29 gpu/4 5.34 gpu/5 5.34 [A0+0:R0+0:F3] Time: 00:00
ℹ 16:59:03|getwork | Received new job #12c5beb2… from 10.0.0.113
m 16:59:05|ethminer| Speed 31.73 Mh/s gpu/0 5.32 gpu/1 5.32 gpu/2 5.27 gpu/3 5.27 gpu/4 5.27 gpu/5 5.27 [A0+0:R0+0:F3] Time: 00:00
m 16:59:10|ethminer| Speed 32.03 Mh/s gpu/0 5.39 gpu/1 5.34 gpu/2 5.34 gpu/3 5.34 gpu/4 5.29 gpu/5 5.34 [A0+0:R0+0:F3] Time: 00:00
ℹ 16:59:12|getwork | Received new job #bc1ad797… from 10.0.0.113
m 16:59:15|ethminer| Speed 32.13 Mh/s gpu/0 5.39 gpu/1 5.39 gpu/2 5.34 gpu/3 5.34 gpu/4 5.34 gpu/5 5.34 [A0+0:R0+0:F3] Time: 00:01
m 16:59:20|ethminer| Speed 32.08 Mh/s gpu/0 5.34 gpu/1 5.39 gpu/2 5.34 gpu/3 5.34 gpu/4 5.34 gpu/5 5.34 [A0+0:R0+0:F3] Time: 00:01
ℹ 16:59:24|getwork | Received new job #2450b2c7… from 10.0.0.113
m 16:59:25|ethminer| Speed 32.03 Mh/s gpu/0 5.39 gpu/1 5.34 gpu/2 5.34 gpu/3 5.34 gpu/4 5.29 gpu/5 5.34 [A0+0:R0+0:F3] Time: 00:01
m 16:59:30|ethminer| Speed 32.03 Mh/s gpu/0 5.34 gpu/1 5.39 gpu/2 5.34 gpu/3 5.29 gpu/4 5.34 gpu/5 5.34 [A0+0:R0+0:F3] Time: 00:01
ℹ 16:59:33|getwork | Received new job #8aada8e3… from 10.0.0.113
m 16:59:35|ethminer| Speed 31.76 Mh/s gpu/0 5.29 gpu/1 5.34 gpu/2 5.29 gpu/3 5.24 gpu/4 5.29 gpu/5 5.29 [A0+0:R0+0:F3] Time: 00:01
✘ 16:59:39|cuda-3 | GPU gave incorrect result!
m 16:59:40|ethminer| Speed 31.98 Mh/s gpu/0 5.39 gpu/1 5.34 gpu/2 5.34 gpu/3 5.29 gpu/4 5.34 gpu/5 5.29 [A0+0:R0+0:F4] Time: 00:01
m 16:59:45|ethminer| Speed 31.99 Mh/s gpu/0 5.34 gpu/1 5.39 gpu/2 5.34 gpu/3 5.29 gpu/4 5.29 gpu/5 5.34 [A0+0:R0+0:F4] Time: 00:01
ℹ 16:59:45|getwork | Received new job #bd704477… from 10.0.0.113
m 16:59:50|ethminer| Speed 32.03 Mh/s gpu/0 5.39 gpu/1 5.34 gpu/2 5.34 gpu/3 5.29 gpu/4 5.34 gpu/5 5.34 [A0+0:R0+0:F4] Time: 00:01
ℹ 16:59:54|getwork | Received new job #01dbb003… from 10.0.0.113
m 16:59:55|ethminer| Speed 31.67 Mh/s gpu/0 5.35 gpu/1 5.30 gpu/2 5.24 gpu/3 5.24 gpu/4 5.30 gpu/5 5.24 [A0+0:R0+0:F4] Time: 00:01
m 17:00:00|ethminer| Speed 32.03 Mh/s gpu/0 5.34 gpu/1 5.39 gpu/2 5.34 gpu/3 5.29 gpu/4 5.34 gpu/5 5.34 [A0+0:R0+0:F4] Time: 00:01
ℹ 17:00:03|getwork | Received new job #d8f4bc20… from 10.0.0.113
m 17:00:05|ethminer| Speed 31.63 Mh/s gpu/0 5.31 gpu/1 5.31 gpu/2 5.26 gpu/3 5.21 gpu/4 5.26 gpu/5 5.26 [A0+0:R0+0:F4] Time: 00:01
m 17:00:10|ethminer| Speed 32.13 Mh/s gpu/0 5.39 gpu/1 5.39 gpu/2 5.34 gpu/3 5.34 gpu/4 5.34 gpu/5 5.34 [A0+0:R0+0:F4] Time: 00:01
✘ 17:00:15|cuda-2 | GPU gave incorrect result!
m 17:00:15|ethminer| Speed 32.11 Mh/s gpu/0 5.36 gpu/1 5.41 gpu/2 5.36 gpu/3 5.31 gpu/4 5.31 gpu/5 5.36 [A0+0:R0+0:F5] Time: 00:02
ℹ 17:00:15|getwork | Received new job #0da09b32… from 10.0.0.113
m 17:00:20|ethminer| Speed 32.08 Mh/s gpu/0 5.39 gpu/1 5.34 gpu/2 5.34 gpu/3 5.34 gpu/4 5.34 gpu/5 5.34 [A0+0:R0+0:F5] Time: 00:02
ℹ 17:00:24|getwork | Received new job #1199cc91… from 10.0.0.113
m 17:00:25|ethminer| Speed 31.57 Mh/s gpu/0 5.30 gpu/1 5.30 gpu/2 5.24 gpu/3 5.24 gpu/4 5.24 gpu/5 5.24 [A0+0:R0+0:F5] Time: 00:02
m 17:00:30|ethminer| Speed 31.98 Mh/s gpu/0 5.39 gpu/1 5.34 gpu/2 5.34 gpu/3 5.29 gpu/4 5.34 gpu/5 5.29 [A0+0:R0+0:F5] Time: 00:02
✘ 17:00:32|cuda-0 | GPU gave incorrect result!
ℹ 17:00:33|getwork | Received new job #54e86a1b… from 10.0.0.113
m 17:00:35|ethminer| Speed 31.76 Mh/s gpu/0 5.34 gpu/1 5.34 gpu/2 5.28 gpu/3 5.23 gpu/4 5.28 gpu/5 5.28 [A0+0:R0+0:F6] Time: 00:02
m 17:00:40|ethminer| Speed 32.03 Mh/s gpu/0 5.34 gpu/1 5.39 gpu/2 5.34 gpu/3 5.29 gpu/4 5.34 gpu/5 5.34 [A0+0:R0+0:F6] Time: 00:02
m 17:00:45|ethminer| Speed 31.95 Mh/s gpu/0 5.38 gpu/1 5.33 gpu/2 5.33 gpu/3 5.29 gpu/4 5.29 gpu/5 5.33 [A0+0:R0+0:F6] Time: 00:02
ℹ 17:00:45|getwork | Received new job #e648b278… from 10.0.0.113
✘ 17:00:47|cuda-4 | GPU gave incorrect result!
m 17:00:50|ethminer| Speed 32.03 Mh/s gpu/0 5.39 gpu/1 5.39 gpu/2 5.34 gpu/3 5.29 gpu/4 5.34 gpu/5 5.29 [A0+0:R0+0:F7] Time: 00:02
✘ 17:00:53|cuda-5 | GPU gave incorrect result!
ℹ 17:00:54|getwork | Received new job #f9e290af… from 10.0.0.113
m 17:00:55|ethminer| Speed 31.73 Mh/s gpu/0 5.30 gpu/1 5.30 gpu/2 5.30 gpu/3 5.24 gpu/4 5.30 gpu/5 5.30 [A0+0:R0+0:F8] Time: 00:02
✘ 17:00:58|cuda-4 | GPU gave incorrect result!
m 17:01:00|ethminer| Speed 31.98 Mh/s gpu/0 5.39 gpu/1 5.34 gpu/2 5.29 gpu/3 5.29 gpu/4 5.34 gpu/5 5.34 [A0+0:R0+0:F9] Time: 00:02
✘ 17:01:01|cuda-1 | GPU gave incorrect result!
m 17:01:05|ethminer| Speed 32.03 Mh/s gpu/0 5.34 gpu/1 5.39 gpu/2 5.34 gpu/3 5.34 gpu/4 5.29 gpu/5 5.34 [A0+0:R0+0:F10] Time: 00:02
ℹ 17:01:06|getwork | Received new job #5259b49c… from 10.0.0.113
ℹ 17:01:07|ethminer| Shutting down...
ℹ 17:01:07|ethminer| Disconnected from 10.0.0.113
ℹ 17:01:07|ethminer| Shutting down miners...
@AndreaLanfranchi wrote: More feedback: I just cloned your repo and modified this line (ProgPOW/libethash-cuda/CUDAMiner.cpp, line 406 in 824cd79)
From
CUDA_SAFE_CALL(cudaStreamCreate(&m_streams[i]));
To
CUDA_SAFE_CALL(cudaStreamCreateWithFlags(&m_streams[i], cudaStreamNonBlocking));
Built and run ... invalids appear immediately.
The reference work didn't touch m_current_index
https://github.com/ifdefelse/ProgPOW/blame/master/libethash-cuda/CUDAMiner.cpp#L605-L609
See also my rebased work (based on ifdefelse's work) https://github.com/hackmod/ethminer/blame/progpow-master-rebase-v0.9.2/libethash-cuda/CUDAMiner.cpp#L441-L457
My rebased work uses CUDA_SAFE_CALL(cudaStreamCreateWithFlags(&m_streams[i], cudaStreamNonBlocking));
The problem is already solved by commit https://github.com/ethereum-mining/ethminer/commit/efc2e879db48d60c96bae270cd1767ff62c54d4c
resolved by PR https://github.com/AndreaLanfranchi/ethminer/pull/44
Excellent, thanks for the update! Closing this issue out.
Test environment: Linux Ubuntu 16.04, CUDA 10, NVIDIA driver 410.79, 6x GTX 1050 Ti
Software: https://github.com/AndreaLanfranchi/ethminer with 0.9.2 implemented
Running the GPUs on CUDA I get 33 to 35.5 Mh (overall) but keep getting 1.3% to 1.7% invalid shares (2-hour test batch). Running the same test on the same GPUs with OpenCL I get 30 to 32 Mh but 0% invalid shares (same 2-hour batch).
For the sake of precision: I am mining on a private node linked to the gangnam network, with a minimum difficulty of 430 Mh (nicehash index 0.1).
Am I missing something?