Open joesixpack opened 6 years ago
I have the same problem for my 1070 Ti rigs only, for some reason...
My ccminer was crashing on a vmovdqa instruction. I've fixed it by changing the neoscrypt_xor function inside sph/neoscrypt.cpp:
--- neoscrypt.cpp.old 2017-11-21 17:43:36.000000000 +0100
+++ neoscrypt.cpp 2017-12-15 19:19:37.550565124 +0100
@@ -481,10 +481,10 @@
ulong *src = (ulong *) srcp;
uint i, tail;
- for(i = 0; i < (len / sizeof(ulong)); i++)
- dst[i] ^= src[i];
+// for(i = 0; i < (len / sizeof(ulong)); i++)
+// dst[i] ^= src[i];
- tail = len & (sizeof(ulong) - 1);
+ tail = len;// & (sizeof(ulong) - 1);
if(tail) {
uchar *dstb = (uchar *) dstp;
uchar *srcb = (uchar *) srcp;
@KlausT Please implement this.
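For context: vmovdqa is the aligned SSE/AVX move, and it faults when its memory operand is not aligned to the vector width. A plausible but unconfirmed explanation is that the compiler vectorized the ulong loop on the assumption that both pointers are at least 8-byte aligned, which the callers' buffers may not be. The patch above avoids this by XORing every byte individually. A minimal sketch of a middle ground, which keeps word-sized XORs but goes through memcpy so that no alignment is assumed (`xor_unaligned` is a hypothetical name, not a function in ccminer):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of ccminer): XOR len bytes of src into dst
// without assuming that either pointer is aligned.
void xor_unaligned(void *dstp, const void *srcp, std::size_t len) {
    unsigned char *dst = static_cast<unsigned char *>(dstp);
    const unsigned char *src = static_cast<const unsigned char *>(srcp);
    std::size_t i = 0;

    // Word-sized chunks: copying into locals with memcpy lets the compiler
    // emit unaligned loads/stores instead of assuming 8-byte alignment.
    for (; i + sizeof(std::uint64_t) <= len; i += sizeof(std::uint64_t)) {
        std::uint64_t a, b;
        std::memcpy(&a, dst + i, sizeof a);
        std::memcpy(&b, src + i, sizeof b);
        a ^= b;
        std::memcpy(dst + i, &a, sizeof a);
    }

    // Remaining tail bytes, one at a time.
    for (; i < len; ++i)
        dst[i] ^= src[i];
}
```

Modern compilers typically turn the fixed-size memcpy calls into plain load/store instructions, so this should cost little compared with the original loop, though that is worth measuring rather than assuming.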
On my system ccminer is not crashing at all. I can't reproduce this. But ok, I will see what I can do.
These issues are on 1080 Ti and 1070 Ti, FYI
Could be related to the intensity / the memory size. The latest commit will fix it, I hope. Since I don't have a Ti card I can't test it here.
As far as I understand, this code runs on the CPU, so it shouldn't be related to the GPU. I thought the issue was related to my compiler, but that wouldn't explain the crashes others see with the official binaries. I wonder if the commit fixes their problem, too.
My config, if it helps: i7-4790 (Haswell), 1050 Ti, Ubuntu 17.10, GCC 7.2, CUDA 8.
I'm hoping that this commit: https://github.com/KlausT/ccminer/commit/250e14cbaf4cdd432a223ec0e526a103c6eedb9f will fix the CUDA errors, because there was a possible integer overflow that could cause illegal memory accesses on the GPU. The segfaults on Linux systems probably have other causes.
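The commit itself is not reproduced here. As a generic illustration of how an integer overflow in a size computation can lead to out-of-range memory accesses at high intensity, here is a minimal sketch; the function names and the `threads` / `bytes_per_thread` parameters are hypothetical and not taken from ccminer:

```cpp
#include <cstdint>

// Hypothetical parameters, not ccminer's actual variables.
// 32-bit product: wraps around once threads * bytes_per_thread reaches 2^32.
std::uint32_t scratch_bytes_32(std::uint32_t threads, std::uint32_t bytes_per_thread) {
    return threads * bytes_per_thread;
}

// Widen one operand first so the multiplication is carried out in 64 bits.
std::uint64_t scratch_bytes_64(std::uint32_t threads, std::uint32_t bytes_per_thread) {
    return static_cast<std::uint64_t>(threads) * bytes_per_thread;
}
```

With 2^20 threads and 8192 bytes per thread, the 32-bit version returns 0 instead of 8589934592, so any offset derived from it would point at the wrong place. That would be consistent with a problem that only shows up on cards with more memory or at higher intensity, although that link is an assumption here.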
Please test the latest commits: Source: https://github.com/KlausT/ccminer/archive/cuda9.zip Windows binary: ccminer-test-x64.zip
Compilation of the current git version dies on my linux:
nvcc -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_50,code=sm_50 -gencode=arch=compute_37,code=sm_37 -gencode=arch=compute_35,code=sm_35 -gencode=arch=compute_30,code=sm_30 -I/usr/local/cuda/include -I. -O3 -std=c++11 -Xcompiler -fno-strict-aliasing -Wall -D_FORCE_INLINES --ptxas-options="-v" --maxrregcount=128 -o cuda_groestlcoin.o -c cuda_groestlcoin.cu
nvcc fatal : Unknown option 'Wall'
Makefile:1882: recipe for target 'cuda_groestlcoin.o' failed
If I change "-Xcompiler -fno-strict-aliasing -Wall" to "-Xcompiler -fno-strict-aliasing,-Wall", then it works.
Compilation of the attached cuda9.zip additionally dies with CUDA 8 at:
nvcc -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_50,code=sm_50 -gencode=arch=compute_37,code=sm_37 -gencode=arch=compute_35,code=sm_35 -gencode=arch=compute_30,code=sm_30 -gencode=arch=compute_70,code=sm_70 -I/usr/local/cuda/include -I. -O3 -std=c++11 -Xcompiler -fno-strict-aliasing,-Wall -D_FORCE_INLINES --ptxas-options="-v" --maxrregcount=128 -o cuda_groestlcoin.o -c cuda_groestlcoin.cu
nvcc fatal : Unsupported gpu architecture 'compute_70'
Makefile:1883: recipe for target 'cuda_groestlcoin.o' failed
OK, I have fixed configure.sh (I think). The windows branch is for CUDA 8; the cuda9 branch is for CUDA 9.x.
Okay, the Neoscrypt patch referenced earlier by nfllab applied to 8.15 results in a core dump when exiting the miner:
https://github.com/KlausT/ccminer/archive/cuda9.zip on 16.04 or 17.10 has the same compile error as this one: https://github.com/KlausT/ccminer/issues/92
I'm not able to test the pre-compiled Windows version atm (to see if it still crashes mining Neoscrypt on a Ti). Can someone else?
Please don't use Ubuntu 17.10; I don't know if it's compatible. 16.04 should be OK, or 17.04 for CUDA 9.1. Using the latest Linux versions for compiling ccminer is generally a bad idea.
Looks like 8.17 fixed the issues. Knock on wood!
I thought you were using 8.17 now?
Whoops, not sure what happened there.
Still happening on 8.17. It's not exclusive to NeoScrypt as it happened on Groestl too. It seems less to do with any mining and more about the exiting.
Ooooh, and I was wondering which algorithm threw those errors. So it was KlausT?
Actually, I think it doesn't always result in a simple error window, but I think this is what crashed one of my rigs overnight, making it reboot. Now I'm not sure what to do.
It's not crashing at all on my system, I can't reproduce this.
I think I found another cause for the crashing. But still I have rigs that show exactly such an error window, and several of those stacking when enough time passes. I can't positively tell if the KlausT version is the culprit here, though =(.
It is KlausT. None of the other ccminers have this problem. Here's a copy of the problem details from the dialog box (8.18 CUDA 9.1):
Problem signature:
Problem Event Name: APPCRASH
Application Name: ccminer.exe
Application Version: 0.0.0.0
Application Timestamp: 5a4a6e89
Fault Module Name: StackHash_48d7
Fault Module Version: 6.1.7601.23915
Fault Module Timestamp: 59b94ee4
Exception Code: c0000374
Exception Offset: 00000000000bf3e2
OS Version: 6.1.7601.2.1.0.256.1
Locale ID: 1033
Additional Information 1: 48d7
Additional Information 2: 48d7d9e7b54549393f69a4a65eee70d7
Additional Information 3: 05fe
Additional Information 4: 05feaa65322395330360a8f5ca947f22
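For reference, exception code c0000374 is STATUS_HEAP_CORRUPTION on Windows. The thread does not establish what corrupts the heap. One common way to get heap corruption at exit is a buffer being freed twice when more than one exit path runs the same cleanup. Purely as an illustration of guarding against that, and assuming nothing about ccminer's actual shutdown code, a minimal sketch (all names are hypothetical):

```cpp
#include <atomic>
#include <cstdlib>

// Hypothetical resources; names are illustrative, not ccminer's.
static void *g_scratch = nullptr;
static std::atomic<bool> g_cleaned{false};
static int g_free_calls = 0;  // counts frees, for demonstration only

void allocate_scratch(std::size_t n) { g_scratch = std::malloc(n); }

// May be called from several exit paths; only the first call frees anything.
void cleanup_once() {
    if (g_cleaned.exchange(true))
        return;  // an earlier call already released the buffer
    std::free(g_scratch);
    g_scratch = nullptr;
    ++g_free_calls;
}
```

Calling `cleanup_once()` twice after `allocate_scratch(64)` frees the buffer only once. Whether this pattern is relevant to the crash reported above is an open question.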
Sadly I still had a lot of problems with KlausT especially on my 1070 Ti and 1080 Ti rigs, so I had to completely deactivate it and switch to TPruvot.
I don't understand why it is crashing on your system, but not on mine. Windows users: what's different on your system? I'm using Windows 10 with all the latest updates, 8 GB RAM, GTX 1070, Nvidia driver 388.71.
By the way, would you please test the latest version?
Try it with 1070 Ti or 1080 Ti I guess?
If you give me the money to buy one
Do you have a PayPal charity link or a wallet address? I guess enough people should be willing to give you enough for a 1070 Ti 👍 It's "just" half an ETH ^^.
Tpruvot is no panacea. It has already slowed down, locked up, BSOD'd, or crashed on skein, neoscrypt and lyra2v2, and who knows how many more to come. Basically, every time Tpruvot incorporates a third-party speedup or new algo, it makes the whole enchilada even more unstable.
TPruvot is what has been running stably for several days now on my 13x 1070 Ti rigs. I do me and you do you.
1080 Tis, Windows 7 with the latest updates, NVIDIA 388.59 here. Brain, what NVIDIA driver are you using with Tpruvot?
Just so you know, KlausT, this crash on exit is not specific to any algo; it happens with anything the miner runs [or tries to run]. So as I said before, the problem is in the exiting and not the mining. Judging by the text output, you're doing something different on exit that none of the other ccminers are doing. What is different about this working binary https://github.com/KlausT/ccminer/files/1236886/ccminer-neoscrypt-1080ti-test.zip compared to the latest (besides having the -r bug)?
1070 Ti, Windows 10 Pro, NVIDIA 388.71. The rig I happened to look at just now has been running TPruvot lyra2z on 13x 1070 Ti for 2 hours straight. No errors at all.
Seems like CUDA 9.1 is only supported on 388.71 even though the SDK came out long before. Doesn't make a difference to the exit crashing, though.
I use MPM which is multi-algo profit-switching all the time. I see no exit crash problems currently, at least not every day. On one of my rigs I do see this regularly but it also happens on DSTM Equihash, so I think one of the cards just has a tad bit too much OC.
I have made a small change now, maybe this will help. Windows binary: ccminer-818exitfix-debug-cuda91-x64.zip
Interesting, it still crashes on exit. Does that mean it's the -r fix?
I don't think so. Maybe it's this line that was added two months ago: https://github.com/KlausT/ccminer/blob/f2c02c0454b11ada0098597e6d02c83c9ed2e38a/ccminer.cpp#L460 That would only affect Linux systems, I think.
I have the same issue with a GTX 1070 Ti and a GTX 970. I tried with and without an OC, same result. Windows 10 Pro, 388.71 driver, 8 GB RAM.
The crash dialog that pops up is actually WerFault.exe. Whether or not you disable error reporting, it will still show up. On a 1080 non-Ti, no crashing on any algo at all.
Does it say something like this?
"The instruction at 0x0000000075983703 referenced memory at 0x0000000000000000. The memory could not be read"
Maybe this could help: answers.microsoft.com
The test miner provided at: https://github.com/KlausT/ccminer/issues/50
...is the only one that doesn't crash on my 1080Ti for neoscrypt. However, the latest 8.15 release still has that crashing bug, but may have the -r 0 fix referenced here: https://github.com/KlausT/ccminer/issues/57
I need both in one miner.