macOS Kawpow Errors [Or mistake in setup]

Sammed98 commented 3 years ago

Describe the bug I want to mine using the Kawpow algorithm on my GPU. I am aware that Kawpow does not run on the CPU, and I am fine with that. I have modified the config.json file, changing the algo value, the pool address, the coin and my address. But when I execute xmrig I get the following set of errors. I don't know whether they are errors, but since they are printed to the console in red I think they are. The error image is as follows:

The config file is as follows: { "api": { "id": null, "worker-id": null }, "http": { "enabled": false, "host": "127.0.0.1", "port": 0, "access-token": null, "restricted": true }, "autosave": true, "background": false, "colors": true, "title": true, "randomx": { "init": -1, "init-avx2": -1, "mode": "auto", "1gb-pages": false, "rdmsr": true, "wrmsr": false, "cache_qos": false, "numa": true, "scratchpad_prefetch_mode": 1 }, "cpu": { "enabled": false, "huge-pages": true, "huge-pages-jit": false, "hw-aes": null, "priority": null, "memory-pool": false, "yield": true, "asm": true, "argon2-impl": null, "astrobwt-max-size": 550, "astrobwt-avx2": false, "argon2": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "astrobwt": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "cn": [ [1, 0], [1, 2], [1, 4], [1, 6], [1, 8], [1, 10] ], "cn-heavy": [ [1, 0], [1, 2], [1, 4] ], "cn-lite": [ [1, 0], [1, 1], [1, 2], [1, 3], [1, 4], [1, 5], [1, 6], [1, 7], [1, 8], [1, 9], [1, 10], [1, 11] ], "cn-pico": [ [2, 0], [2, 1], [2, 2], [2, 3], [2, 4], [2, 5], [2, 6], [2, 7], [2, 8], [2, 9], [2, 10], [2, 11] ], "cn/upx2": [ [2, 0], [2, 1], [2, 2], [2, 3], [2, 4], [2, 5], [2, 6], [2, 7], [2, 8], [2, 9], [2, 10], [2, 11] ], "rx": [0, 2, 4, 6, 8, 10], "rx/wow": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], "cn/0": false, "cn-lite/0": false, "rx/arq": "rx/wow", "rx/keva": "rx/wow" }, "opencl": { "enabled": true, "cache": true, "loader": null, "astrobwt": [ { "index": 0, "intensity": 64, "threads": [-1, -1] }, { "index": 1, "intensity": 192, "threads": [-1, -1] } ], "cn": [ { "index": 1, "intensity": 320, "worksize": 8, "strided_index": [1, 2], "threads": [-1, -1], "unroll": 8 } ], "cn-heavy": [ { "index": 1, "intensity": 160, "worksize": 8, "strided_index": [1, 2], "threads": [-1, -1], "unroll": 8 } ], "cn-lite": [ { "index": 0, "intensity": 192, "worksize": 8, "strided_index": [0, 2], "threads": [-1], "unroll": 8 }, { "index": 1, "intensity": 800, "worksize": 8, "strided_index": [1, 2], "threads": [-1, -1], "unroll": 8 } 
], "cn-pico": [ { "index": 0, "intensity": 384, "worksize": 8, "strided_index": [0, 2], "threads": [-1], "unroll": 8 }, { "index": 1, "intensity": 1920, "worksize": 8, "strided_index": [2, 2], "threads": [-1, -1], "unroll": 8 } ], "cn/2": [ { "index": 1, "intensity": 320, "worksize": 8, "strided_index": [2, 2], "threads": [-1, -1], "unroll": 8 } ], "cn/upx2": [ { "index": 0, "intensity": 384, "worksize": 8, "strided_index": [0, 2], "threads": [-1], "unroll": 8 }, { "index": 1, "intensity": 1920, "worksize": 8, "strided_index": [2, 2], "threads": [-1, -1], "unroll": 8 } ], "kawpow": [ { "index": 0, "intensity": 6291456, "worksize": 256, "threads": [-1] }, { "index": 1, "intensity": 5242880, "worksize": 256, "threads": [-1] } ], "rx": [ { "index": 0, "intensity": 320, "worksize": 8, "threads": [-1], "bfactor": 6, "gcn_asm": false, "dataset_host": true }, { "index": 1, "intensity": 320, "worksize": 8, "threads": [-1, -1], "bfactor": 6, "gcn_asm": false, "dataset_host": false } ], "rx/arq": [ { "index": 0, "intensity": 384, "worksize": 8, "threads": [-1], "bfactor": 6, "gcn_asm": false, "dataset_host": true }, { "index": 1, "intensity": 320, "worksize": 8, "threads": [-1, -1], "bfactor": 6, "gcn_asm": false, "dataset_host": false } ], "rx/wow": [ { "index": 0, "intensity": 384, "worksize": 8, "threads": [-1], "bfactor": 6, "gcn_asm": false, "dataset_host": true }, { "index": 1, "intensity": 320, "worksize": 8, "threads": [-1, -1], "bfactor": 6, "gcn_asm": false, "dataset_host": false } ], "cn/0": false, "cn-lite/0": false }, "cuda": { "enabled": false, "loader": null, "cn/0": false, "cn-lite/0": false }, "log-file": null, "donate-level": 0, "donate-over-proxy": 0, "pools": [ { "algo": "kawpow", "coin": null, "url": "poolAddress", "user": "CoinName:Address.workerName", "pass": "x", "rig-id": null, "nicehash": false, "keepalive": false, "enabled": true, "tls": false, "tls-fingerprint": null, "daemon": false, "socks5": null, "self-select": null, "submit-to-origin": false 
} ], "retries": 5, "retry-pause": 5, "print-time": 60, "dmi": true, "syslog": false, "tls": { "enabled": false, "protocols": null, "cert": null, "cert_key": null, "ciphers": null, "ciphersuites": null, "dhparam": null }, "dns": { "ipv6": false, "ttl": 30 }, "user-agent": null, "verbose": 0, "watch": true, "pause-on-battery": false, "pause-on-active": false }

The modifications I have made to the config file are: I changed cpu enabled to false, set opencl enabled to true, and changed the algo name to kawpow. The pool address, hashname and workername have the appropriate values. The coin I want to mine is dogecoin. The type of GPU I have can be found in the error image.

I would like to know what mistake I have made and how to rectify it.

To Reproduce Utilize the above config.json, add the pool address, wallet address and workername and execute the executable.

Expected behavior Mine coin on the GPU without utilizing the CPU

Required data

Miner log as text or screenshot - Provided above
Config file or command line - Provided Above
OS: Mac OS
For GPU related issues: Intel(R) UHD Graphics 630 and AMD Radeon Pro 5300M Compute Engine

Additional context Add any other context about the problem here.

Spudz76 commented 3 years ago

I looked around and the kawpow defs use AMD platform as the default for some reason, which seems to be why it is trying to use AMD extensions with the Intel.

Then the actual AMD device doesn't dump a compilation backtrace so that could be anything.

SChernykh commented 3 years ago

Remove index 0 GPU from kawpow opencl config, Intel iGPU is not supported and can't even give competitive hashrate. That said, compilation on GPU 1 (AMD) also failed for you because of internal compiler error cvms_element_build_from_source. MacOS just has really bad OpenCL support.
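A sketch of what the kawpow profile in the opencl section might look like after that change, reusing the index 1 (AMD) values from the config posted above. All other keys in the opencl section are omitted here for brevity and would stay as they are.

```json
{
  "opencl": {
    "enabled": true,
    "kawpow": [
      { "index": 1, "intensity": 5242880, "worksize": 256, "threads": [-1] }
    ]
  }
}
```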

Sammed98 commented 3 years ago

Ok. I can remove the index 0 component. What should I do regarding the OpenCL error on AMD? Any solution for this?

Or is there any other good miner that works with Mac OS and the AMD Radeon Pro 5300M? It doesn't matter which coin it mines.

Sammed98 commented 3 years ago

@xmrig Any comments?

Spudz76 commented 3 years ago

dev branch just got #2379 which may allow things to work. Please pull the current dev and compile and try.

I was fixing the OpenCL on Apple M1 and the same thing made it unworkable, now it works. So it should also work for Apple OpenCL + other GPUs. The AMD even though it's an AMD, is behind the Apple OpenCL "firewall" and thus does not have the AMD extensions usually available on AMD GPUs with AMD drivers and a non-abandoned OpenCL layer. So it also must run the "non-AMD" kawpow kernel.

All CN-based algos should work already (Haven might be worth exchange-mining... also probably works on the CPU at the same time unlike kawpow).

ganzocrypt commented 3 years ago

I was getting the same error. I am on Catalina with 2 Xeons and an AMD RX 580 8GB. CPU mining is ok. I tried to recompile with the dev branch as @Spudz76 mentioned but got the same issue. The top error appears as soon as you start ./xmrig

Screen Shot 2021-05-18 at 6 51 25 PM

Spudz76 commented 3 years ago

The early error for param 0x4037 is due to trying to use CL_DEVICE_TOPOLOGY_AMD but the OpenCL is not AMD (it's Apple) so even though it's an AMD device it doesn't have any AMD extensions (Apple OpenCL is dead standard 1.2)

That is likely also the issue with the rest; I will check for more places where things assume AMD (or treat the device being AMD as license to use extensions)...

Try with platform: "APPLE", under opencl section of config.json
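For orientation, a minimal fragment showing where that key would go, assuming it sits alongside the existing enabled, cache and loader keys of the config posted earlier. The algorithm profiles are left out.

```json
{
  "opencl": {
    "enabled": true,
    "cache": true,
    "loader": null,
    "platform": "APPLE"
  }
}
```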

Spudz76 commented 3 years ago

I know the platform in the config won't help the rest of the problems, but this branch from my fork might actually do something.

@ganzocrypt

ganzocrypt commented 3 years ago

@Spudz76 will try to compile it and test, thx

ganzocrypt commented 3 years ago

@Spudz76 So the top error is gone but the bottom ones are not:

[2021-05-19 11:41:28.641] opencl GPU #0 compiling...
[2021-05-19 11:41:35.906] opencl error CL_BUILD_PROGRAM_FAILURE when calling clBuildProgram
BUILD LOG:
Error returned by cvms_element_build_from_source
[2021-05-19 11:41:35.906] opencl thread #0 failed with error CL_INVALID_PROGRAM
[2021-05-19 11:41:35.907] opencl GPU #0 compiling...
[2021-05-19 11:41:35.907] opencl thread #0 self-test failed
[2021-05-19 11:41:35.909] opencl error CL_BUILD_PROGRAM_FAILURE when calling clBuildProgram
BUILD LOG:
Error returned by cvms_element_build_from_source
[2021-05-19 11:41:35.909] opencl thread #1 failed with error CL_INVALID_PROGRAM
[2021-05-19 11:41:35.909] opencl thread #1 self-test failed
[2021-05-19 11:41:35.909] opencl disabled (failed to start threads)

ganzocrypt commented 3 years ago

@Spudz76 someone found an indication of where the problem is; could this be the issue? Look at the last post.

Spudz76 commented 3 years ago

Interesting clue but still unclear why. atomic_inc() is used other places, and is an old standard OpenCL 1.2 call so AppleCL should support it fine.

Sometimes OpenCL actually gives a compilation log rather than a whole bunch of no-info.

This environment var seems it might make it say more.

ganzocrypt commented 3 years ago

I tested by first commenting out the mentioned code: same error. But I am not familiar with OpenCL; are the different functions based on the different algos? I can't really debug much here other than recompiling and doing a quick test. If you have any suggestions and want to test something, let me know. thx

Spudz76 commented 3 years ago

Set export CL_LOG_ERRORS="stderr" and see if it says more about the compilation failure.

ganzocrypt commented 3 years ago

I do not have compilation failure, just the same error at runtime.

Spudz76 commented 3 years ago

well there is a CL_BUILD_PROGRAM_FAILURE just before it tries to send the broken result of compilation as a kernel which then throws the CL_INVALID_PROGRAM

was hoping with the CL_LOG_ERRORS thing it might say more than Error returned by cvms_element_build_from_source

but still curious why it continues and loads the broken compilation result and tries to use it

ganzocrypt commented 3 years ago

oh ok you were referring to the compile during runtime in the screenshot, sorry got confused. where can I put CL_LOG_ERRORS in the code ?

Spudz76 commented 3 years ago

put export CL_LOG_ERRORS="stderr" into shell, then from same shell run xmrig

ganzocrypt commented 3 years ago

here Screen Shot 2021-05-21 at 1 23 19 PM

brianmcfadden commented 3 years ago

Hey there,

Firstly, I don't think the atomic_inc() problem from the issue in xmrig-amd is the same issue here, as this issue has something to do with amd_bitalign. A curious note: there are 2 different versions of xmr_amd_bitalign defined in src/backend/opencl/cl/cn/wolf-skein.cl. Maybe you noticed that already.

Second, I see now that it wasn't Spudz76-dev-fixCLKawPowPlatformHandling, it was your fork with branch dev-fixAppleOpenCL.

I'm building the code from dev-fixAppleOpenCL and it builds OK, and the OpenCL code is building fine for me inside the GPU, but my GPU is too small to do anything useful, I'm afraid.

OPENCL GPU #1 n/a AMD Radeon Pro 455 Compute Engine 855 MHz cu:12 mem:512/2048 MB

So that's a whopping 2GB of memory on the card, and unfortunately it seems like we need 3G to run for kawpow:

[2021-05-22 19:11:21.565] opencl use profile kawpow (1 thread) scratchpad 32 KB
| # | GPU | BUS ID | INTENSITY | WSIZE | MEMORY | NAME
| 0 | 1 | n/a | 6291456 | 256 | 2949 | AMD Radeon Pro 455 Compute Engine
[2021-05-22 19:11:21.566] opencl GPU #1 compiling...
[2021-05-22 19:11:21.567] opencl GPU #1 compilation completed (1 ms)
[2021-05-22 19:11:21.567] opencl READY threads 1/1 (2 ms)
[2021-05-22 19:11:21.571] opencl KawPow program for period 588563 compiled (4ms)
[2021-05-22 19:11:21.571] opencl error CL_INVALID_BUFFER_SIZE when calling clCreateBuffer with buffer size 3053453312
[2021-05-22 19:11:21.571] opencl thread #0 failed with error CL_INVALID_BUFFER_SIZE

If that's accurate, then my Radeon 455 won't work, but ganzocrypt's Radeon 580 should (8GB). Sorry I can't be of more help here, but on the plus side that branch should compile OK.

Sammed98 commented 3 years ago

@ganzocrypt Can you check the branch https://github.com/Spudz76/xmrig/tree/dev-fixAppleOpenCL as mentioned by @brianmcfadden? Kawpow has 3 GB GPU minimum requirement and I think you have 8 GB.

Spudz76 commented 3 years ago

@brianmcfadden thanks for the test. Could you test out some other algos? Everything except the RandomX or KawPow families should work with 2GB...

ganzocrypt commented 3 years ago

@sammed-ai yes got 580 8gb, I am testing cn/upx2 which before was cn-extremelite algo. I can try to test for kawpow, which coin should I test on? do you have a config that I can use so we are on the same page?

@brianmcfadden could you try the cn/upx2 algo and see if it mines?

Sammed98 commented 3 years ago

@ganzocrypt Can you test the Kawpow algorithm on any coin? I don't think the coin matters. The only things that matter would be the algorithm and the xmrig application.

The config in my very first issue description should work fine. It has cpu set to false, opencl set to true, and algo set to kawpow. You need to put in the appropriate pool address and user. I used placeholder strings while creating the issue.

ganzocrypt commented 3 years ago

Hey, it works ! But the screen gets very sluggish !

Screen Shot 2021-05-23 at 2 08 32 PM

ganzocrypt commented 3 years ago

I reduced the intensity to 1024 and am mining ETH, and it seems fine, using kawpow on https://unmineable.com/! GPU load is at 70%, the screen is ok, and I can work! So the compile failure was for the other algo, upx2.

Not sure if you can answer: which of the supported algos in xmrig would be closest to ethash? I can now mine ETH! :) And I can use the CPU to mine another coin since I have a dual Xeon!

Sammed98 commented 3 years ago

Oh. Cool. Could you somehow send the compiled application/zip file here? As a MEGA link or drive link. Which I can download, modify the config file and execute.

ganzocrypt commented 3 years ago

Not sure if it will work since I compiled on my machine, let me know. xmrig.zip

ganzocrypt commented 3 years ago

configKAPOW.json.zip

Spudz76 commented 3 years ago

Good stuff. I will check into the UPX2 problem, maybe it also crashes on other platforms with more generic opencl1.2 stacks - or even Linux+AMD because I'm not sure I ever tested it.

Spudz76 commented 3 years ago

Although I think it did work on Apple M1 + M1 GPU when I was testing this patch for that... hmm

ganzocrypt commented 3 years ago

@Sammed98 did it run ok?

@Spudz76 looks like the upx2 might be very particular.

Also, do you know if kawpow is core or memory intensive for the GPU?

Sammed98 commented 3 years ago

Hey, I got this error when I ran the zip "dyld: Library not loaded: /Users/biskero/Developer/homebrew/opt/hwloc/lib/libhwloc.15.dylib"

And I checked the JSON file and there were a lot of modifications. Could you point me at the modifications which are concerned with GPU execution with kawpow algorithm on AMD chip (OpenCL)?

ganzocrypt commented 3 years ago

I compiled with dynamic libs, so it won't work on your machine. About the config: I just enabled OpenCL, added the platform "Apple" and set "intensity": 1024 (else the screen freezes!). Also, I have 2 Xeons, so you might find differences there since I have 20 cores/40 threads, but that should not affect you if you do not use CPU mining. Everything else is the same as the normal configuration.
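Put together, the changes described here might look roughly like the fragment below. This is a reconstruction, not ganzocrypt's actual file: the index, worksize and threads values are assumptions (index 0 is taken from the "GPU #0" lines in the log earlier in the thread, worksize and threads from the config in the original report), and the platform string is written the way Spudz76 suggested it.

```json
{
  "opencl": {
    "enabled": true,
    "platform": "APPLE",
    "kawpow": [
      { "index": 0, "intensity": 1024, "worksize": 256, "threads": [-1] }
    ]
  }
}
```

On a machine where the AMD card is the second OpenCL device, as in the original report, index would be 1 instead.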

Sammed98 commented 3 years ago

So, where can I find the steps to compile on my system? Did you use the Basic build steps or advanced build steps?

And what is the "intensity" variable? How much MB of GPU to use for mining?

ganzocrypt commented 3 years ago

I used the basic steps from https://xmrig.com/docs/miner/build/macos and set the intensity to 1024, which is roughly how much compute you want the card to perform. I would leave it at 1024. I do not think you can set the MB; the DAG takes about 2.9 GB.

Sammed98 commented 3 years ago

I got this error somewhere in between when I compiled the dev-fixAppleOpenCL branch.

And this error when I run the xmrig application from the build folder.

Should I move the xmrig application file somewhere else and execute it?

I did not use the alternative step 4 since I have an Intel Mac and not a M1 Mac.

ganzocrypt commented 3 years ago

the warning is nothing. also it looks like your config is not right, use the one I gave you and change only your address

ganzocrypt commented 3 years ago

can you try this xmrig, it should have static libs xmrig.zip

Sammed98 commented 3 years ago

I tried with your config file and it started the mining process. Thank you. In case I get any errors of some sort I will also check the static libs.

I think you can create a pull request with the modified code.

I have an Intel(R) UHD Graphics 630 GPU too. Will this code work on it as well?

ganzocrypt commented 3 years ago

If you have time, please test the last xmrig I sent you so I know the libs are static and work on other Macs, thx

btw you can use xmrig to mine 2 different coins, one with the CPU and the other with the GPU.

about the Intel GPU, not sure; you just need to figure out the index

Spudz76 commented 3 years ago

@Sammed98 Your Intel GPU should be index: 0 while the AMD is index: 1

If you edit the config.json and delete every algo-definition under the opencl section then run again it should set up to run on both... like two entries under each algo one for each index.

Or, mass replace index: 1 with index: 0, but the thread/block sizing is probably wrong then and it may still not work.

Or if the config from ganzo already had index:0 then it's already running on the Intel lol.
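As a sketch of the first option, the opencl section with every algo definition deleted might be reduced to something like the fragment below, after which xmrig would be expected to regenerate the per-algorithm profiles on the next run. Keeping only the enabled, cache and loader keys from the posted config is an assumption about what counts as an algo definition.

```json
{
  "opencl": {
    "enabled": true,
    "cache": true,
    "loader": null
  }
}
```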

Sammed98 commented 3 years ago

I don't think I can mine on Intel because it has a VRAM of 1536 MB. But I still get this error when I run xmrig only on AMD.

"error CL_INVALID_VALUE when calling clGetProgramInfo"

I googled this error and found that it indicates that the VRAM requirement is not met. Is this correct? Even if it is, I have an AMD with 4 GB of VRAM, and as far as I know kawpow only requires 3 GB of VRAM. How should I solve this issue?

Spudz76 commented 3 years ago

Yes, I have run KawPow on 4GB, but through CUDA. Maybe there is some alignment issue with OpenCL allocations where it needs slack space due to some spec rules (which AMD-CL perhaps allows or ignores). I'll see if it works via nvidiaCL; this same patch seems to have helped those too. This patch should always force OpenCL 1.2 compatibility on all Apples, so it shouldn't be a bad clGetProgramInfo call, but I will check for AppleCL quirks. It may be a sloppy spot: I haven't dug deeply into the algo code yet, and have mainly got the detection and compilation (mostly) working, which is exposing some of the incompatibilities deeper inside the code.

It may also be that Apple won't hand out a full >2GB allocation but would do it if done in smaller chunks. Not sure if the KawPow code has that workaround in it (try full alloc -> catch the fail -> try smaller allocations and aggregate them instead -> fail if both don't work). And if Apple always caps allocations then just always use the aggregation mode on that platform.

I forgot you were running an algo that would need more than the Intel has. But it should work with most other algos (not RandomX, which also needs a 2336MB dataset).

Spudz76 commented 3 years ago

There was only one place where clGetProgramInfo was called and it didn't need to be there unless debugging. Readjusted some #if defined() logic so the debug code is never compiled/called when it is not needed. There should be no calls to clGetProgramInfo now if you can get a recompile from my fixAppleOpenCL branch. I decided since KawPow does work on M1 now it must not be an Apple-vs-allocations problem, unless there is some extra limitation with AMD for some reason.

ganzocrypt commented 3 years ago

@Sammed98 did you have a chance to test the xmrig that I posted, so that I know the build works on other macOS machines? thx

ef651100 commented 3 years ago

Hello, I stumbled upon the fixAppleOpenCL branch to fix this exact same issue with XMRig. I still experienced the same issue after building, so I have looked at the code. I made a small adjustment to /src/backend/opencl/wrappers/OclDevice.cpp to add another string for my iMac which uses the Radeon Pro 580:

if (name.contains("Pro 580")) { return OclDevice::Polaris; }

And this compiled and no issues on run!

Sammed98 commented 3 years ago

@ganzocrypt I already checked the static ones. It worked. I mentioned this before.

@Spudz76 can you cross check what @ef651100 has modified in the code?

ganzocrypt commented 3 years ago

@Sammed98 ok great thx! @ef651100 we did not have problems mining KawPow; it was UPX2 that was not working. Btw I see that you are running with very high intensity; does it affect your screen, or does your setup not care?

ganzocrypt commented 3 years ago

@Sammed98 I made the change if (name.contains("Pro 580")) { return OclDevice::Polaris; } from @ef651100. Can you test it on your machine? It might affect you, since the device it shows when you execute xmrig is "Pro 580". xmrig.zip

xmrig / xmrig

macOS Kawpow Errors [Or mistake in setup] #2345