Closed bycarloz closed 6 years ago
Sorry to tell you this, but if that is Monero (XMR) you are mining, then it's not xmr-stak's fault; the blame lies solely with the Monero project.
The big problem starts when the difficulty goes up: there is very little hashrate left.
I think there is a way to recover the hashrate, since increasing the intensity recovers a bit of it. The creators of xmr-stak should be able to recover the missing hashrate; only they know how to work with the XMR code, that is, how to optimize the algorithm a bit more for hashrate.
Unfortunately the Monero project chose two instructions to focus on (integer square root and integer divide) that are uniquely performant (for no particular reason) on certain CPUs from Intel and AMD. This means that everything other than those particular CPUs will see a hashrate drop. Furthermore, those CPUs are very owner-hostile and come with quite nasty baked-in DRM antifeatures.
From where I sit, it seems the "anti-ASIC" algorithm is forcing the use of a particular type of CPU that many people wouldn't otherwise use. Some discussion should probably occur about this kind of decision violating the spirit of the anti-ASIC movement. :wink:
Sorry to say it, but your GPUs are too old and the chance of getting better performance out of them is nearly zero. I am still looking into it, but do not expect any large improvements.
Hello all, I first want to say thanks to the xmr-stak team and community. I think a lot of work has gone into this version to improve the software and recover as much hashrate as they could find.
As a new-ish miner, I need to know more about the card options and how the algo affects them. I use 4 GB low-power cards, so 225 H/s looks OK after the change to v8.
My cards were working great with 2.5.1 on v7, but all 4 of my mini rigs would not run on the v8 algo. I had to change the configs drastically for both the AMD and Nvidia cards just to get them to start hashing. The RX 550s are, I think, close to done, maybe needing some tweaks. The GTX 1050 Tis, after 2 days of hit-or-miss changes, are running close to solid at OK rates of 270-280 H/s. That is after changes to CPU (i7-2600) hyper-threading and, for the cards, to "sync_mode" and "bfactor".
When I started testing the nvidia cards they would not show any hashrate with the bfactor set to 10 or higher and would crash hard at 7 or lower.
Now that I have things running, what should I change to make them as efficient as possible? Is this still valid for all Nvidia cards: T*B/2<=1900 with B Mod =0?
I could still use some info about which settings are better, where the biggest gains are, and what else I should look at, such as why the CPU and sync mode play more of a role for my Nvidia cards. That goes not only for this fork but for future changes as well.
Please let me know if you have found any good threads about configuring the cards. So far on Reddit I see lots of the old configs, but no real information yet on how the settings interact and what might be better for this algo.
I'm kinda interested in what that CPU is that Mad thinks they are going towards.
I made a mistake. For Nvidia cards it is (T * B * 2 <= 1900 and B mod M == 0). I'm still reading.
Oh, I see: this applies only to cards with compute capability >= 2.0 and < 6.0, which have a 2 GB limit. Now to find something for my card.
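To make the quoted rule easier to test against a config, here is a minimal sketch. It assumes T is "threads", B is "blocks", and M is the multiprocessor count shown as "smx" in the generated config comments; the function name is made up for illustration and is not part of xmr-stak.

```python
def fits_legacy_limit(threads: int, blocks: int, multiprocessors: int) -> bool:
    """Check the rule quoted above: T * B * 2 <= 1900 and B mod M == 0.

    Assumption: M is the multiprocessor ("smx") count of the card.
    Only meant for cards with compute capability >= 2.0 and < 6.0.
    """
    within_memory = threads * blocks * 2 <= 1900
    even_split = blocks % multiprocessors == 0
    return within_memory and even_split


# GT 650M config from this thread: 68 threads, 6 blocks, 2 smx.
# 68 * 6 * 2 = 816 and 6 % 2 == 0, so the rule is satisfied.
print(fits_legacy_limit(68, 6, 2))
```

Passing the check only says the config respects the stated limit; it says nothing about whether the hashrate will be good.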
Newer NVIDIA cards (Maxwell and later) tend to like B=8*SM and T=2, 4, or 8 best. That should be the default autoconfig if your currency is set to a v8 type before regenerating the config (it calculates differently depending on the currency setting). I found the default (with currency set correctly) to work rather well:
365H/s:
// gpu: GeForce GTX 970 architecture: 52
// memory: 3840/4096 MiB
// smx: 13
{ "index" : 0,
"threads" : 4, "blocks" : 104,
"bfactor" : 6, "bsleep" : 25,
"affine_to_cpu" : false, "sync_mode" : 3,
"mem_mode" : 1,
},
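As a rough sketch of the rule of thumb above (B = 8 * SM, T = 2, 4, or 8), the following lists the candidate "threads"/"blocks" pairs for a given smx count. The helper name is hypothetical, and this only enumerates starting points to try; it is not how xmr-stak's autoconfig is implemented.

```python
def candidate_configs(smx: int, thread_options=(2, 4, 8)):
    """Return threads/blocks pairs following B = 8 * SM, T in {2, 4, 8}."""
    blocks = 8 * smx
    return [{"threads": t, "blocks": blocks} for t in thread_options]


# GTX 970 with 13 smx: blocks = 104, matching the config shown above.
for cfg in candidate_configs(13):
    print(cfg)
```

Each candidate still has to be benchmarked on the actual card, since the best T differs between GPUs.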
You may also find OpenCL runs better; use --noNVIDIA --openCLVendor NVIDIA and see how that goes:
318H/s:
// gpu: GeForce GTX 970 memory:3968
// compute units: 13
{ "index" : 0,
"intensity" : 832, "worksize" : 8,
"affine_to_cpu" : false, "strided_index" : 2, "mem_chunk" : 2,
"unroll" : 8, "comp_mode" : false
},
Please keep in mind that you cannot get the same hashrate as with v7.
True, this 970 did around 418 before.
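To put the GTX 970 numbers in this thread side by side, here is a small arithmetic sketch. It uses only the figures reported above (about 418 H/s on v7, 365 H/s with CUDA and 318 H/s with OpenCL on v8); the function name is just for illustration.

```python
def percent_drop(before: float, after: float) -> float:
    """Relative loss in percent when going from `before` to `after`."""
    return (before - after) / before * 100.0


v7_rate = 418  # H/s reported for this GTX 970 on v7
for label, v8_rate in (("CUDA", 365), ("OpenCL", 318)):
    print(f"{label}: {percent_drop(v7_rate, v8_rate):.1f}% lower than v7")
```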
my best hashrate what do you think?
I do understand this algo is costing hashing power, but the Nvidia cards needed help, so I'm still testing. I'm using Windows 10. With the Monero setting, my 1050 Tis just crash with the default config and with my old v7 settings; I'm not sure whether the v7 settings carry over to v8 or not.
v8 default config
// gpu: GeForce GTX 1050 Ti architecture: 61
// memory: 3371/4096 MiB
// smx: 6
{ "index" : 0,
"threads" : 8, "blocks" : 18,
"bfactor" : 8, "bsleep" : 25,
"affine_to_cpu" : false, "sync_mode" : 3,
"mem_mode" : 1,
Only 140H/s
--openCLVendor NVIDIA creates the same config. Only 162H/s
My current config is not working that badly, but it is based on my faulty knowledge.
// gpu: GeForce GTX 1050 Ti architecture: 61
// memory: 3371/4096 MiB
// smx: 6
{ "index" : 0,
"threads" : 38, "blocks" : 24,
"bfactor" : 9, "bsleep" : 32,
"affine_to_cpu" : false, "sync_mode" : 1,
"mem_mode" : 1,
Drum roll: hash rate is around 280 H/s across all 6 cards, 4 above and 2 under.
This is a Pascal chip; can I use the same calculation, B=8*SM and T=2, 4, or 8? And what about sync_mode? Changing it to 1 cost me 50 H/s on the CPU but gained 140+ H/s on each Nvidia card. I have not yet tested 2 = cudaDeviceScheduleYield to see if that might work better.
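The sync_mode trade-off described here can be checked with simple arithmetic. This sketch takes the reported figures (about 50 H/s lost on the CPU, about 140 H/s gained per Nvidia card) and leaves the number of Nvidia cards per rig as a parameter, since the thread does not state it clearly; the function name is hypothetical.

```python
def net_gain(gpu_gain_per_card: float, n_cards: int, cpu_loss: float) -> float:
    """Net hashrate change (H/s) for one rig after switching sync_mode."""
    return gpu_gain_per_card * n_cards - cpu_loss


# Card counts are assumptions for illustration only.
for n_cards in (1, 3, 6):
    print(n_cards, "card(s):", net_gain(140, n_cards, 50), "H/s net")
```

With these figures the change pays off even with a single Nvidia card in the rig.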
For ATI Radeon cards you must modify "intensity": 187, "worksize": 21,
You will find an incredible improvement in hashrate, but only on ATI cards.
I still cannot find a solution for NVIDIA cards. When changing the parameters I gain only 10% or 20% hashrate, whereas on ATI I found an 80% improvement just by modifying intensity and worksize.
Friend, forget your previous configuration. Start from the default xmr-stak configuration and increase the values from there; that is how you will find the hashrate you are looking for.
v8 default config
// gpu: GeForce GTX 1050 Ti architecture: 61
// memory: 3371/4096 MiB
// smx: 6
{ "index" : 0,
(================) "threads" : 8, "blocks" : 18, (================)
"bfactor" : 8, "bsleep" : 25,
"affine_to_cpu" : false, "sync_mode" : 3,
"mem_mode" : 1,
Only 140H/s
Start by editing the values I marked. Do not reuse the values you had for the v7 algorithm, since that only loses more hashrate.
That's what I already did to arrive at my current running config; the third one in my list is at 280 H/s right now on the v8 algo. But the biggest jump came from using "sync_mode" : 1 and hyper-threading, so I'm just wondering whether I missed anything else, like trying --openCLVendor NVIDIA with my current config.
When you use --openCLVendor NVIDIA, you edit amd.txt, and nvidia.txt is then completely unused (assuming you also added --noNVIDIA; otherwise it will run one CUDA and one OpenCL thread on the same GPU, which will likely choke hard).
Oh, on the first test I left amd.txt in play and removed nvidia.txt. Maybe I ran a batch file without --noNVIDIA, because I did see a new nvidia.txt and it ran. It must have used that old AMD config for the test; I'm not sure. Far too many tests and changes today.
When I remove both txt files and run with --noNVIDIA --openCLVendor NVIDIA, it ignores the AMD card in the system. It does create an amd.txt with the Nvidia card settings, but not the AMD card settings, and then crashes hard. Can I still run the AMD card if I add its settings last? The first 3 cards are Nvidia.
If it still crashes, what are the first things to change? Thanks.
OH, you have hybrid, that won't work correctly then. There is no support for running different platform indexes in one xmr-stak instance, so you'd have to choose one or the other, and thus you are stuck with using CUDA+OpenCL so it will do both.
Looks like I need to forget about the AMD card in order to test the Nvidia cards using OpenCL. I dropped the intensity to 512 for them and the miner starts, but it won't connect to any MoneroOcean ports, SSL or not. I think I had something similar when I first started with the Nvidia cards. I'm close, but I think I still need to change the config a little so it works. Time for some sleep.
Hello Spudz76!
Newer nVidia (maxwell+) tend to like B=8*SM and T=2,4,or 8 as best
I have a 560 Ti and a 650M, and in _v8 mining I get 50% of the performance I had with _v7. Do you know how I should edit my configs?
"gpu_threads_conf" :
[
// gpu: GeForce GT 650M architecture: 30
// memory: 953/981 MiB
// smx: 2
{ "index" : 0,
"threads" : 68, "blocks" : 6,
"bfactor" : 2, "bsleep" : 0,
"affine_to_cpu" : false, "sync_mode" : 3,
"mem_mode" : 1,
},
],
Fermi cores are 50% speed in v8, no way around it. They are not good at the new math.
There will be a small NVIDIA fix in the next release (this week); maybe this will bring a few hashes back, but overall Fermi and Kepler will lose around 50% of their v7 performance on v8.
So I must wait for another update of xmr-stak?