luke-jr / bfgminer

Modular ASIC/FPGA miner written in C, featuring overclocking, monitoring, fan speed control and remote interface capabilities.
http://luke.dashjr.org/programs/bitcoin/files/bfgminer/

4.3.0. pool diff varies from 488m to 88 in an instant, no accepts. #485

Open DennisFantoni opened 10 years ago

DennisFantoni commented 10 years ago

I am running bfg 3.10.0 as a test against the same pools that I am running bfg 4.3.0 against. 3.10.0 does not have this bug.

Right now, I am running 3 computers, all using the same username and the same p2pool (dutchpool.org:3832), mining aliencoins.

The bfgminer display on one of the computers shows Pool 0 diff 88, while on the other two it shows Pool 0 diff 122. The block rate is shown as diff 0 (564.0kh?) on all 3.

And this is the problem: bfgminer sometimes, on what seem like random computers, sets the pool difficulty to some huge number like 88 while the pool diff is below 1. This means that bfgminer does not produce any accepts and that no hashes get reported to the pool. I am not even sure if bfg will detect that a block is found when the diff is set 88+ times harder than the network diff rate.

The PC that gets the way too high diff number is running a 6950 and another, not so fast GPU. The other two are running slower GPUs.

Sometimes the other two also get way too high diff values.

As soon as a computer gets a high diff value it stops accepting hashes, so it's not a display issue.

I have noticed the problem on both dutchpool.org:3832 and p2pool.name:9241.

If you need an aliencoin addr for testing mine is AYxc38scnPPUJyLdtJmh5ppjM2dunm9hXy

i start bfgminer like this

setx GPU_MAX_ALLOC_PERCENT 100
setx GPU_USE_SYNC_OBJECTS 1
set POOLURL=dutchpool.org:3832
set USER=AYxc38scnPPUJyLdtJmh5ppjM2dunm9hXy
set PASSWORD=aliencoin
c:\mining\bfgminer-4.3.0-win64\bfgminer.exe -S opencl:auto --scrypt -o stratum+tcp://%POOLURL% -u %USER% -p %PASSWORD% --intensity=18,10 -g 1 --worksize=256 --thread-concurrency=6144 --shaders 1536,480

When I run 3.10 I get like 800Kh/s, and with 4.3 I get none.

diff changes around btw. now it is 117

i think bfg somehow misreads the diff value that is sent from the pool

To summarize:
bug seen on 3 different Windows 7 64-bit PCs
bug seen working with 2 different pools, using different kinds of pool software
bug goes away when going back to 3.10
bug makes using bfg futile, as when diff goes to 117 no shares are created unless I am really, really lucky with the numbers. It takes like 600m to find a block, but the diff bfg 4.3 sets for itself is 117, which is way too much.

if i use +1 or +0.1 or +0.01 or +0.001 or +0.0001 after the username i get all kinds of weird diff values, many of them way way way way higher than 1

perhaps it is a bug that only shows with really easy to mine coins where diff is below 1

luke-jr commented 10 years ago

Looks like a pool issue to me. dutchpool.org:3832 gave me diff 32 at least. This might have worked with 3.10 due to a bug which was fixed in 4.0

DennisFantoni commented 10 years ago

Is there a way for me to get bfg to write log files to show to the pool software developers to aid them in finding the bug?

with the same setup as above, i have the same problem using a different kind of p2p pool software

set POOLURL=p2pool.name:9241

the dutchpool is running nomp, while p2pool.name is running p2pool. Not sure how related the two products are source wise, but it seems strange that they are both broken.

It seems like when I start up my bfg, the diff is initially set to 488u right now; then after about a minute, depending on luck with shares, it is reset to diff 80 or 100 or more. After that, no hashes are found.

perhaps bfg could ignore if it is set to have a share diff that is more difficult than the network diff? it seems like (but i am not 100% sure here) bfg does not submit found blocks if they are below the pool diff, even if they would qualify for a new blockchain block for the coin.

luke-jr commented 10 years ago

--debuglog --protocol-dump --log-file debug.log

jfabritz commented 10 years ago

As an aside, would it make sense to have a sanity check that compares the stratum difficulty to the network difficulty and flags an error if the stratum > network?

I see a similar behavior using gridseed blade miners with 4.1.0 where I ended up with a 1.76k stratum difficulty and the network difficulty was 6. I will work on capturing debug that will help decode the problem.

jfabritz commented 10 years ago

Here was a non-debug TUI capture using 4.2.1 for Zeus on Gridseed blades where stratum was greater than network:

bfgminer version 4.2.1 - Started: [2014-07-03 17:19:25] - [ 0 days 00:04:09]
[M]anage devices [P]ool management [S]ettings [D]isplay options [H]elp [Q]uit
Pool 0: ...opoolmining.com Diff:153 +Strtm LU:[17:22:46] User:xxxxxxxx.xxxxxxx
Block: ...176615fbe24d5c8e Diff:6 (48.09M) Started: [17:22:46]
ST:8 F:0 NB:2 AS:0 BW:[ 88/242 B/s] E:0.01 I: 0.00 BTC/hr BS:85m
6 | 16.98/16.72/ 4.37Mh/s | A:504 R:0+0(none) HW:9/none
GSD 0: | 2.84/ 2.81/ 0.71Mh/s | A: 85 R:0+0(none) HW:0/none
GSD 1: | 2.84/ 2.79/ 0.77Mh/s | A: 92 R:0+0(none) HW:2/none
GSD 2: | 2.84/ 2.81/ 0.51Mh/s | A: 76 R:0+0(none) HW:2/none
GSD 3: | 2.84/ 2.83/ 0.57Mh/s | A: 85 R:0+0(none) HW:1/none
GSD 4: | 2.84/ 2.81/ 0.49Mh/s | A: 73 R:0+0(none) HW:0/none
GSD 5: | 2.84/ 2.79/ 0.63Mh/s | A: 93 R:0+0(none) HW:4/none

jfabritz commented 10 years ago

Here is a transition that occurred where the difficulty changed another time, captured from non-TUI logging on the screen -- see where the difficulty got set to 775.28 when it was 15m. I was not able to meet the target criteria after that:

[2014-07-03 21:10:47] htarget 03ffffff hash 0000003a
[2014-07-03 21:10:47] Proof: 0000003ad03cfe715fbe6b03bd04f4feb94fa7f90fc71022dfd6cd63260da8ec
Target: 0000003fffffffffffffffffffffffffffffffffffffffffffffffffffffffff
TrgVal? YES (hash <= target)
[2014-07-03 21:10:47] Pushing submit work to work thread
[2014-07-03 21:10:47] DBG: sending east-us.cryptopoolmining.com submit RPC call: {"params": ["jfabritz.gblades", "2c5f", "02000000", "53b5fee7", "d30acdf9"], "id": 2483, "method": "mining.submit"}
[2014-07-03 21:10:47] PROOF OF WORK RESULT: true (yay!!!)
[2014-07-03 21:10:47] Accepted 003ad03c GSD 1 Diff 17m/15m
[2014-07-03 21:10:47] Successfully submitted, adding to stratum_shares db
[2014-07-03 21:10:48] [thread 0: 11467735 hashes, 2841.6 khash/sec]
[2014-07-03 21:10:49] [thread 5: 11462050 hashes, 2842.0 khash/sec]
[2014-07-03 21:10:50] [thread 2: 11462050 hashes, 2841.8 khash/sec]
[2014-07-03 21:10:51] [thread 3: 11501849 hashes, 2841.8 khash/sec]
[2014-07-03 21:10:51] [thread 4: 11459207 hashes, 2841.7 khash/sec]
[2014-07-03 21:10:51] [thread 1: 15774533 hashes, 2842.1 khash/sec]
[2014-07-03 21:10:52] [thread 0: 11464893 hashes, 2841.6 khash/sec]
[2014-07-03 21:10:53] [thread 5: 11464893 hashes, 2841.6 khash/sec]
[2014-07-03 21:10:54] Generated target 000000000054883400ffffffffffffffffffffffffffffffffffffffffffffff
[2014-07-03 21:10:54] Pool 0 stratum difficulty set to 775.28
[2014-07-03 21:10:54] Received stratum notify from pool 0 with job_id=2c60
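For reference, the two difficulty figures in this log can be recovered from the targets it prints. The following is a minimal Python sketch, assuming the bdiff convention in which difficulty 1 corresponds to the 256-bit target 0x00000000FFFF followed by zeros. The function name is illustrative and is not taken from the BFGMiner source.

```python
# Difficulty-1 target under the bdiff convention (an assumption here):
# 0x00000000FFFF followed by 52 zero hex digits, as a 256-bit integer.
DIFF1_TARGET = 0xFFFF << 208


def target_to_difficulty(target_hex: str) -> float:
    """Convert a big-endian hex target string into a difficulty value."""
    target = int(target_hex, 16)
    if target <= 0:
        raise ValueError("target must be a positive integer")
    # Python's int / int is true division, so this is safe for 256-bit values.
    return DIFF1_TARGET / target


# Targets as printed in the log excerpt (64 hex digits each).
accepted_target = "0000003f" + "f" * 56
new_target = "0000000000548834" + "00" + "f" * 46

print(target_to_difficulty(accepted_target))  # small value, shown as "15m" in the log
print(target_to_difficulty(new_target))       # close to the 775.28 in the log
```

Under that assumption, the target of the accepted share corresponds to a difficulty of roughly 0.0156 (the "15m" in the log), while the newly generated target corresponds to roughly 775, which is why no further shares met the target.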

luke-jr commented 10 years ago

I agree some sanity checks would be helpful.
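The sanity check suggested above could look roughly like the sketch below. It is written in Python for brevity and is only an illustration of the idea, not code from BFGMiner. The function name and the warning text are hypothetical.

```python
from typing import Optional


def check_share_difficulty(share_diff: float, network_diff: float) -> Optional[str]:
    """Return a warning if the pool's share difficulty exceeds the network
    difficulty, or None if the values look plausible.

    Hypothetical helper: both arguments are assumed to be expressed in the
    same difficulty convention.
    """
    if share_diff <= 0 or network_diff <= 0:
        return "difficulty values must be positive"
    if share_diff > network_diff:
        return (
            f"stratum difficulty {share_diff:g} is higher than "
            f"network difficulty {network_diff:g}"
        )
    return None


# Values from the TUI capture earlier in the thread: stratum 153, network 6.
print(check_share_difficulty(153, 6))
# A share difficulty of 1024/65536 against the same network difficulty passes.
print(check_share_difficulty(1024 / 65536, 6))
```

A check like this only flags the condition. Whether the miner should warn, refuse the value, or clamp it to the network difficulty is a separate design decision that the thread leaves open.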

jfabritz commented 10 years ago

I caught a log of my issue, so how do I get a copy to you for your investigation?

I took two logs. One that was good, and one with the bad transition. The bad transition started out with the set_difficulty of 32. The next one that came was a decimal value of 182.85714286, which apparently was translated into a real 182 difficulty.

The good log started at 32, then jumped to 1024, which translated to 15m?

I saw in the source that there are two different diff calculations to allow the use between SCRYPT and SHA-256.

nwoolls commented 10 years ago

It looks related to the same p2pool code I am looking at. If a pool specifies a fractional difficulty (182.85 in your case), BFGMiner is using that specifically for pdiff regardless of algorithm.

If the pool sends an integral number, like 1024, and you are mining with the Scrypt algo, that number is divided by 65536.

The difficulty comes with the fact that, from what I am seeing, some Scrypt servers return a number they intend to be divided by 65536, and some do not. And BFGMiner is getting this wrong in some cases (most likely a case of a server that doesn't follow the spec).
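The behaviour described in this comment can be sketched as follows. This is a Python illustration of the rule as stated in the thread (integral values are scaled down for Scrypt, fractional values are used as sent). It is not the actual BFGMiner code, and the function name is hypothetical.

```python
SCRYPT_SCALE = 65536  # 0x10000, the divisor mentioned in the thread


def effective_share_difficulty(pool_value: float, scrypt: bool) -> float:
    """Interpret a value received via mining.set_difficulty.

    Sketch of the rule described above: when mining Scrypt, an integral
    value is assumed to be pre-multiplied by 65536 and is divided back
    down; a fractional value is taken literally.
    """
    if scrypt and float(pool_value).is_integer():
        return pool_value / SCRYPT_SCALE
    return pool_value


# The "good" log: 1024 becomes 0.015625, displayed as 15m.
print(effective_share_difficulty(1024, scrypt=True))
# The "bad" log: a fractional value is used as-is, giving a real diff of about 182.
print(effective_share_difficulty(182.85714286, scrypt=True))
```

The two calls correspond to the good and bad logs reported earlier: a pool that sends a pre-multiplied but fractional value ends up with a share difficulty far above what it intended.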

luke-jr commented 10 years ago

@nwoolls It should be bdiff regardless. The integer exception for scrypt is to work around bugs in older pools - that exception should be removed sometime in the future when all pools have been fixed (and not extended to cover NEW pools with bugs).

nwoolls commented 10 years ago

@luke-jr understood - the problem I am seeing (not a BFGMiner problem) is that most p2pool nodes I've found aren't fixed (yet). The PR is there but they still send a fractional value multiplied by 0x10000.

jfabritz commented 10 years ago

I was looking at the NOMP source for the stratum-server and saw that it could generate a fractional value. The code that passed back the new vardiff value used toFixed() out to 8 decimal places. For my experiment, I set the decimal places to 0 and the fractional issue went away.

Maybe you can add a flag to the command line of bfgminer that would, when --scrypt is set, force all diff values to be divided by 2^16? That way it is only used on strange pools and will behave appropriately on fixed ones.

luke-jr commented 10 years ago

Better to just fix the broken pools...

jfabritz commented 10 years ago

While I'd like to be an optimist about that, it would be like moving heaven and earth to get all of those people with pools based on incorrect code to take an update. It's easier to deal with it via a flag in bfgminer for the short term while developers slowly integrate the fixes into the pool's codebase and operators eventually update their copies.

nwoolls commented 10 years ago

@jfabritz please see https://github.com/luke-jr/bfgminer/pull/516

jfabritz commented 10 years ago

Thanks Nate! Do you have a Windows binary perhaps that I can play with? I'm still fighting my way through getting my environment properly set up to compile all the way through. ;-)