luke-jr / bfgminer

Modular ASIC/FPGA miner written in C, featuring overclocking, monitoring, fan speed control and remote interface capabilities.
http://luke.dashjr.org/programs/bitcoin/files/bfgminer/

KnCMiner i2c issue on November batch #340

Open jaketri opened 10 years ago

jaketri commented 10 years ago

I have been using bfgminer on my October Jupiter for more than a month.

I tried switching to bfgminer on my November Jupiter, but I get a very low hashrate (~14 GH/s instead of ~640 GH/s).

I suspect the root cause is some kind of i2c bus hang when bfgminer starts, because I get the following kernel errors in the syslog:

...
[ 3049.015602] omap_i2c 4802a000.i2c: controller timed out
[ 3050.019546] omap_i2c 4802a000.i2c: controller timed out
[ 3051.023495] omap_i2c 4802a000.i2c: controller timed out
[ 3052.029383] omap_i2c 4802a000.i2c: controller timed out
[ 3053.033327] omap_i2c 4802a000.i2c: controller timed out
[ 3054.037274] omap_i2c 4802a000.i2c: controller timed out
...

The only way to recover from this is to reboot the miner.

jaketri commented 10 years ago

I noticed an i2c error in the bfgminer log:

[2013-12-04 00:34:17] setrlimit: Soft fd limit not being changed from 1024 (FD_SETSIZE=1024; hard limit=4096)
[2013-12-04 00:34:17] Started bfgminer 3.7.0
[2013-12-04 00:34:17] Loaded configuration file /config/bfgminer.conf
[2013-12-04 00:34:17] knc_detect_one: 0x26: Failed to read i2c block data
[2013-12-04 00:34:17] KNC 0aa: Set temperature config: target=89 cutoff=95
[2013-12-04 00:34:17] KNC 0ab: Set temperature config: target=89 cutoff=95
[2013-12-04 00:34:17] KNC 0ac: Set temperature config: target=89 cutoff=95
[2013-12-04 00:34:17] KNC 0ad: Set temperature config: target=89 cutoff=95

jaketri commented 10 years ago

The voltage and amperage look totally wrong. For example, the voltage is supposed to be < 1 V, not around 6 V:

[2013-12-04 00:34:19] KNC 0bq being disabled
[2013-12-04 00:34:19] KNC 0dy being disabled
[2013-12-04 00:34:19] KNC 1: die 0 6.674V 9.59A
[2013-12-04 00:34:19] KNC 1: die 1 6.733V 9.08A
[2013-12-04 00:34:19] KNC 1: die 2 6.719V 10.12A
[2013-12-04 00:34:19] KNC 1: die 3 6.716V 1023.50A
[2013-12-04 00:34:19] KNC 1bl being disabled
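A simple plausibility check would flag readings like these. The sketch below is an illustration only, not bfgminer code: the function name and both limits are assumptions (the 1 V ceiling comes from the comment above; the 100 A ceiling is an arbitrary placeholder).

```c
#include <stdbool.h>

/* Assumed limits, for illustration only. The thread says the die voltage
 * should be below 1 V; the current ceiling is an arbitrary guess. */
#define DIE_VOLTS_MAX 1.0f
#define DIE_AMPS_MAX  100.0f

/* Returns true when a voltage/current pair is inside the assumed limits. */
bool dcdc_reading_plausible(float volts, float amps)
{
    return volts > 0.0f && volts < DIE_VOLTS_MAX &&
           amps >= 0.0f && amps < DIE_AMPS_MAX;
}
```

With these assumed limits, every die line in the log above would be rejected, since each reports more than 6 V.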

jaketri commented 10 years ago

Please let me know if there are other experiments I can run or data I can collect to identify where the issue is.

pookunui commented 10 years ago

Hi mate

I just updated bfgminer to v3.8 on Linux Mint, and on load/balance my 30 GH/s BFL Single is running at 10 GH/s. Sorry to bother you, but I'm not technically inclined. It was a challenge to upgrade from the earlier version! Cheers

Regards

Paul


hno commented 10 years ago

There is a slight bug in the I2C interactions between that revision of the FPGA code and the Ericsson DCDC modules. The workaround is to wait a little between i2c accesses. Unfortunately you need to reinitialize the FPGA (i.e. reboot, or stop everything and rerun initc) when this happens.

This is fixed in the FPGA image in the next firmware release, which should be out in a few days, I hope.

jaketri commented 10 years ago

Thank you! Adding a 10 ms sleep before both i2c reads (VOUT and IOUT) fixes the problem.

jaketri commented 10 years ago

Looks like a wait between i2c accesses just delays the issue.

Without any wait, the problem shows up right away when bfgminer starts. After adding a wait before every i2c access, bfgminer runs fine for about 1 day, then the same problem returns.

I can only hope that the updated firmware with the new FPGA image will be released soon :)

hno commented 10 years ago

On Wed 2013-12-11 at 11:19 -0800, Jake wrote:

> Looks like a wait between i2c accesses just delays the issue.

10 ms is a little short. You need a wait AFTER I2C accesses to the Ericsson modules that is sufficiently long to guarantee that the Ericsson DCDC module's I2C interface has returned to the idle state.
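The "wait after each access" idea can be sketched as a small wrapper. This is a minimal illustration under stated assumptions, not the bfgminer driver: `i2c_read_fn`, `dcdc_read_then_settle`, `sleep_ms`, and `fake_read` are hypothetical names, and the 50 ms settle time is a placeholder, since the thread gives no exact figure.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Assumed settle time. The thread only says that 10 ms is "a little short";
 * 50 ms is an arbitrary placeholder, not a value from the vendor. */
#define DCDC_SETTLE_MS 50L

/* Hypothetical stand-in for the driver's low-level I2C read.
 * Returns 0 on success and stores the result in *out. */
typedef int (*i2c_read_fn)(uint8_t addr, uint8_t reg, uint16_t *out);

/* Sleep for ms milliseconds (may return early if a signal arrives). */
void sleep_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Do one read, then wait unconditionally so the module's I2C interface
 * has time to go idle before the caller can issue another access. */
int dcdc_read_then_settle(i2c_read_fn rd, uint8_t addr, uint8_t reg, uint16_t *out)
{
    assert(rd != NULL && out != NULL);
    int rc = rd(addr, reg, out);
    sleep_ms(DCDC_SETTLE_MS);
    return rc;
}

/* Stub reader for demonstration: no hardware, always yields 0x1234. */
int fake_read(uint8_t addr, uint8_t reg, uint16_t *out)
{
    (void)addr;
    (void)reg;
    *out = 0x1234;
    return 0;
}
```

Calling `dcdc_read_then_settle(fake_read, 0x10, 0x00, &value)` (address and register are arbitrary here) returns the stub's result after the settle delay. Putting the delay inside the wrapper means no caller can forget it, but it only spaces out accesses made by this one process.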

> I can only hope that the updated firmware with the new FPGA image will be released soon :)

Unfortunately it seems the issue is not 100% solved yet. There are also issues with some interesting new functionality needed for the next release, and key people are right now on a short and much-deserved vacation. So not tomorrow. Maybe the end of next week.

Regards Henrik

jaketri commented 10 years ago

Thanks again for the quick reply. Hmm, interesting revelation about new functionality in the next firmware :)

I made a new bfgminer test build without any wait and with the i2c access to the DCDC module completely removed. If this test build does not show the problem within 1-2 days, then the problem is clearly related to the I2C access to the DCDC modules.

So far the test build shows promising results ... everything has worked fine for 15 minutes :) ... I'll let it run and see ...

If I get the bus hang again, I'll update this thread right away.

Thanks, Jake

etree commented 10 years ago

Hey, Jake.. could you provide us a working patch?

hno commented 10 years ago

On Wed 2013-12-18 at 22:43 -0800, Jake wrote:

> See the following change for the patch that disables I2C access to the DCDC module: jaketri@6e93450

> So far no I2C hang for the past 6 days.

Right.. having both monitordcdc and bfgminer trying to access the modules over i2c will eventually trigger the issue even if both of them carefully delay between accesses.

Regards Henrik

jaketri commented 10 years ago

I hope the new 1.0 firmware will have a proper fix in the FPGA image. Without the fix, bfgminer will not be able to report the proper voltage and amperage from the DCDC module.

My workaround just hard-codes the values reported by bfgminer to 1 V / 1 A.
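The general shape of such a workaround might look like the sketch below. It is not the patch linked earlier in the thread: `struct dcdc_reading`, `dcdc_read_fn`, and `dcdc_report` are hypothetical names, and only the 1 V / 1 A placeholder values come from the comment above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for one DCDC reading. */
struct dcdc_reading {
    float volts;
    float amps;
};

/* Hypothetical hardware read; returns true on success and fills *out. */
typedef bool (*dcdc_read_fn)(struct dcdc_reading *out);

/* Report a reading. Passing NULL disables I2C access entirely, in which
 * case (or if the read fails) the fixed 1 V / 1 A placeholder is returned. */
struct dcdc_reading dcdc_report(dcdc_read_fn read_hw)
{
    struct dcdc_reading r = { 1.0f, 1.0f };

    if (read_hw != NULL && !read_hw(&r)) {
        /* A failed read may have left partial data; reset to placeholder. */
        r.volts = 1.0f;
        r.amps = 1.0f;
    }
    return r;
}
```

The trade-off is the one described above: the bus is left alone, but the reported voltage and amperage no longer reflect the hardware.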

luke-jr commented 10 years ago

@hno Reviewing this issue, I don't see a clear resolution. At the very least, is there a way BFGMiner can detect a problematic unit? What happened to the FPGA-side fix?