neuoy / OneWireArduinoSlave

An Arduino library to communicate using the Dallas one-wire protocol, where the Arduino takes the role of a slave. Because it is implemented entirely with interrupts, you can perform other tasks while communication is handled in the background.

Presence bug - Raspberry Pi #8

Closed goial closed 7 years ago

goial commented 8 years ago

Hello @neuoy, I built a OneWire slave with your code and added some code for measurements. It worked very well for a long time (indefinitely, I think) with an Arduino as the OneWire master.

In my project the OneWire master is a Raspberry Pi, and that works too (why not?). BUT the slave gets stuck from time to time. That does not happen with original OneWire slaves. The problem is the same without my code added to your "OneWireArduinoSlave". It only happens with a Raspberry Pi as master!

I tried to trigger the error deliberately, but I did not succeed. When the Raspberry Pi only scans the bus for slaves, it works without problems for days (I tested it for ~30 hours). When the Raspberry Pi sends the slave a request (measure, read, write), the bug appears spontaneously: sometimes after one request, sometimes after 100 requests. When I restart the slave, it works again for some time.

The bug looks like a presence failure. The master pulls the bus low and releases it after 480 µs to 850 µs. 30-35 µs after the rising edge, the slave pulls the bus low and holds it low for 296 µs. Then the magic begins:

  1. If there were no requests on the bus, the slave releases the bus after 296 µs. All OK.
  2. If there was a request () on the bus previously, the slave pulls the bus low to signal presence, but after 296 µs the bus is still low, because the slave holds it low indefinitely. After a hardware reset of the slave, it works again until the next presence failure.
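The two cases above can be told apart mechanically from a scope or logic analyzer capture. The following is only a sketch of that check: `PresenceCapture`, `PresenceResult`, `classify` and the tolerance value are hypothetical names and numbers chosen for illustration, and the 296 µs figure is simply the pulse length observed in this report.

```cpp
#include <cstdint>

// All durations are in microseconds, read off a scope or logic analyzer capture.
struct PresenceCapture {
    uint32_t resetLowUs;     // how long the master held the bus low
    uint32_t slaveDelayUs;   // gap between the master's release and the slave pulling low
    uint32_t presenceLowUs;  // how long the bus then stayed low
};

enum class PresenceResult { Ok, NoPresence, StuckLow };

// Presence pulse length observed in this report.
constexpr uint32_t kObservedPresenceUs = 296;
// Assumption: allowed measurement slack, picked arbitrarily for this sketch.
constexpr uint32_t kToleranceUs = 50;

// Classify one reset/presence cycle as case 1 (Ok), case 2 (StuckLow),
// or a cycle in which the slave never answered at all.
PresenceResult classify(const PresenceCapture& c) {
    if (c.presenceLowUs == 0) {
        return PresenceResult::NoPresence;
    }
    if (c.presenceLowUs > kObservedPresenceUs + kToleranceUs) {
        return PresenceResult::StuckLow;
    }
    return PresenceResult::Ok;
}
```

Running every captured cycle through a check like this would show whether a stuck pulse always follows a request, as described in case 2.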

My first idea was the long reset pulse from the master, but the slave also worked with various reset pulse lengths from an Arduino master. The Arduino slave does not freeze when the bug appears: my extra code keeps running without problems after the failure.

Write to me if I can do something to help you find the issue (measurements, code changes... I think I can do any measurement you want)! I will add a picture of the bug later.

Greetings, goial

neuoy commented 8 years ago

Hi goial,

Thanks for your bug report.

Since you say your code is still running, this means the arduino did not crash or enter an infinite loop. The only thing I can imagine is that it is stuck doing nothing, with no timer or pin interrupt registered. How this can happen is unclear to me though. Since it's difficult to trace the sequence of events due to timing constraints (logging to the serial port is too slow), and even harder since the bug seems difficult to reproduce, I don't really know what the next move can be.
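One common way around serial logging being too slow is to record events into a small in-memory ring buffer from the interrupt handlers, and to dump the buffer over serial only after the bus has got stuck. The following is a minimal sketch of that idea and is not part of OneWireArduinoSlave: `TraceLog`, `TraceEvent` and the event names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical event codes; a real trace would use whatever steps matter in the library.
enum class TraceEvent : uint8_t { PinFalling, PinRising, TimerFired, PresenceStart, PresenceEnd };

struct TraceEntry {
    uint32_t timeUs;   // timestamp, e.g. from micros() on an Arduino
    TraceEvent event;
};

// Fixed-size ring buffer that keeps only the most recent kCapacity events.
// Assumption: on a real AVR, at() is called with interrupts disabled (or after
// the bus is already stuck), so record() cannot run in the middle of a read.
class TraceLog {
public:
    static constexpr size_t kCapacity = 32;

    // Cheap enough to call from an interrupt handler: one store and two updates.
    void record(uint32_t timeUs, TraceEvent event) {
        entries_[head_] = {timeUs, event};
        head_ = (head_ + 1) % kCapacity;
        if (count_ < kCapacity) {
            ++count_;
        }
    }

    // Number of retained entries (at most kCapacity).
    size_t size() const { return count_; }

    // Entry i in chronological order; i == 0 is the oldest retained entry.
    TraceEntry at(size_t i) const {
        size_t start = (head_ + kCapacity - count_) % kCapacity;
        return entries_[(start + i) % kCapacity];
    }

private:
    TraceEntry entries_[kCapacity] = {};
    size_t head_ = 0;   // index of the next slot to write
    size_t count_ = 0;  // number of valid entries
};
```

Once the slave is stuck holding the bus low, the main loop can print the retained entries at leisure, which shows the last interrupt and timer events before the failure without disturbing the timing while they happen.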

I don't have a Raspberry Pi at hand to try to reproduce the issue. But I do have two arduinos (or is it "two arduini"?) so I could try to reproduce the issue by tweaking a master library to have a longer reset (if I understand correctly, you suspect that's the difference that makes it fail with a Pi). But I really can't make any promises as to when that would be (like, maybe tonight, maybe in two years...).
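A test master with an adjustable reset pulse could look roughly like the sketch below. It is not taken from any existing master library: `resetAndCheck`, `ResetResult` and `SimulatedBus` are hypothetical names, and the 70 µs and 410 µs waits are the usual standard-speed values for sampling presence and finishing the reset slot. On a real Arduino the four bus operations would presumably map to `pinMode`/`digitalWrite`, `digitalRead` and `delayMicroseconds`; here they are kept behind a small interface so the logic can be exercised without hardware.

```cpp
#include <cstdint>

enum class ResetResult { NoPresence, Presence, StuckLow };

// Bus is any type providing driveLow(), release(), isHigh() and delayUs(uint32_t).
// resetLowUs is the reset pulse length under test (480-850 us in this report).
template <typename Bus>
ResetResult resetAndCheck(Bus& bus, uint32_t resetLowUs) {
    bus.driveLow();
    bus.delayUs(resetLowUs);
    bus.release();
    bus.delayUs(70);                  // sample inside the expected presence pulse
    bool present = !bus.isHigh();
    bus.delayUs(410);                 // wait out the rest of the reset slot
    if (!bus.isHigh()) {
        return ResetResult::StuckLow; // bus still low 480 us after release
    }
    return present ? ResetResult::Presence : ResetResult::NoPresence;
}

// Simulated bus with one slave, only for trying the logic off-target.
// presenceLowUs == UINT32_MAX models a slave that never releases the bus.
struct SimulatedBus {
    SimulatedBus(uint32_t slaveDelay, uint32_t presenceLow)
        : slaveDelayUs(slaveDelay), presenceLowUs(presenceLow) {}

    void driveLow() { masterLow = true; }
    void release() { masterLow = false; releasedAtUs = nowUs; }
    void delayUs(uint32_t us) { nowUs += us; }
    bool isHigh() const {
        if (masterLow) {
            return false;
        }
        uint32_t sinceRelease = nowUs - releasedAtUs;
        bool slaveLow = sinceRelease >= slaveDelayUs &&
                        (sinceRelease - slaveDelayUs) < presenceLowUs;
        return !slaveLow;
    }

    uint32_t slaveDelayUs;
    uint32_t presenceLowUs;
    uint32_t nowUs = 0;
    uint32_t releasedAtUs = 0;
    bool masterLow = false;
};
```

Sweeping `resetLowUs` across the 480-850 µs range and counting `StuckLow` results would show whether the failure correlates with the reset length at all.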

Also, after reading my code again, I think I should make some charts about the various interrupt and timer sequences because currently it's hard to follow... Maybe the bug would be easier to find by reviewing the code carefully, than by trying to reproduce it, I don't know.

FYI, I'm using a DS9490R master adapter here (almost the cost of a RaspberryPi! but I bought it before the Pi even existed). And I never observed my arduino stuck pulling low, even after almost one year of continuous service (it's managing my boiler installation for house heating). Though, now you mention it, I did have to unplug/replug my master adapter on two occasions, for unknown reasons, maybe that was the same issue, hard to say. If that happens again I'll check the bus state before resetting it all.

Oh, and one last thought: can you modify the Pi master library to change the reset duration? That could work around the issue for you, and also confirm whether the bug is triggered by a particular reset duration...

neuoy commented 7 years ago

Hi goial,

Commit https://github.com/neuoy/OneWireArduinoSlave/commit/a108089f02cabd236a53bbbccb96b730fe0f7b50 might fix your issue.

I'm closing this issue now, but if you have the opportunity to test again (with the fix) and see that it doesn't work, let me know so I can reopen the issue.