MCUdude / MiniCore

Arduino hardware package for ATmega8, ATmega48, ATmega88, ATmega168, ATmega328 and ATmega328PB
Other
995 stars 245 forks source link

missing `setWireTimeout` support #178

Closed greyltc closed 1 year ago

greyltc commented 3 years ago

The Wire library here seems to be missing the newish timeout features now present in https://github.com/arduino/ArduinoCore-avr that came with the latest v1.8.3 release there. These changes let I2C comms work without the danger of locking up your whole firmware when something unexpected happens on the bus and solves tens (hundreds?) of strange bugs reported there over the last decade or so.

The new Wire library had its API extended to include the following new functions for managing the timeout feature.

    void setWireTimeout(uint32_t timeout = 25000, bool reset_with_timeout = false);
    bool getWireTimeoutFlag(void);
    void clearWireTimeoutFlag(void);

(as you can see in https://github.com/arduino/ArduinoCore-avr/blob/master/libraries/Wire/src/Wire.h)

Would be neat to see that stuff make its way over here since I just learned the hard way that I can't call setWireTimeout() here!

MCUdude commented 3 years ago

Yup, that's something I haven't implemented yet. In the other side, if your i2c lines are held low by some kind of fault, you're pretty much screwed anyways. I'll see what I can do.

One thing that differs from the official core is that MiniCore does not add the -fpermissive flag when code is being compiled. Basically, the -fpermissive flag converts some errors into warnings, and in many cases makes buggy and/or poor code pass compilation. IIRC this flag was added to the official core by a mistake, and now it's too late to remove it.

MCUdude commented 3 years ago

This turns out to be a little more complicated that first intended, since the Wire library that's bundled with MiniCore is written in a way so that the ATmega328PB that has two i2c port can use the Wire1 library while still maintaining the class name (TwoWire) for compatibility with existing libraries (see #150). Maybe @asukiaaa could help me out with this one?

greyltc commented 3 years ago

held low by some kind of fault

Unfortunately, in practice, it's much more fragile than that (I wish that's what it actually took to lock up the old Wire library!). Unrecoverable problems in the old Wire library are generally caused by some sort of glitch or anything on the bus not doing what the master expects[1], which causes execution to get stuck in one of the many while loops in twi.c. That Wire library state machine is super fragile and probably deserves a rewrite from scratch.

[1]: after looking at these lockups pretty closely myself, I came to the conclusion that I was seeing bugs in the AVR silicon that were triggering the lockups

greyltc commented 3 years ago

Wire1 library

What's that? Sorry, I'm not familiar. Just the same as the normal Wire library only with the registers redefined to make use of another TWI module that hardware has?

MCUdude commented 3 years ago

What's that? Sorry, I'm not familiar. Just the same as the normal Wire library only with the registers redefined to make use of another TWI module that hardware has?

This is the Wire1 library. It's only for devices that has two hardware serial port, ATmega324PB and ATmega328PB. @asukiaaa pulled off a clever "hack" that lets the Wire and Wire1 libraries share the same class name even though Wire1/TWI1 is a different peripheral. This means that both statements below are valid:

void MyLibrary::begin(TwoWire &WireObject)
{
  WireObject.begin();
}

// ...

MyLibrary lib;
lib.begin(Wire);
lib.begin(Wire1);

But modifications like this comes at a cost. Improvements such as timeout are more difficult to implement.

greyltc commented 3 years ago

I see. Well it looks like twi1.{c,h} is the same as twi.{c,h} except the former has ~60 1 characters placed around strategically. No actual changes to any logic (except one mask in the ISR). So it shouldn't be too hard to re-derive an updated Wire1 from an updated Wire library.

MCUdude commented 3 years ago

I see. Well it looks like twi1.{c,h} is the same as twi.{c,h} except the former has ~60 1 characters placed around strategically. No actual changes to any logic (except one mask in the ISR). So it shouldn't be too hard to re-derive an updated Wire1 from an updated Wire library.

You're absolutely right. I'm on it.

MCUdude commented 3 years ago

I've now added the timeout functionality to Wire and Wire1, and they both compile without any errors or warnings. I have however yet to test them.

There are a few downsides to all this. The first one is size. a blank sketch for a 328P with LTO disabled is now 470 bytes larger than before. This is a total no-go for smaller devices such as ATmega48/88. The other one is performance.

greyltc commented 3 years ago

Hmm. Sounds unfortunate. I wonder if we can find a way to fix this. I made the patch that added the timeout stuff to arduino/ArduinoCore-avr so maybe I can get modifications through, any ideas?

greyltc commented 3 years ago

Maybe we could add a build flag that could disable the timeout feature for smaller devices? -D DISABLE_WIRE_TIMEOUT?

greyltc commented 3 years ago

@MCUdude If I tried to make some mods to the timeout feature to make it more small mcu friendly, would you be able to help me test those?

MCUdude commented 3 years ago

I'm not quite sure what to do. On one side I've never needed the timeout functionality, and there are more than 1000 units running this code 24/7, and I've never seen any of them fail due to i2c bus issues.

On another side, I do understand that it's a nice feature to have, but it comes at a cost of ~470 bytes and more code/instructions to chew for the microcontroller. I might end up creating some kind of Wire_timeout library where the timeout functionality is added, but it might be better to have an #define WIRE_TIMOUT_ENABLE that we have to explicitly define to enable timeouts.

Alien-x commented 2 years ago

Hey, I'm dealing with the same problem - want to take advantage of setWireTimeout functionality (I'm working on bus with a lot of potentially unreliable slaves) to prevent master stuck (or reset). Any progress from last year? :)

abrahmx commented 2 years ago

I want to use setWireTimeout as well, how can I use it? I updated the arduino IDE, but still I can't use this parameter. Thanks in advance

greyltc commented 2 years ago

@MCUdude I think the 470 bytes price is worth it.

gregelectric commented 1 year ago

Any updates on the timeout feature?

MCUdude commented 1 year ago

I just added TWI timeout functionality, but I've not released a new boards manger version yet, so you'll have to install MiniCore manually for it to work. It's disabled by default to save memory (almost 500 bytes when compiled without LTO), but it can be enabled by modifying the Wire_timeout.h file.

In PlatformIO, it will be as simple as adding -DWIRE_TIMEOUT as a build flag.

gregelectric commented 1 year ago

@MCUdude I manually installed the new minicore and defined WIRE_TIMEOUT however I still get a compile error (error: 'class TwoWire' has no member named 'setWireTimeout'; did you mean 'setTimeout'?). I also tried to comment out the conditional compile statements in both wire.c and wire.h but no luck. I removed the original version of minicore so I am sure that I am using the correct version. Any ideas what I could be doing wrong?

MCUdude commented 1 year ago

@gregelectric did you uncomment line 3 in Wire_timeout.h?

https://github.com/MCUdude/MiniCore/blob/30d7dfd70d83f2adc984f823cde5237010656db5/avr/libraries/Wire/src/Wire_timeout.h#L3

That's all I need to do. I'm able to compile the sketch below with timeout enabled, but not when timeout is disabled:

#include <Wire.h>

void setup()
{
  Wire.begin();

  Wire.setWireTimeout(100000);
}

void loop()
{

}

Can you turn on verbose compilation under the IDE settings and post the output?

I also tried to comment out the conditional compile statements in both wire.c and wire.h but no luck.

Apparently, Arduino IDE is using a different Wire library somehow

gregelectric commented 1 year ago

@MCUdude it appears to be using the correct wire libraries

"C:\\Program Files (x86)\\Arduino\\hardware\\tools\\avr/bin/avr-g++" -c -g -Os -w -std=gnu++17 -fno-exceptions -ffunction-sections -fdata-sections -fno-threadsafe-statics -MMD -mmcu=atmega328pb -DF_CPU=8000000L -DARDUINO=10816 -DARDUINO_AVR_ATmega328 -DARDUINO_ARCH_AVR "-IC:\\Users\\grege\\Documents\\Arduino\\hardware\\MiniCore-master\\avr\\cores\\MCUdude_corefiles" "-IC:\\Users\\grege\\Documents\\Arduino\\hardware\\MiniCore-master\\avr\\variants\\pb-variant" "-IC:\\Users\\grege\\Documents\\Arduino\\libraries\\TimerOne" "-IC:\\Users\\grege\\Documents\\Arduino\\hardware\\MiniCore-master\\avr\\libraries\\EEPROM\\src" "-IC:\\Users\\grege\\Documents\\Arduino\\hardware\\MiniCore-master\\avr\\libraries\\Wire\\src" "-IC:\\Users\\grege\\Documents\\Arduino\\hardware\\MiniCore-master\\avr\\libraries\\Wire1\\src" "-IC:\\Users\\grege\\Documents\\Arduino\\libraries\\Adafruit_VL53L1X\\src" "C:\\Users\\grege\\AppData\\Local\\Temp\\arduino_build_345259\\sketch\\mcu.ino.cpp" -o "C:\\Users\\grege\\AppData\\Local\\Temp\\arduino_build_345259\\sketch\\mcu.ino.cpp.o"
C:\Projects\keycafe_firmware\pi_console\keycafe_console5_mcu\mcu\mcu.ino: In function 'void setup()':
mcu:185:9: error: 'class TwoWire' has no member named 'setWireTimeout'; did you mean 'setTimeout'?
   Wire1.setWireTimeout(1000, false);
         ^~~~~~~~~~~~~~
         setTimeout
Using library TimerOne at version 1.1 in folder: C:\Users\grege\Documents\Arduino\libraries\TimerOne 
Using library EEPROM at version 2.0 in folder: C:\Users\grege\Documents\Arduino\hardware\MiniCore-master\avr\libraries\EEPROM 
Using library Wire at version 1.1 in folder: C:\Users\grege\Documents\Arduino\hardware\MiniCore-master\avr\libraries\Wire 
Using library Wire1 at version 1.0 in folder: C:\Users\grege\Documents\Arduino\hardware\MiniCore-master\avr\libraries\Wire1 
Using library Adafruit_VL53L1X at version 3.1.0 in folder: C:\Users\grege\Documents\Arduino\libraries\Adafruit_VL53L1X 
exit status 1
'class TwoWire' has no member named 'setWireTimeout'; did you mean 'setTimeout'?
MCUdude commented 1 year ago

Ah, yes! I've not yet implemented timeout for the Wire1 library just yet. It's on my todo list though.

gregelectric commented 1 year ago

@MCUdude But I get the same error for wire as wire1. I assumed wire1 inherited the function?

mcu:210:8: error: 'class TwoWire' has no member named 'setWireTimeout'; did you mean 'setTimeout'? Wire.setWireTimeout(1000); ^~~~~~ setTimeout

MCUdude commented 1 year ago

Thanks for pointing this out. There's indeed an issue here. I'm currently working on a fix so that TWI timeout at least works for TWI0. I'll push the fix very soon!

MCUdude commented 1 year ago

OK, it's ready. Please try again with the changes provided in commit 8f76b98!

gregelectric commented 1 year ago

@MCUdude not sure if I did something wrong but I cloned the head of the repo to get the latest changes C:\Users\grege\Documents\Arduino\hardware\MiniCore\avr\libraries\Wire1\src\Wire1.cpp:21:58: error: no matching function for call to 'TwoWire::TwoWire(int, void (&)(), void (&)(), void (&)(uint8_t), void (&)(uint32_t), uint8_t (&)(uint8_t, uint8_t*, uint8_t, uint8_t), uint8_t (&)(uint8_t, uint8_t*, uint8_t, uint8_t, uint8_t), uint8_t (&)(const uint8_t*, uint8_t), void (&)(uint8_t), void (&)(), void (&)(), void (&)(void (*)(uint8_t*, int)), <lambda(uint8_t*, int)>, void (&)(void (*)()), <lambda()>)' [](){ Wire1.onRequestService(); }); ^ In file included from C:\Users\grege\Documents\Arduino\hardware\MiniCore\avr\libraries\Wire1\src\Wire1.h:4:0, from C:\Users\grege\Documents\Arduino\hardware\MiniCore\avr\libraries\Wire1\src\Wire1.cpp:5: C:\Users\grege\Documents\Arduino\hardware\MiniCore\avr\libraries\Wire\src/Wire.h:126:5: note: candidate: TwoWire::TwoWire(int, void (*)(), void (*)(), void (*)(uint8_t), void (*)(uint32_t), uint8_t (*)(uint8_t, uint8_t*, uint8_t, uint8_t), uint8_t (*)(uint8_t, uint8_t*, uint8_t, uint8_t, uint8_t), uint8_t (*)(const uint8_t*, uint8_t), void (*)(uint8_t), void (*)(), void (*)(), void (*)(uint32_t, bool), void (*)(bool), bool (*)(bool), void (*)(void (*)(uint8_t*, int)), void (*)(uint8_t*, int), void (*)(void (*)()), void (*)()) TwoWire(int bufferLength, ^~~~~~~ C:\Users\grege\Documents\Arduino\hardware\MiniCore\avr\libraries\Wire\src/Wire.h:126:5: note: candidate expects 18 arguments, 15 provided C:\Users\grege\Documents\Arduino\hardware\MiniCore\avr\libraries\Wire\src/Wire.h:93:7: note: candidate: constexpr TwoWire::TwoWire(const TwoWire&) class TwoWire : public Stream ^~~~~~~ C:\Users\grege\Documents\Arduino\hardware\MiniCore\avr\libraries\Wire\src/Wire.h:93:7: note: candidate expects 1 argument, 15 provided exit status 1 Error compiling for board ATmega328.

MCUdude commented 1 year ago

I think I figured it out. I've just pushed a commit (a3ea55e) that enables timeout support for TWI1 as well. Would be great if you could give it a try and report back!

gregelectric commented 1 year ago

@MCUdude it now compiles and runs but I have some work to do before I can confirm it works. I will update you once I have confirmed it works as expected. Thank you very much for your support!

gregelectric commented 1 year ago

@MCUdude I could have something else wrong with my code but it does not appear to timeout. Unfortunately I do not have jtag or convenient debug ability on my hardware. The desire was to have firmware continue to function if an I2C peripheral failed however with the peripheral disconnected and the timeout enabled the firmware still gets stuck during initialization. If the sensor is connected or the sensor code is commented out then firmware does not get stuck. Wire1.setWireTimeout(25000, false);

MCUdude commented 1 year ago

Thanks for testing. I didn't have time to test the code on an actual ATmega328PB last night, but I'll see if I can find my test board and give it a try.

@gregelectric can you check if timeout works with Wire when using the ATmega328PB? There is a separate implementation for Wire1, so this will help me track down the issue.

gregelectric commented 1 year ago

@MCUdude unfortunately wire is a slave on my current hardware. I will try to find a board to test with.

greyltc commented 1 year ago

@MCUdude I could have something else wrong with my code but it does not appear to timeout. Unfortunately I do not have jtag or convenient debug ability on my hardware. The desire was to have firmware continue to function if an I2C peripheral failed however with the peripheral disconnected and the timeout enabled the firmware still gets stuck during initialization. If the sensor is connected or the sensor code is commented out then firmware does not get stuck. Wire1.setWireTimeout(25000, false);

@gregelectric have you tried calling setWireTimeout with reset_with_timeout = true?

gregelectric commented 1 year ago

@greyltc I have tried that but found no difference. I am currently testing timeout on wire on new hardware and will try both options.

MCUdude commented 1 year ago

I can confirm that timeout work with ATmega328PB when using Wire. However, I'm not able to get it working when using Wire1 for some reason...

gregelectric commented 1 year ago

@MCUdude thank you for the update. I am still debugging my test setup, something strange is happening, MCU is resetting constantly after ~20ms. I believe it is a compiler issue because older hex files work while a new blink test does not. Never experienced this before O_o

MCUdude commented 1 year ago

@gregelectric I'm not sure why your program suddenly doesn't work anymore. But I'm pretty sure I've resolved the problem with Wire1 + timeouts! It was a very simple fix (2864a8a), but it took me at least an hour to track down because the function pointers were pointing to the TWI0 timeout functions and not the TWI1's.

I can confirm that it works with my setup at least!

gregelectric commented 1 year ago

@MCUdude that is excellent. I went back to my working hardware setup but unfortunately it still locks up if the sensor is not connected. Even worse is that if the working sensor is polled in the main loop the system will get stuck again and that triggers the WDT. Is it possible that there is some interference between wire and wire1? I use wire as a slave and wire1 as master in my application. Unfortunately I use wire to debug also because there is no serial interface which I regret.

MCUdude commented 1 year ago

Is it possible that there is some interference between wire and wire1?

It shouldn't happen, but I don't know what the compiler does under the hood. Have you tried with or without LTO?

gregelectric commented 1 year ago

@MCUdude I have tried both with the same result (typically I have LTO disabled).

gregelectric commented 1 year ago

@MCUdude I found the cause of sensor failure on wire1. When going to sleep PRTWI1 was being disabled to save power and never enabled again! I tried 3 different libraries and the 3rd one supported a timeout and finally worked.

Thank you very much for all your support. What would we do without minicore?

MCUdude commented 1 year ago

MiniCore v2.2.0 is now released, with setWireTimeout support!

gregelectric commented 1 year ago

@MCUdude you are awesome :)