Testato / SoftwareWire

Creates a software I2C/TWI bus on every pins
GNU General Public License v3.0
148 stars 33 forks source link

i2c_xxx_lo() and i2c_xxx_hi() are not interrupt safe #5

Open bperrybap opened 7 years ago

bperrybap commented 7 years ago

The macros i2c_xxx_lo() and i2c_xxx_hi() are not interrupt safe can cause register corruption when the pins being used are in a resister that is also being used by another library that modifies the register at ISR level.

This is definitely the case on the AVR, and depending on the core implementation might also be true on ARM and Pic32 cores (should the code be updated to support the register widths to support those architectures)

The problem is that the macros use this construct:

*hwptr &= ~mask;
*hwptr |= mask;

to clear & set bits. The AVR has very primitive bit set/clear h/w. (ARM & PIC32 have better h/w). The AVR h/w implemented bit set/clear operations as bit set/clear instructions vs as bit set/clear registers. The AVR bit set/clear instructions are atomic; however, they come with a MAJOR limitation. That is you have to know the register and bit to form the instruction. The avr-gcc compiler has a kludge in it that will use the bit set/clear instructions to create the AVR atomic bit set/clear instructions when possible. But in order for it to be possible, the port address and bit has to be known at compile time. If they are, the compiler will generate AVR sbi or cbi instructions. If they are not, the compiler will generate multiple instructions which consist of

This is an interruptible sequence.

This can and eventually does cause register corruption when another library modifies the same register being updated at ISR. One such library that does this ISR register updates is the servo library. (IR library is another)

Here is an example of one of the timing/sequence that will create the corruption:

  1. SoftwareWire code uses i2c_sda_lo()
  2. SoftwareWire reads port register into tmp storage
  3. Interrupt occurs
  4. Servo library ISR runs and modifies a bit in the same port register use by sda pin
  5. control returns back to Software
  6. SoftwareWire modifies the bit in the tmp storage
  7. SoftwareWrite writes the tmp storage back to the port

WHAMO! the port register just got corrupted! This is because the port register was modified underneath the i2c_sda_lo() macro so it never saw the modification so it wrote servo library bit back to the way it was before the ISR modified it.

This can be fixed, but it isn't pretty. Given that the register and pin are configured runtime, the register and bit mask are determined runtime, so there is no way to use the AVR atomic bit set/clear instructions. The only way to ensure atomicity is to mask interrupts during the operation.

Here are some possible solutions:

Either of this will impact performance.

A simple alternative might be to put the atomicity code in the macros but then have a define that could be turned on to the either enable/disable the atomicity code.

For example, the atomicity code might be put into each macro but then be disable by default and then if there is an issue, the user could turn on a define in the library header to enable atomicity. Likewise, it could be enabled by default and for those that need/want more performance they could set the define to disable it. The second option is the most robust but it does cause a performance impact even if when the atomicity code is not needed.

There is no easy "it just works" solution.

Testato commented 7 years ago

In my tests, on the initial lib version, when i working on the first fork of the lib, i increased the speed from the original 10kHz until 160kHz on a 328p@16MHz The classic speed of i2c is 100kHz, so i think we have room for atomic implementation. The second way, by a removable define, is my pteferred.

Another possibility is try to modify the hi-lo pin macro to standard arduino methods (digitalwrite). I know that the speed is very impacted by digitalwrite, but we reach an incredible portability, so SoftwareWire will work immediately on all arduino core, included Arm, Mips, etc. If by using digitalwrite() we mantain at least 100kHz on a 328p, i thinks it can be a good solution.

bperrybap commented 7 years ago

You could make the declarations and macros smarter. i.e. they know about certain processors and do things specific to those processors but then fall back to a generic mode to allow portability across all processors. I've done this before. The method of indirect port i/o that the current macros is using actually works on all the processors. The issue is that the register and bit mask widths is not the same. AVR uses 8 bit and ARM and Pic32 use 32 bit. Unfortunately, there is no easy way to auto detect this given the stupid way the current Arduino macros and port/bit lookup tables are defined/declared. The code can be smart enough to pick the proper type/width for each processor based on processor type and then fall back to arduino digitalxxx() routines processors that don't have specific code.

Alternatively, to keep it a bit simpler, it could do what it is doing now for AVR, but have a fallback mode that uses digitalxxx() routines when not on avr.

The digitalXXX() routines don't have near the overhead on the other processors as the AVR so it is pretty decent compromise.

Koepel commented 7 years ago

@bperrybap, thank you for mentioning this. We would like to have 100kHz I2C for AVR chips. I prefer to use the AVR sbi and sti instructions ( http://www.nongnu.org/avr-libc/user-manual/group__avr__sfr.html ), but I don't know how to implement that for every pin. It might be fun to have the possibility for a faster I2C on specific pins as an extra, for example with fixed pins.

The digitalWrite(), pinMode() and digitalRead() could be a fall back. For other processors the OneWire.h has macros and inline functions that we can use. What if we include the file OneWire.h ( https://github.com/PaulStoffregen/OneWire/blob/master/OneWire.h ) in our library, and disable the interrupts every time the macros or inline functions are used ?

bperrybap commented 7 years ago

You can't use cbi/sbi instructions when the register and bit are not known at compile time. There are some games you can play to use lookup tables that can jump to a function that does the sbi/cbi instruction, and paul does some of that in his teensy version of digitalWrite() when the parameters are not compile time constants. However, there are some cases where even if you do that you can't guarantee atomicity since the register is outside the 0-255 address space so cbi/sbi instructions are not possible. This happens on the MEGA on some of the higher numbered pins.

The stuff in the OneWire header is what I was talking about where you set the width of the register based on the processor. This can work, but what makes it very messy and not full proof is that in some cases you really want to pick the width based on the Arduino core rather than the processor type and the Arduino IDE does not create a define for the Arduino core so you have to resort to using processor defines which can have issues when there are multiple cores for a given processor or processor type. (BTW, there are other defines you can use to determine the processor type or family, rather than check for specific processor, i.e. ARM etc...) I have been asking for an Arduino core define for more than 8 years but I've never been able to get the arduino team to buy into it. The latest IDEs do provide some of this type of information but the way they did it makes not easy to use for this type of stuff. They created a separate define for each one, vs a single define that contains the information.

What you have proposed (using a methodology of picking the register width based on processor) with masking interrupts, but also having a fall back to digitalWrite() (which I call Arduino core code mode - in my libraries that do this kind of thing), will work. However, I'd suggest doing some testing first to see if is worth the effort and the ongoing support. The reason being that there is much less improvement when using indirect port i/o than on the AVR. It would be useful to get the fallback mode working, and then measure the bit rate using a logic analyzer to see if it really needs the extra code to do the indirect port i/o on the non AVR processors.

I've done lots of looking at this kind of stuff over the years. In my openGLCD library which does lots of pin twiddling to run a GLCD, I opted not to do indirect port i/o for the non AVR processors.

But in fm's new LiquidCrystal library, I added the code to allow doing indirect port i/o for ARM and Pic32 with a fall back to arduino core code mode.

Having been through this over the years, I can say you will spend lots of time looking at compiler output and assembler listings and there will be surprises. The best thing you can do is to have a fall back mode so that for any case that you don't explicitly support, the code can fall back gracefully and still work. That and a logic analyzer will quickly become your best friend to look at timings.

In my openGLCD library I provide an override define that can be used to force the code to use the Arduino routines. This is an emergency switch that a user can flip if they are running into an odd situation that "breaks" things.

Koepel commented 7 years ago

These are my fall back macros. Perhaps the OneWire.h has not the ideal functions, since the I2C can be defined with or without internal pullup resistors.

define i2c_sda_lo() digitalWrite(_sdaPin, LOW); pinMode(_sdaPin, OUTPUT)

define i2c_scl_lo() digitalWrite(_sclPin, LOW); pinMode(_sclPin, OUTPUT)

define i2c_sda_hi() pinMode(_sdaPin, _pullups ? INPUT_PULLUP : INPUT)

define i2c_scl_hi() pinMode(_sclPin, _pullups ? INPUT_PULLUP : INPUT)

define i2c_sda_read() digitalRead(_sdaPin)

define i2c_scl_read() digitalRead(_sclPin)

bperrybap commented 7 years ago

If you create a branch for this development, I'll be happy to test it out on Teensy 3.0/3.1/3.2/LC & Chipkit Uno32 with my hd44780 library.

Koepel commented 7 years ago

There are some issues, for example the repeated start does not work. I started to fix that with a test version on my own computer, but then the clock duty cycle got bad. Testato worked so hard on that, to be able to make it as fast as possible.