arduino / ArduinoCore-avr

The Official Arduino AVR core
https://www.arduino.cc
1.24k stars 1.05k forks source link

I2C/Wire library: Make Wire library non-blocking #42

Closed wmarkow closed 2 years ago

wmarkow commented 6 years ago

This is a placeholder issue to cover an arduino/Arduino/issues/1476 improvement.

It looks like arduino/ArduinoCore-avr repository is the correct one to cover that case, doesn't it?

Aleev2007 commented 3 years ago

@Greyltc Hi, about other places - Ctrl-C, Ctrl-V :-D At the expense of everything else - WHY? Ideally, this line of code should NEVER be executed to the end, just checking the time. Why engineering units? Are you going to measure something in timeouts? ))) Why setting a timeout? Does it make a difference when you get an error? ))) It's just a hardware library. The user should not think about what should not happen in principle. But writing infinite loops is not good! )))

greyltc commented 3 years ago

This I2C library could probably benefit from a rewrite from scratch. Why don't you give it a try?

Aleev2007 commented 3 years ago

@Greyltc I can try. I need a link to the latest version without a timeout.

bperrybap commented 3 years ago

@Aleev2007 If you clone the repository, you should get the full branch & commit history to be able to see everything.

Aleev2007 commented 3 years ago

@bperrybap Well there is no way ))

per1234 commented 3 years ago

One line of code in the library is enough. )))

That's not one line of code. That's 4 lines you mashed all together. Please don't do that.

Well there is no way ))

Git is very well documented. This is not the appropriate place for us to mentor you in its use. You are welcome to ask for assistance on the Arduino Forum. I'll be happy to help you out over there. https://forum.arduino.cc/

Aleev2007 commented 3 years ago

I looked carefully at the whole wire and twi code. I can't write you better. The only advice I can give is that the timeout should be hardcoded, as I mentioned above. This will increase productivity. The thing is that this library has big design problems. Let me explain with an example: Let's imagine SDA and SCL are rails. TWI Unit is a train. ISR (TWI_vect) is a working station. twi.h and twi.c is a train station dispatcher. wire.h and wire.cpp is a shipping agency. txBuffer and rxBuffer are two different warehouses with shipping agencies. UserAplication.ino is the customer. The customer calls the agency. He says I have a warehouse. there are 10 cows on it. They need to be sent to the UserAplication2 station. The agent replies, okay. But we don’t carry that much, and in general you must transport all your goods to our warehouse. And divide it into carriages of 1/4 of a cow, and more than 8 cows will not fit us. Customer, approx. I brought everything to your warehouse. Submit. The agent is good. We send. Wait ... The agent calls the dispatcher at the station. We have cargo here at the UserAplication2 station, prepare the train. You will pick up everything from our warehouse. And you are already sending the first car. The dispatcher announces to the entire station, urgently load 1/4 of the cow into one carriage and send the train. Report when ready ... The workers at the station - hell, there is no train yet, since last time. Let's go wait on the platform ... At this time the Train - damn it, the rails are broken, I can't get through ... At this time, the Customer - damn it !!! How long can you wait !!! After a while, Dispatcher, I'll go and look at the clock at the STATION. It seems that the Workers take a long time to load 1/4 of the cow. Leaves ... ... is coming back. And for sure, a long time. Probably, the rails broke again, or I was in a hurry, or maybe there were flashes in the sun. The Agent picks up the phone and answers - the cargo is not loaded and not sent, for a long time. The Agent informs the Customer, the train was delayed, they could not send anything, let’s start all over again. By the way, you can choose the waiting time. Bring the cows, while we reload the train here ... Customer 8- (

Sorry for my bad english :)

bperrybap commented 3 years ago

As is always the case, more functionality and more robustness often comes at a cost. It can often be any combination of code size, data size, or performance.

I would caution everyone not to dismiss Aleev2007's concerns over i/o throughput/performance The updated Wire code does impact i/o performance and can potentially have a noticeable impact on i/o performance. The amount of degradation will vary depending the application. And there is a degradation of i/o performance even when the timeout is not enabled i.e. when not using timeouts, the code is slower with no new functionality.

I just did a test to see how the new Wire code affects single byte reads & writes. I used a simple sketch that does multiple (1000) single byte operations to get an average byte i/o time for both reads and writes using a PCF8574. For more accurate numbers a logic analyzer should be used - I'll probably do that later and also compare the AVR to other processor implementations at the actual h/w level.

This initial test comparison was for IDE 1.8.12 vs 1.8.13 using a 16Mhz AVR "uno" type board hooked up to a PCF8574. I tested 1.8.13 with the new loop timeout disabled, which is the default, and then when enabled since there is more overhead when loop timeouts are enabled. I also set the i2c SCK clock speed to 400 kHz

Here is the timing: Single byte Write timing to PCF8574 using 400kHZ SCK

1.8.12    1.8.13(no timeout)   1.8.13(with timeout)
77us      84us                 96us

Single byte read timing from PCF8574 using 400kHZ SCK

1.8.12    1.8.13(no timeout)   1.8.13(with timeout)
118us     129us                154us

What you see is that even when the loop timeout is not enabled, single byte writes slowed down 9% When timeouts were enabled single bytes writes slowed down nearly 25% For reads it is a bit worse. When loop timeout is not enabled, single byte reads slowed down 9.3% When timeouts were enabled, single byte reads slowed down nearly 31%

The amount of noticeable performance hit will depend on the application. Applications that do many single byte i/o operations will be affected the most, particularly when the timeout is enabled.

As another point of comparison, for an actual real world test I used the LCDiSpeed test from my hd44780 library. For this particular test I used a LCD with a PCF8574 backpack. The test measures the actual time it takes for a byte to travel from the sketch, through the Print class, through the hd44780 library, the hd44780_I2Cexcp i/o class, then through several writes through the PCF8574 to get a single byte to the LCD.

1.8.12    1.8.13(no timeout)   1.8.13(with timeout)
199us     205us                219us

The write throughput to the LCD slowed down about 3% when going from 1.8.12 to 1.8.13 with timeouts disabled. When the timeouts were enabled it slowed down about 10%


I think we need to be careful that the added overhead of doing the timeouts is worth it. For some applications it may not be particularly since there is no way for the sketch to eliminate it to get back to the timing before the timeout code was added. My concern would be for things like i2c based graphic LCDs since graphical updates can be very i/o intensive.

I think there may be a way to tweak the implementation for a compromise which may satisfy both ends of the spectrum at the expense of a small bit of code size. I believe that the code could be re-written to optimize the path when timeouts are not implemented to get the timing back to very close to what it was before the timeouts were added. It would require changing the code to split out each loop into multiple loops so that some of the checks and calls are only done once before / upfront to select which loop to use vs always being done including inside the loop that handles the timeout. Also, perhaps there could be an option to provide simplistic non configurable timeouts similar to what Aleev2007 is asking for.

effectively 3 "modes" that requires 3 loops for each current loop.

At a minimum, it seems reasonable to modify the code to split the code into two loops with / without timeouts so that the i/o timing without timeouts enabled returns back to very close to what it was before the timeout code was added.

bperrybap commented 3 years ago

Note: the second set of i/o numbers (the slightly slower ones) were for single byte reads not writes. I updated the comment on the github page.

Aleev2007 commented 3 years ago

@bperrybap

As is always the case, more functionality and more robustness often comes at a cost. It can often be any combination of code size, data size, or performance.

I can suggest a design option that can achieve greater functionality, performance and reliability. And most importantly ease of use. The user only needs 6 basic Wire functions:

bool Wire_is_busy();
void Send_Master_to_Slave(sleve_address, *send_data, send_data_size);
void Reqest_Master_from_Sleve(sleve_address, *receive_data, receive_data_size);
void Send_and_Reqest_Master(sleve_address, *send_data, send_ data_size, *receive_data, receive_data_size);
void Receive_Slave_from_Master(*receive_data, receive_data_size);
void Answer_Sleve_to_Master(*send_data, send_data_size);
int get_Wire_error();

Wire.cpp prepares a request to send / receive data, for example:

void Send_Master_to_Slave(sleve_address, int){ 
Send_Master_to_Slave(sleve_address, @int, sizeof(int));
}

and gives the full task to the driver twi.c.

The twi.c driver configures the TWI Unit as needed and starts the session on its own. Sending, resolving collisions, setting the busy flag and terminating the communication session occurs exclusively in the ISR (TWI_vect);

This eliminates the need to implement a timeout in the twi.c driver. Even if the TWI Unit freezes, then it will not come from the interrupt request. It just won't clear the BUSY flag. This will alert the user that something went wrong. and he himself can decide what to do.

If there were no freezes and the communication session ended. The twi.c driver should analyze the success of the data transfer. And in the case, for example, if not all data has been transferred. Record the error code. Which the user can get through get_Wire_error ();

As a result, we get the following: The user, after sending the data, can do something else. Later he can check the result of sending. Bus errors do not stop the MCU. The limitation on the amount of information transmitted at a time is removed. Buffers do not take up memory. The user can send a string to the screen or receive data from the sensor with one command.

bperrybap commented 3 years ago

@Aleev2007 There are many ways to design an i2c library and different API that has the potential to be better than the existing Wire library if starting from scratch. Unfortunately, the Wire library does not have that luxury since the Wire API is well established and exists across many platforms, including many 3rd party platforms. Backward compatibility with the existing API to support existing applications that use the Wire library is very important.

IMO, the only way to move to an i2c library API that is semantically different than the existing API, would be to create an entirely new library where there is complete freedom to design it from scratch and no restrictions of backward compatibility. Lots of things could be done differently, including handling multiple buses, and potentially asynchronous i/o that occurs in the background.

A new ic2 library with a new API is very different from making an incremental change to the current Wire library to try to make the existing Wire library a bit better, which is what this issue and the corresponding loop timeout code is about.

Aleev2007 commented 3 years ago

@bperrybap

Unfortunately, the Wire library does not have that luxury since the Wire API is well established and exists across many platforms, including many 3rd party platforms. Backward compatibility with the existing API to support existing applications that use the Wire library is very important.

That is why I urge you to stop before it's too late. All you have to do is put on the fuse. And in the event of a short circuit, it should work. And it's all! Choose a sufficient timeout that will satisfy 99% of users and hard code it. Let it be 1 second. Then the cycle is interrupted and the error flag is set. And close this topic for good. Otherwise, you will declare after 10 versions that you need backward compatibility to this useless new functionality. Which in 99% of cases only slows down the system. The remaining 1% of those who need to change the size of the timeout, moreover in engineering units, can use alternative versions or simply fix the constant in the twi.c. file.

Aleev2007 commented 3 years ago

Anyone trying to use the I2C protocol to poll a bunch of sensors scattered throughout the room. And people's lives depend on the performance of the device. I say - Stop it! For such things, the CAN bus was invented long ago. I2C works well only within a board, which is closed in an iron box.

May the force be with you. ;))

greyltc commented 3 years ago

@bperrybap Thanks for doing some performance comparisons! Could you try this patch to see if it improves the performance of your 1.8.13(no timeout) test case?

ermtl commented 3 years ago

@greyltc : In addition to the conditional call to micros() in your patch, another easy way to speed things up would be to only initialize startMicros once so that the timeout delay is used for the whole twi_readFrom function and not reinitialized for each of the 3 places where a timeout could occur. This way, 2 calls to micros() are removed enhancing both speed and code size. It would also guarantee that the function exits after the timeout delay and not something between 1x and 3x that delay.

greyltc commented 3 years ago

@ermtl great point. Your suggestion is now incorporated into the above patch for both twi_readFrom and twi_writeTo via https://github.com/greyltc/ArduinoCore-avr/commit/588135c227cae9771170472918c66c94f5505775

Aleev2007 commented 3 years ago

@greyltc

I am Russian, and therefore it is difficult for me to convey an important message to you. Everything you are doing now is making a big mistake! Everything worked fine except for the I2C speed of 0 kbit / s. Stop writing code. The user does not need it. You just need to write one line in three places. It will prohibit the speed 0 kbit / s. Just do it. Good luck.

Aleev2007 commented 3 years ago

@greyltc If you think I'm an idiot. so be it. I'll give you a quick fix. For better library performance.

in two places you use a construction like this:

  // activate internal pullups for twi.
  digitalWrite (SDA, 1);
  digitalWrite (SCL, 1);

  // deactivate internal pullups for twi.
  digitalWrite (SDA, 0);
  digitalWrite (SCL, 0);

It doesn't make sense as these pins cannot be PWM. Directly set the state of the conclusions.

// declare
volatile uint8_t *out;
// activate internal pullups for twi.
out = portOutputRegister (digitalPinToPort (SDA)) 
*out |= (digitalPinToBitMask (SDA) | digitalPinToBitMask (SCL));

// declare
volatile uint8_t *out;
// deactivate internal pullups for twi.
out = portOutputRegister (digitalPinToPort (SDA)) 
*out &= (~digitalPinToBitMask (SDA) & ~digitalPinToBitMask (SCL));
bperrybap commented 3 years ago

@greyltc yes, that patch improves things, but only a slight amount (~ 3%) and then there is a similar penalty when timeouts are enabled. using 32 bit values hurt on an 8 bit AVR. IMO, that patch is not a good way to try to restore performance as it is not an overall improvement since it hurts the i/o timing even more than what it was before when timeouts are enabled.

What I mentioned before is that the code be implemented in a way to quickly select between multiple loops to be able to provide similar timing when timeouts were disabled. That would be something like this, where twi_timeout_enabled is a 8 bit flag that gets set when timeouts are enabled. The twi_timeout_enable flag is NOT the actual timeout value.

  // wait until twi is ready, become master receiver
  if (twi_timeout_enabled)
  {
    startMicros = micros();
    while(TWI_READY != twi_state)
    {
      if(((micros() - startMicros) > twi_timeout_us))
      {
        twi_handleTimeout(twi_do_reset_on_timeout);
        return 0;
      }
    }
  }
  else
  {
    while(TWI_READY != twi_state) {}
  }

That said, I would not jump on this as the solution as I think the entire timeout stuff needs more thought and analysis as I think the overall goal should be to come up with a way to avoid the loop lockups without hurting i/o throughput too much - which is the overall comment from @Aleev2007

I haven't looked that closely at the code and the specific i2c state for each state wait loop but I'm guessing that it may be possible that not all the timeouts should be or need to be the same. I'm also not sure that it would be a good idea to change to the timeout being for the entire transaction as @ermtl suggested vs doing a timeout on the specific state transition as that introduces potential timeouts when there is no real issue on the bus if the transaction is running slow due to multiple masters, or a slave doing clock stretching.

There is definitely an art to doing performance optimization. I've been involved with embedded code projects for 40+ years. Projects from fault tolerant / must always run systems, to ones in products that were deployed by the 10's of millions. The most important thing to always keep in mind when optimizing is to not optimize the code but the system. What I mean by that is that when doing optimization you need to optimize the entire system, not a tiny piece of code. And to understand how to optimize the system, the system must be fully understood and then profiled. Without some amount of profiling, there is no way to really know where the overhead really is so trying to do anything is just trial and error guess work - which I am really opposed to. The type of optimization is does not matter. i.e. you can optimizing for speed, code size, or ram usage. In some cases it may require doing things differently vs trying to speed up what is currently being done.

I think the best comment so far was your comment

This I2C library could probably benefit from a rewrite from scratch

This library is definitely a bit like a Frankenstein. Lots of odd parts bolted together and some don't fit together very well. Things like some of the double buffering between Wire.cpp and twi.c to name one area.

However, I do think that some improvements could be made

For the loop timeout code, if I were doing it, I'd definitely be striving to see if there was a way to avoid 32 bit values for an AVR implementation. i.e. is 1 us timeout resolution really necessary? Could 1ms resolution be good enough? (micros() has more overhead than millis()) Could 255ms be enough for a state change loop timeout? As @Aleev2007 suggested, do users really need to configure the timeout?

For timeouts using ms there are games that could be played using millis() and truncating the return value to just the lower 8 bits to allow 8 bit math and compares if 255ms were long enough. And for longer than 255ms a loop around that could be done.

IMO, Whatever the implementation choices are, before these type of code changes are ever committed and made available to users, the implementer should be fully profiling it to know any/all new issues, including code and ram usage changes and i/o performance should be measured with a test sketch, logic analyzer and/or scope to know the full implications of the changes to i/o performance.

bperrybap commented 3 years ago

Guys, This seems to have entered into a technical discussion about implementation. Would this better handled on the mailing list rather than through a git issue?


From a larger picture, all these fast and furious incremental code updates and suggestion for updates are, IMO, hacks. i.e. coding on the fly while sitting at a keyboard, with no overall design and not enough thought behind them. I am not in favor of using this type of methodology, particularly in a piece of code that is bundled with the IDE and used by lots of users and 3rd party libraries.


@Aleev2007

I2C speed of 0 kbit / s ?

I have no idea what you describing.

In terms of inlining the i/o register updates vs using digitalWrite() Why the desire for the change? Particularly when what you are proposing can cause register corruption.

In this case, using digitalWrite() is better than what you have proposed. You need to understand that digitalWrite() ensures atomic register updates. The code you provided does not. Without atomic register updates, this code could corrupt a register if the same register were used by and updated by another piece of code (say a library) that is modifying the same register in an ISR. Even that other code calls digitalWrite() in the ISR, your proposed code could still corrupt the register since it does not update the register atomically. The digitalWrite() calls in the twi code are only being used in non time critical portions of the code (initialization & error recovery) that are not in the real time path for i/o so it will have absolutely no effect on normal i/o performance so there is no need to optimize that for speed. Because of all this, it is better to stick with using the portable pin i/o routines that ensure that registers will not be corrupted.

ermtl commented 3 years ago

@greyltc : Further speedup could be obtained by adding a byte loop counter. At first, and while the loop counter has not reached 0, the while statement should execute nearly as fast as the old version without timeout. The value for the byte loop counter (here 48 @ 16MHZ, should be tweaked) must be high enough to catch most of the fast peripherals without slowdown as the counter does not reach 0 and the time consuming calculation is never done. If the delay is longer and the counter elapses, it falls back to the more precise but time consuming current timeout routine. This would create a lower limit of a few tens of microseconds to the minimum timeout value, this should not be a problem since even a short timeout should be much longer than this. If the programmed timeout is shorter than the loop counter time, the routine will exit after the first test once the counter reaches 0.

uint8_t cnt=3*clockCyclesPerMicrosecond();  // Speedup loop counter. Should be long enough for most devices to respond, but not overflow the 8 bits value on fast processors
while(TWI_READY != twi_state){
  if (cnt) {cnt--;} else {
    if((twi_timeout_us > 0ul) && ((micros() - startMicros) > twi_timeout_us)) {
      twi_handleTimeout(twi_do_reset_on_timeout);
      return 0;
    }
  }
}
Aleev2007 commented 3 years ago

@bperrybap

Even that other code calls digitalWrite() in the ISR, your proposed code could still corrupt the register since it does not update the register atomically.

You're right. It will be correct.

#include < util / atomic.h >

// declare
volatile uint8_t *out;
// activate internal pullups for twi.
ATOMIC_BLOCK (ATOMIC_FORCEON){
  out = portOutputRegister (digitalPinToPort (SDA)); 
  *out |= (digitalPinToBitMask (SDA) | digitalPinToBitMask (SCL));
}

// declare
volatile uint8_t *out;
// deactivate internal pullups for twi.
ATOMIC_BLOCK (ATOMIC_FORCEON){
  out = portOutputRegister (digitalPinToPort (SDA));
  *out &= (~digitalPinToBitMask (SDA) & ~digitalPinToBitMask (SCL));
}

I2C speed of 0 kbit / s ?

I have no idea what you describing.

I2C-bus specification and user manual 4.2.1 I2C/SMBus compliancy SMBus and I2C protocols are basically the same: A SMBus master is able to control I2C devices and vice versa at the protocol level. The SMBus clock is defined from 10 kHz to 100 kHz while I2C can be 0 Hz to 100 kHz, 0 Hz to 400 kHz, 0 Hz to 1 MHz and 0 Hz to 3.4 MHz, depending on the mode. This means that an I2C-bus running at less than 10 kHz is not SMBus compliant since the SMBus devices may time-out. NXP Semiconductors UM10204 User manual Rev. 6 — 4 April 2014 page 33 of 64 link

I suggest solving the problem in a similar way. Specify in the specification for the library the minimum baud rate of 1 Hz. And hard-code it without the ability to change by the user by means of the library.

Aleev2007 commented 3 years ago

@greyltc : Further speedup could be obtained by adding a byte loop counter. At first, and while the loop counter has not reached 0, the while statement should execute nearly as fast as the old version without timeout. The value for the byte loop counter (here 48 @ 16MHZ, should be tweaked) must be high enough to catch most of the fast peripherals without slowdown as the counter does not reach 0 and the time consuming calculation is never done. If the delay is longer and the counter elapses, it falls back to the more precise but time consuming current timeout routine. This would create a lower limit of a few tens of microseconds to the minimum timeout value, this should not be a problem since even a short timeout should be much longer than this. If the programmed timeout is shorter than the loop counter time, the routine will exit after the first test once the counter reaches 0.

uint8_t cnt=3*clockCyclesPerMicrosecond();  // Speedup loop counter. Should be long enough for most devices to respond, but not overflow the 8 bits value on fast processors
while(TWI_READY != twi_state){
  if (cnt) {cnt--;} else {
    if((twi_timeout_us > 0ul) && ((micros() - startMicros) > twi_timeout_us)) {
      twi_handleTimeout(twi_do_reset_on_timeout);
      return 0;
    }
  }
}

Obviously, this (TWI_READY != twi_state) check is done to make sure the second device is ready to receive / transmit. If the device is not ready, then it holds the line, and the loop body starts executing. If the loop body takes much longer than the time it took to prepare the second device, it will slow down the transfer rate.

Aleev2007 commented 3 years ago

It seems that not everyone understands what. To reach 400kHz speed. The driver must have time to process the incoming or prepare the outgoing byte in 40 clock cycles (16 MHz / 400 kHz = 40). This also includes the time to check the timeout. In the latest driver implementation, with blackjack and whores, this is far from the case. :))

greyltc commented 3 years ago

I think further convoluting this issue with discussion of the throughput performance of the I2C timeout implementation we've used is not such a great idea. I understand it's only still open because there's still some documentation left to do (which of course is the boring part, so that lingers).

If anyone has a specific I2C performance problem that they'd like to raise, it seems best to file a new issue to cover that. Feel free to tag me there, especially if it's related to the newish timeout stuff and I'll do my best to chime in.

If anyone has a specific idea for making an improvement to the library, you could submit that as a PR and we can discuss that specific change there. Though in making your PR, consider that API changes need to present some pretty compelling evidence to have a chance at being merged, especially backwards incompatible ones (and rightly so). Maybe less formal ideas for improvement could be hashed out on the forums or something before they're ready for a PR?

I definitely think some of the specific performance improvement ideas that we've thrown around here show promise and I'll try to find some time to do some apples to apples oscilloscope informed benchmarking on them myself and submit them as a PR if I think the numbers I see warrant that. I'd tag a few of you on such a PR so we could review that together.

Aleev2007 commented 3 years ago

@bperrybap You can test the speed of this https://github.com/Aleev2007/twi no micros(), no uint32_t, fully compatible with the current version.

bperrybap commented 3 years ago

@Aleev2007 The approach does appear to offer faster state transition detection but it still would need some updates to be fully compatible. While the code may be API compatible, it is not functionally compatible. The timeout period will vary depending on the CPU clock speed and likely depending on how the optimizer optimizes the counter variables for the counter decrements and goto operations given the 8 bit counters may or may not be in a register. The CPU clock speed affecting the timeout could be resolved with some compile time conditionals and macros to modify the delay & counter values depending the value of F_CPU. It could be a simple as just modifying the timeout value down in twi_setTimeoutInMicros() to scale the timeout value used based on F_CPU to set the counter values to offer the requested timeout consistently. I would think it would be simpler and easier to maintain if it were all handled in twi_setTimeoutInMicros() That way begin() in Wire.cpp would call twi_setTimeoutInMicros() to set the default timeout and then there is no need to pre-calculate 8 bit counter values. However it is done, I think it is important that the API and the underlying code ensure that the timeout for a given timeout value passed to twi_setTimeoutInMicros() not be affected by the CPU clock speed. i.e. if 25000 is passed to twi_setTimeoutInMicros() the timeout is the same regardless of CPU clock speed.

The way to work around the potential optimization issues, would be to declare the 8 bit counters volatile, so that consistent code is always generated.

Also there is still a 32 bit comparison in the runtime loop of when timeouts are enabled. 32 bit comparisons are expensive on the AVR. i.e. this line is doing a 32 bit comparison: if (twi_timeout_us == 0) goto check;

To speed this up an additional 8 bit flag could be used to indicate that timeouts are enabled (set to true if twi_timeout_us is non zero. or since twi_timeout_us is not used for anything other than this test after twi_setTimeoutInMicros() just get rid of it and have simple 8 bit flag to indicate that timeouts are set. It would be set if twi_setTimeoutInMicros() is passed a non zero timeout. That flag could be used in the loop and tested to see if the 3 counters need to be loaded/initialized.

Aleev2007 commented 3 years ago

@bperrybap You are absolutely right. I know that you need to add the twi_setTimeoutInMicros () function. And at the expense of the line if (twi_timeout_us == 0) goto check; it is probably not needed at all, even if there is a flag. I don't see any scenarios where you need to turn off the timeout. You can just make it even bigger. And without this line it will work faster. two "if" operators to disable three "=" operators I need to look at the generated code to determine which is better. I'll try to do it tomorrow.

Aleev2007 commented 3 years ago

@bperrybap I think you can try to measure the speed. https://github.com/Aleev2007/twi

timeout switch now uses only 10 MSU ticks ;-)

if (!twi_timeout_off_flag){                                                           |ticks
 19c:   80 91 02 01     lds r24, 0x0102 ; 0x800102 <twi_timeout_off_flag>         | 2
 1a0:   81 11           cpse    r24, r1                                                   | 1  timeout off | 2 timeout on
 1a2:   09 c0           rjmp    .+18        ; 0x1b6 <twi_stop+0x2c>                   | 2  goto check; | 0 timeout on     [off +5] 
    counter_1 = set_1;                                                                    |
 1a4:   80 91 b7 01     lds r24, 0x01B7 ; 0x8001b7 <set_1>                        | 2                     
 1a8:   89 83           std Y+1, r24    ; 0x01                                    | 2 
    counter_2 = set_2;                                                                    |
 1aa:   80 91 01 01     lds r24, 0x0101 ; 0x800101 <set_2>                        | 2
 1ae:   8a 83           std Y+2, r24    ; 0x02                                    | 2
    counter_3 = set_3;                                                                    |
 1b0:   80 91 00 01     lds r24, 0x0100 ; 0x800100 <__data_start>                 | 2
 1b4:   8b 83           std Y+3, r24    ; 0x03                                    | 2                     [on +16] 
  }                                                                                       | 
check:                                                                                    |                                        
  if (!(TWCR & _BV(TWSTO))) goto go;                                                      | 
 1b6:   80 91 bc 00     lds r24, 0x00BC ; 0x8000bc <__DATA_REGION_ORIGIN__+0x5c>  | 2
 1ba:   84 ff           sbrs    r24, 4                                                    | 1  goto go;    | 2  wait
 1bc:   1c c0           rjmp    .+56        ; 0x1f6 <twi_stop+0x6c>                   | 2  goto go;    
  if (twi_timeout_off_flag) goto check;                                                   |
 1be:   80 91 02 01     lds r24, 0x0102 ; 0x800102 <twi_timeout_off_flag>         | 2
 1c2:   81 11           cpse    r24, r1                                                   | 1  timeout off | 2 timeout on
 1c4:   f8 cf           rjmp    .-16        ; 0x1b6 <twi_stop+0x2c>                   | 2  goto check; | 0 timeout on     [off +10]  
  if (!(--counter_1)) goto check;                                                         |
 1c6:   89 81           ldd r24, Y+1    ; 0x01                                    | 2
 1c8:   81 50           subi    r24, 0x01   ; 1                                       | 1
 1ca:   89 83           std Y+1, r24    ; 0x01                                    | 2
 1cc:   88 23           and r24, r24                                                  | 1
 1ce:   99 f3           breq    .-26        ; 0x1b6 <twi_stop+0x2c>                   | 2  goto check; | 1 counter = 0    [on +28] [1 loop 16 ticks]
  if (!(--counter_2)) goto check;                                                         | 
 1d0:   8a 81           ldd r24, Y+2    ; 0x02                                    | 2
 1d2:   81 50           subi    r24, 0x01   ; 1                                       | 1 
 1d4:   8a 83           std Y+2, r24    ; 0x02                                    | 2
 1d6:   88 23           and r24, r24                                                  | 1
 1d8:   71 f3           breq    .-36        ; 0x1b6 <twi_stop+0x2c>                   | 2  goto check; | 1 counter = 0      
  if (!(--counter_3)) goto check;                                                         |
 1da:   8b 81           ldd r24, Y+3    ; 0x03                                    | 2
 1dc:   81 50           subi    r24, 0x01   ; 1                                       | 1
 1de:   8b 83           std Y+3, r24    ; 0x03                                    | 2
 1e0:   88 23           and r24, r24                                                  | 1
 1e2:   49 f3           breq    .-46        ; 0x1b6 <twi_stop+0x2c>                   | 2  goto check; | 1 timeout                 
bperrybap commented 3 years ago

Coming back to the orignal topic of eliminating the potential of a loop lockup in the Wire library,

Having looked at the 1.8.13 "timeout" code now in a bit more detail, IMO, I think the current code was put together a bit too hurried and didn't get enough design input/review/thought or testing by technical users that understand things at this level as well as API portability issues before it was released in the 1.8.13 IDE. As an example, one big issue was that as the code was released in 1.8.13, it extended the WIRE API in a non detectable way, with timeouts disabled, which means libraries/sketches that wanted timeouts had to hard code things for this specific library and would then break when used with the older versions of the Wire library, including on the very same platform.

I think it would be useful for everyone to step back and think about the loop lockup issue and look at the lockup/timeout problem with a fresh set of eyes.

IMO, the current implementation is a bit too heavy so it impacts i/o performance a bit too much. The other thing I think that is worth REALLY thinking about is how to actually handle the loop timeouts. IMO, 1 us timeout resolution is not needed and is total overkill. Can 1 us even accurately ever truly be supported, particularly on slower processors? I think for a configurable timeout, 1ms would be more than good enough. This "timeout" is a failsafe to avoid a lockup loop in the code. It is something that normally should not ever happen. The length of the timeout in no way affects the i/o performance, so it isn't something that needs to be tuned. However, how the loop timeout code is implemented does affect i/o performance. In fact how much configurability of the timeout is really needed? There is no spec for how long to wait for these events, but there are certain minimums that are required for some of these, like when multiple masters are used, that are currently not enforced. How will a typical Arduino user ever know what to pick? They won't. The vast majority will just run with whatever the default timeout is. If at some point a timeout is enabled by default, the vast majority of users will never even mess with setting timeout values.

Would a simple hard coded non configurable timeout, on the order of say 1/2 or 1 second, be good enough? IMO, it likely would be good enough.

Since the WIRE library and WIRE API is somewhat a universal API implemented across nearly all platforms, running on processors at various speeds with various resources, the timeout functionality and any new WIRE API functions should carefully consider their impact on other platforms or slower processors. For example, what about a slower processor or a software Wire implementation? Those will likely never be able to support accurate 1us timeout resolution. Also, other platforms may not have the loop lockup situation like the AVR WIRE library did.

In other words would it be better in the grand scheme of things to have a simpler implementation that could be adopted more widely? vs trying to make something that has such granular configurability?

I bring this up because if it becomes desirable to make some significant changes to how loop timeouts are handled to simplify things or offer better performance, now would be the time to do it, before the existing 1.8.13 "timeout" code and API starts to get documented and gets too many dependencies.


Anybody that is working on this type of code and making these types of changes should be capable of testing their stuff themselves first before they make it available to others. Like said earlier I absolute HATE trial & error fly by the seat of the pants keyboard hacking and, IMO, that is what I see going on.

@Aleev2007 You have made significant changes to the low level twi.c code in places that are not related to this loop timeout. Also, nearly all the comments have been stripped out. The code is now broken.

There is no need to be making these types of changes, particularly when 1) they are in places that will not affect the runtime i/o performance, and 2) breaks things. The latest code I saw at 2020-01-12 13:00 CST commit ( 80cf430 ) on your github repository was not stable enough to run performance numbers. After calling Wire.setWireTimeout() to set the default timeout, every read and write times out. If you call Wire.setWireTimeout(0) after that to disable the timeouts, the code locks up on the next attempt to do a write

The twi.c code also has several warnings in it and I believe that one is indicating a calculation error due to a math/expression precedence order.

ermtl commented 3 years ago

@bperrybap Even if you have no use for a timeout faster than 500ms or 1 second, it's still useful for others. Most sensors respond almost immediately, and a few hundred microsecond delay means it will timeout. Knowing about it ASAP can be important (to stop a motor, shut off a pneumatic valve, etc ...) and one second later could be too late. Also, some applications require compatibility with the SMBus specification (25msec). However, as you mention, a 1us granularity is not really useful. If a bigger granularity is used, the stored timeout value could be 16 bits, that's much faster to test and uses 2 RAM bytes instead of 4. With the fast test, the need for a separate boolean value also goes away, freeing another RAM byte. As an example, if the granularity is 64uS, a 16 bit value would allow a timeout between 64us and 4.19 seconds + unlimited (no timeout) if the value is 0. The changes proposed by @greyltc and myself practically remove the slowdown for fast responding devices without requiring extensive rewrite of the existing code. Adding this timeout feature have been so laborious and took so long, doing it all over again and changing the API again instead of fixing the speed issue does not seem being the best way forward.

bperrybap commented 3 years ago

For me personally, none of this really matters as I don't have anything that requires and demands maximum i/o throughput or the ability to set very short state change timeouts.

I do think of utmost critical importance is that there needs to be some kind of macro that user code can test at compile time so the user code call tell if this new loop abort/timeout capability exists, especially if it is not enabled by default which means the user code do something to enable the timeouts. The WIRE library code in 1.8.13 failed to do this.

Whenever you guys have something that works, I'll be happy to try it.

In terms of performance, as of right now, from what I've seen, the approach that @Aleev2007 has gone down or something similar to it (i.e. not using micros()), can reduce state change latency and/or state change loop timeouts, especially when loop timeouts are enabled, more than an approach that uses micros() It also demonstrates than it is still possible to be compatible with the current 1.8.13 setWireTimeout() API without using micros() which is expensive due to all the 32 bit values and runtime math involved inside the micros() function.

Aleev2007 commented 3 years ago

@bperrybap

I probably got tired yesterday, from many attempts, to force the compiler to make code in which I can calculate the delay in the loop. And that's why it broke the logic of twi .c work. Sorry. :)

I decided to go over from the beginning and have now made minimal changes to the old version 1.8.12 of the library. I don’t have the test equipment right now, but everything is going without errors. https://github.com/Aleev2007/twi If it's not difficult for you, I would like to know how much slower it has become in comparison with the unmodified version 1.8.12. If it's acceptable. So I have ideas on how to make it compatible with 1.8.13 and fast.

And I haven't looked at ISR yet, I also have ideas about it. Last time I was mistaken in saying that the TWI module spends 40 cycles to transfer one byte. I forgot that the connection is serial and needs to be multiplied by 9. So it is quite possible to make everything fly even at 400 Hz.

bperrybap commented 3 years ago

@Aleev2007 The code is MUCH slower. (obviously broken) 1.8.12 is 77us writes, 118us reads this code is 189,279us writes and 126,041us reads After doing reads, writes change to 63,028us Obviously the code is not working as intended.


There is too much keyboard hacking going on for this project by multiple parties. I believe in a traditional methodical design & development process. I'm done testing code that isn't handled that way.

Wuntenn commented 3 years ago

I had the same error, I'm a newb too and haven't used cpp for a while.

I'm the kinda newb that hits every bug and had a similar error when I used interrupts a few weeks ago. I found out somewhere that if you place a lot of code in the vector code blocks that it can cause the code to hang.

I don't remember where I came across the advice but the recommended approach was to set a flag only in the vector and to use the loop to poll for the flag and carry out the necessary changes.

Reading through the datasheet for the UNO, it says that the TWI is an interrupt based system. I used this approach and it overcame my issue. Thought it might be useful for someone here! 🤷‍♂️

ghost commented 3 years ago

How about int16 for the timeout, default +25ms (and enabled, use millis()), use < 0 for advanced use i.e. micros() resolution? The implementation has skewed a bit away from being 'Arduino', more niche. Keep it simple and consistent for the beginners.

mattborja commented 2 years ago

I just need a way to skip/abandon a line of code hanging after a short period of time...

I honestly don't have this fully spec-ed out yet, but maybe we could provide the community with an idempotent interrupt implementation pattern to externally manage a hanging Wire.endTransmission(). It would involve the use of a LM555 timer to trigger up to 2 interrupts after a short period of time on startup.

The execution surrounding the interrupt(s) may include:


EDIT: The following also seems to work using https://github.com/rambo/I2C

#define DEV_ADDR 0x50
#define DEV_TIMEOUT_MS 3000

void WireCheck() {
  Serial.print("Checking wire status...");

  I2c.timeOut(DEV_TIMEOUT_MS);

  I2c._start();
  I2c._sendAddress(DEV_ADDR);
  byte wireStatus = I2c._stop();

  switch (wireStatus) {
    case 0:
      Serial.println("WIRE_OK");
      return;

    case 1:
      Serial.println("WIRE_TIMEOUT");
      return;

    case 0xff:
      Serial.println("WIRE_ERR");
      return;
  }
}

Might have a bug in here, but it is at least timing something out...

matthijskooijman commented 2 years ago

@mattborja, this timeout feature has actually been implemented (two years ago) in arduino/Arduino#107, but it is not enabled by default. Did you try turning it on using https://www.arduino.cc/reference/en/language/functions/communication/wire/setwiretimeout/ ?

Since we seem to have forgotten closing this issue, I'll do that now. For future reference, arduino/Arduino#107 implemented the feature, arduino/Arduino#362 is still pending to add feature detection macros like WIRE_HAS_TIMEOUT, and arduino/Arduino#363 is a draft to enable timeouts by default in the future.

mattborja commented 2 years ago

@matthijskooijman I hadn't come across that yet, but will give it a shot. I do prefer native libraries whenever possible.

Thanks!

mattborja commented 2 years ago

@matthijskooijman Yeah, that seems to be working great.

Aside: Extra "s" in heading Parameters here: https://www.arduino.cc/reference/en/language/functions/communication/wire/endtransmission/#_parameterss

Thanks again!

per1234 commented 2 years ago

Extra "s" in heading Parameters here:

Thanks for reporting that @mattborja. I see the "syntaxx" section heading suffers from the same error.

Please submit a correction via a pull request here: https://github.com/arduino/reference-en/blob/master/Language/Functions/Communication/Wire/endTransmission.adoc

mattborja commented 2 years ago

@per1234 https://github.com/arduino/reference-en/pull/874