Closed david314 closed 3 years ago
Can you please change the third parameter from false
to true
:
https://github.com/rogerclarkmelbourne/Arduino_STM32/blob/7a7659d1116c11422a3025713cd5c208da991c86/STM32F1/libraries/Wire/utility/WireBase.h#L107
Then check again.
I can reproduce the problem, there is no valid stop issued on I2C bus. I need more time to find the root cause, I need to check the whole code to realize what is going on there. Meanwhile if anyone else wants to do it more effectively, feel free to do it.
Just to be sure, is this with using the Wire
library or the WireSlave
library or both? Of the two, WireSlave
is the primary one I was working with and is hardware I2C only. There's some shared code that the Wire
library uses from it for the hardware I2C parts, with the key difference being that WireSlave
supports both master and slave modes, and Wire
only supports master mode.
If you guys were testing with the Wire
library, I'm curious if you are seeing the same issue if you switch to the WireSlave
library?
Also, in the waveforms presented, they look to be the same with the exception of the exact bit-timings. One is clearly soft-I2C based, since the waveform toggles right at the middle of the clock cycle. And the other is clearly hard-I2C based, as the data toggles are happening almost concurrently with the clock edge. But otherwise, I don't see a substantial difference between them -- what exactly about the waveform are you saying is wrong??
If it's just hanging, most times that is due to missing (or wrong) pull-up resistors on those pins. What value pull-up resistors are you guys using??
Donna, you have adapted the i2c module to be compatible with your Wire slave lib, see https://github.com/rogerclarkmelbourne/Arduino_STM32/commit/d52422c1a90a65eeaecd039c001abc2782e3fa0c
Please pay attention to the missing stop condition after the red "1". With software I2C you can see a clean stop condition, while the HW interface fails to generate it correctly, the last blue rising edge should be delayed from the last red rising edge. Is it maybe due to a race condition somehow?
Yes, I know I added support for slave I2C, but slave mode is only supported in the WireSlave
library. Here, I think you are using master-mode and both Wire
and WireSlave
support master. So which library are you using? I ask, because I do port scanning for device presence on the WireSlave
library all the time and haven't had any issues with it hanging.
So if you were using Wire
can use you try WireSlave
and see if it behaves the same?
The two examples I posted earlier use both the Wire.h
library. I just tested the example i2c_scanner_wire.ino
with the Wire_slave.h
library. It features even more strange artifacts, I see a repeated start condition when the slave doesn't answer:
When it detects the first device at address 0x28 the I2C module makes a long pause after transmitting the address and announcing a write access:
The second device on the bus is found at 0x76 as expected with the following sequence:
This has nothing to do with wrong pull-ups, there is a mayor problem in the code.
I tested Wire.h. But as far as I can see, Wire_slave.h has the same problem, on the place where the stop condition should be the sda and scl line rising edges overlap.
Yes, the rising flank just before the second START is a "broken" STOP, it can bee seen in all above waveforms.
I don't mind digging into this deeper, as it needs to be resolved, but it might be a little while. With all of the rain and flooding this week, my lab is currently under water or more specifically I'm in the long and expensive process of cleaning it and assessing the damage.
In order to test the hardware, I modified the i2c_transfer function to a simplified version by polling the flags instead of using interrupts:
int32 i2c_master_xfer(i2c_dev *dev, i2c_msg *msg, uint32 timeout)
{
if (msg == NULL) return 0;
dev->error_flags = 0;
dev->msg = msg;
dev->msg->xferred = 0;
dev->timestamp = millis;
dev->state = I2C_STATE_BUSY;
dev->regs->SR1 = 0; // Reset error/status flags
dev->regs->CR1 = (I2C_CR1_PE | I2C_CR1_START);
__IO uint32_t sr1;
// wait for end of START
do {
sr1 = dev->regs->SR1; // read status register
} while ( !(sr1 & I2C_SR1_SB) ); // start bit
// send address
i2c_send_slave_addr(dev, dev->msg->addr, 0);
// wait for end of ADDR
do {
sr1 = dev->regs->SR1; // read status register
} while ( !(sr1 & (I2C_SR1_ADDR|I2C_SR1_AF)) ); // address ACK or NACK
// generate STOP
dev->regs->CR1 |= I2C_CR1_STOP;
if (sr1 & I2C_SR1_AF)
return 1; // NACK
else
return 0; // ACK
}
With this, I get correct STOP condition on the bus, and the scanner also recognizes the attached device.
So it seems to me that the the wire (slave) lib somehow does not handle correctly the NACK condition using interrupts. Next I will try to use interrupts for master mode only.
@stevstrong -- I updated PR #804 to move the location of the STOP
handling in master mode so that the CR1 register can get polled for completion and eliminate the race-condition between the irq handlers.
@dewhisna , I will continue to write here because this is the issue we are trying to solve.
While trying to get the same behaviour with IRQs, I realized the problem is not in transfer function, but the layer above. If I change my simple version (see above) to return I2C_ERROR_PROTOCOL instead of "1" then the STOP is damaged again.
do {
sr1 = dev->regs->SR1; // read status register
} while ( !(sr1 & (I2C_SR1_ADDR|I2C_SR1_AF)) ); // ACK or NACK
dev->regs->CR1 = I2C_CR1_PE | I2C_CR1_STOP;
if (sr1 & I2C_SR1_AF)
return 1; // when changing to I2C_ERROR_PROTOCOL the STOP will be destroyed
else
return 0;
This lead me to check what happens in the process()
function in Wire.cpp
and I see there
https://github.com/rogerclarkmelbourne/Arduino_STM32/blob/179c0261cdd2414645104f8d57314e67d699451d/STM32F1/libraries/Wire/Wire.cpp#L43-L54
If you look to lines 52-53 you see that the I2C will be disabled and re-enabled again. This is what causes the STOP to fail.
If I comment out those lines the STOP will be generated correctly.
I did not try your suggestion yet.
Then I changed the process() function to this:
if (res == I2C_ERROR_PROTOCOL) {
if (sel_hard->error_flags & I2C_SR1_AF) { /* NACK */
res = (sel_hard->error_flags & I2C_SR1_ADDR ? ENACKADDR :
ENACKTRNS);
} else {
if (sel_hard->error_flags & I2C_SR1_OVR) { /* Over/Underrun */
res = EDATA;
} else { /* Bus or Arbitration error */
res = EOTHER;
}
i2c_disable(sel_hard);
i2c_master_enable(sel_hard, dev_flags, frequency);
}
}
and it started to work correctly.
So basically we should differentiate between NACK and other type or errors when considering to disable-enable the interface. Also, this should be brought to the same level where the STOP s generated in order to avoid parallel error handling causing similar issues.
@david314 , please test the new code and let me know if it fixes the issue.
@stevstrong -- the problem with your fix is that the reset is needed in some cases to get the bus back in a nominal state. Please try my updated #804 because it shouldn't be returning now from the middle code layer until after the STOP has finished being transmitted -- it's got a while
loop now that won't let it. Prior, it had issued the STOP but there was a race-condition present.
My fix still performs reset in case of HW error, but in case of NACK, which is not considerd a HW error. I will try your fix later.
Yes, I understand both that your change still does a reset on HW error and that NACK is technically not an error, but I've seen cases where a reset was still needed in the presence of certain misbehaving I2C devices. I don't remember all of the specifics and would have to dig through lots of old code and docs to find the details, but I think that's why it was that way in the code to begin with. And unfortunately I'm in the process of switching my main development PC here to a new machine (no thanks to Nvidia and their horrible nonsense with buggy drivers and lack of open-source support for the Linux community), so I'm somewhat limited in what I can do until that data migration process is complete.
Though I see you've already pushed your test code to master
, which you really shouldn't have done yet. You should have created another PR for it for others to test and comment on, especially since it was an issue others were working on testing and resolving, because now undoing it will either unnecessarily create commit clutter or require a force push that's going to make lots of people using the repo very unhappy. But, it's not my repo ... so...
I pushed the change because I have tested it and am convinced that it resolves this issue, namely the STOP generation for NACK. OK, I admit, I could have made first a PR for this, but this way was quicker. The scanner seems to be used very often so with the push I will probably avoid some further "scanner does not work" posts or bug reports.
Anyway, I am open to make further changes if needed to handle special error cases (different issue) if you or anyone else bring prove that is necessary.
I have been experiencing some strange behavior with my Arduino Maple design lately, to be more precise with the I2C bus. To make the issue repeatable I checked the two examples _i2c_scannerwire and _i2c_scannersoftwire. The latter works and detects the two devices connected to the bus while the other example hangs after the first access to the bus. It appears something happens during the issuing of the stop condition because both SDA and SCL are released almost simultaneously.
The following diagram shows the waves for the first access of the working example _i2c_scannersoftwire:
The stop condition is issued correctly and the program continues searching for devices on the bus. The buggy example _i2c_scannerwire shows an erroneous behavior with a wrong stop condition:
While the working example runs the loop, the buggy example hangs after the first bus access shown here. The different behavior of the rising clock edge on the first diagram is because of the higher frequency, both codes ran on the same system. Has anyone encountered the same problem?