sparkfun / SparkFun_u-blox_GNSS_Arduino_Library

An Arduino library which allows you to communicate seamlessly with the full range of u-blox GNSS modules
Other
228 stars 100 forks source link

Lib not working with ESP32 by using core 2.0.1 #77

Closed FStefanni closed 2 years ago

FStefanni commented 3 years ago

Subject of the issue

Hi,

by using the same code, the same lib version, the same hw, etc, but upgrading the ESP32 Arduino core library, from version 2.0.0 to version 2.0.1, it stops to work.

Initially I supposed it was a bug of the ESP32 core, but the core maintainers say it is an issue related on how many libs are using the I2C. So it seems to be an issue of this GNSS library.

The full report I have done and the discussion ongoing is reported here: https://github.com/espressif/arduino-esp32/issues/5875

Please give it a check.

Thank you, regards.

Your workbench

Steps to reproduce

Just use the Example3_GetPosition sketch.

Expected behavior

Should work.

Actual behavior

Prints the following:

u-blox GNSS not detected at default I2C address. Please check wiring. Freezing.
PaulZC commented 3 years ago

Hi Francesco (@FStefanni ),

Many thanks for making us aware of this.

Reading between the lines, and without having tested this on actual hardware, it looks like something fundamental has changed in the way the core supports I2C restarts. Or, more precisely, where a requestFrom follows a restart.

There are several places in the library where we use restarts ( endTransmission(false) ). E.g.:

https://github.com/sparkfun/SparkFun_u-blox_GNSS_Arduino_Library/blob/3fd4d2a109f663d4de254fd2cb2d1f8d489e8bfe/src/SparkFun_u-blox_GNSS_Arduino_Library.cpp#L716-L718

There were hints of this with an earlier version of the core. We had to add a parameter to pushRawData for ESP32 users so they could instruct that function to use a stop instead of a restart:

https://github.com/sparkfun/SparkFun_u-blox_GNSS_Arduino_Library/blob/3fd4d2a109f663d4de254fd2cb2d1f8d489e8bfe/src/SparkFun_u-blox_GNSS_Arduino_Library.cpp#L3713-L3715

https://github.com/sparkfun/SparkFun_u-blox_GNSS_Arduino_Library/blob/3fd4d2a109f663d4de254fd2cb2d1f8d489e8bfe/src/SparkFun_u-blox_GNSS_Arduino_Library.cpp#L3746

Ah, now then. I wonder if it is linked to the use of the stop parameter in requestFrom ? I see that in line 718 we use a restart, but in the requestFrom which follows, we don't use the stop parameter:

https://github.com/sparkfun/SparkFun_u-blox_GNSS_Arduino_Library/blob/3fd4d2a109f663d4de254fd2cb2d1f8d489e8bfe/src/SparkFun_u-blox_GNSS_Arduino_Library.cpp#L729

and again line 848:

https://github.com/sparkfun/SparkFun_u-blox_GNSS_Arduino_Library/blob/3fd4d2a109f663d4de254fd2cb2d1f8d489e8bfe/src/SparkFun_u-blox_GNSS_Arduino_Library.cpp#L848

Very interesting...

OK. So there are two things we need to test / try:

Thanks!

Best wishes, Paul

PaulZC commented 3 years ago

OK. It looks like this is actually to do with I2C clock stretching, not I2C restarts.

I was seeing some very weird behavior with our u-blox library and v2.0.1 of the ESP32 core. In sendI2cCommand , the three beginTransmission + endTransmission code segments which use an I2C restart ("Do not release bus") were generating no I2C bus traffic. Only the final beginTransmission + endTransmission code segment which does not use an I2C restart ("All done transmitting bytes. Release bus.") was generating I2C bus traffic.

I tried making several changes to the restarts - including getting rid of them completely - and could not get the code to work.

But we know the u-blox module uses clock stretching - and I could see that on the analyzer. So, to rule that out as a possible cause, I tested a much simpler set-up using the SparkFun Qwiic Button. The Qwiic Button is based on an ATtiny - and that uses I2C clock stretching too.

I'm running Example1 from the Qwiic Button Library on a SparkFun ESP32 Thing Plus. The Qwiic Button is connected via a Qwiic cable. I'm monitoring the I2C bus traffic with a simple logic analyzer and PulseView.

The Library does not use restarts, just nice simple beginTransmission + endTransmission() followed by requestFrom :

https://github.com/sparkfun/SparkFun_Qwiic_Button_Arduino_Library/blob/15b5aa8c1a96940d89d2db9487cdaacab4e364ca/src/SparkFun_Qwiic_Button.cpp#L291-L297

With v2.0.0 of the core, the example works as it should and here is what I see on the I2C bus. Note - lots of clock stretching on the D0 trace:

image

If I go to v2.0.1 of the core, the example fails. Note the very long strange interval where the D0 clock signal is high, prior to the actual stop (P):

image

So, while I agree that the I2C code in the u-blox library does need a refresh, it looks to me like clock stretching is the key issue here.

Best wishes, Paul

FStefanni commented 3 years ago

Hi Paul,

thank you for your effort in analyzing the issue. I am not an expert in this field, so I do not fully understand what is truly happening... sorry. I'll wait to see how the misbehavior will be fixed and where (the core? the lib? both?).

Regards

me-no-dev commented 3 years ago

@PaulZC is there any chance that you test using ESP32-S2 board?

PaulZC commented 3 years ago

Hi @me-no-dev ,

I have an ESP32-S2 Thing Plus arriving tomorrow. I'll test both this u-blox library and the Qwiic Button as soon it arrives.

It looks like there are two separate issues. This u-blox library uses I2C restarts (repeated starts) in several places and, right now, those are broken with v2.0.1. Then there appears to be a separate issue with devices which clock-stretch, like the Qwiic Button (and the u-blox modules).

When we're reading data from the u-blox module, we want to guarantee that the operation is not interrupted. We do not want to release the bus part way through the transfer.

We want to be able to do this:

beginTransmission(address)
write(data1)
endTransmission(false)

beginTransmission(address)
write(data2)
endTransmission(false)

beginTransmission(address)
write(data3)
endTransmission(true)

but with v2.0.1 of the core, only data3 is actually transmitted on the bus. data1 and data2 are 'lost'.

We also want to be able to do this:

beginTransmission(address)
write(data1)
endTransmission(false)

requestFrom(address, length, false)

beginTransmission(address)
write(data2)
endTransmission(false)

requestFrom(address, length, false)
requestFrom(address, length, false)
requestFrom(address, length, true)

But I am going to keep looking at whether the restarts are truly essential. Right now I believe they are - and we've been using them that way for a long time on many platforms.

Best wishes, Paul

me-no-dev commented 3 years ago

Why do you need to do repeated writes? Could you not instead do:

beginTransmission(address)
write(data1)
write(data2)
write(data3)
endTransmission(true)

And

beginTransmission(address)
write(data1)
endTransmission(false)
requestFrom(address, length, true)

beginTransmission(address)
write(data2)
endTransmission(false)
requestFrom(address, length*3, true)

From I2C point, this is the same thing. TBH I've never seen such restarts before anywhere. If restart is necessary, then it is always followed by requestFrom

PaulZC commented 3 years ago

Hi @me-no-dev ,

Sorry. My pseudo-code was a bit too simplistic.

With the u-blox modules, we are often reading and writing kBytes of data over I2C. We have to split the reads and writes up into 'chunks'. The chunk size has to match the size of the I2C buffer on whatever platform the code is being run on. Ideally, we want to do those multiple reads - and writes - without releasing the bus in between. I2C is a multi-controller / multi-target (multi-peripheral) bus.

Yes, OK, in most 'Arduino' systems, there would only ever be a single controller. But multi-controllers are supported in I2C. If you are performing an operation like a large data transfer, you need to prevent another controller from obtaining control of the bus in between your reads/writes. And you do that using the restart.

If you need some bedtime reading, please have a look at:

https://www.i2c-bus.org/specification/

https://www.nxp.com/docs/en/application-note/AN10216.pdf

I appreciate that this is a headache for threaded OS code. You need a way that an individual beginTransmission + endTransmission(false) can be executed, leaving the bus in the restart state ready for the next transmission. Likewise you need to be able to do multiple requestFrom's, leaving the bus in the restart state in between. Yes, that creates all kinds of headaches for threading. But if you want to support I2C correctly, that's what you need to implement.

In this particular case, for a single controller, maybe we can avoid needing to use additional restarts? But, right now, I still believe we need them.

(I'm old enough to remember when I2C was first introduced by Phillips, back in the early 1980's. I used their chips in some of my earliest projects.)

Very best wishes, Paul

me-no-dev commented 3 years ago

Hi @PaulZC the thing with ESP I2C peripheral is that it executes only once that it has a stop queued. That means that even if we implement all those repeated start situations that this driver needs, the resulting size limitations will be the same. In our Arduino Wire implementation, we have ensured that the instance is thread locked between the first start and the last stop and the I2C bus only executes on sending stop everything queued at once (peripheral limitation)

PaulZC commented 3 years ago

Hi @me-no-dev ,

OK. I've restructured the I2C code in this library so it can work with these restrictions. When writing data to the u-blox module, it now breaks the data up into uniform, contiguous chunks which match the chosen i2cTransactionSize. Each chunk is sent as a single transmission. The code still allows the endTransmission to use a restart if desired, but for ARDUINO_ARCH_ESP32 it defaults to using a stop at the end of each transmission.

The good news is that it is almost working. The bad news is that I'm now seeing residual issues on ESP32 caused - I think - by clock stretching.

Here's what I see using a SparkFun ESP32 Thing Plus (ESP32-WROOM-32D):

When the code wants to check how much data is waiting in the u-blox module's buffer, it does a two byte read from register address 0xFD. The module's (default) I2C address is 0x42. A good read looks like this:

image

Note the restart (Sr) between the write and the read, and the stop (P) at the end.

If the buffer contains data, the read looks like this:

image

But I'm seeing lots of errors (bad reads) too. This looks to me like a clock-stretching issue.

image

On a SparkFun ESP32-S2 Thing Plus (ESP32-S2-WROOM), the new code seems to be running perfectly!

I'm going to leave the ESP32-S2 running for a few hours and I will report back if I see any errors.

I will try the Qwiic Button too. More on that in a moment.

Best wishes, Paul

me-no-dev commented 3 years ago

Hi @PaulZC :) Those are great news. I had suspicion that newer chips will handle stretching properly and the issue is only with ESP32. I know the stretching used to work with the custom I2C driver that we had in Arduino before, so I will need to investigate inside ESP-IDF's driver and add support for stretching there. I've ordered a few devices already that do stretching so I will get on this as soon as I can.

PaulZC commented 3 years ago

Hi @me-no-dev ,

Going back to the Qwiic Button example:

On the SparkFun ESP32 Thing Plus (ESP32-WROOM-32D), the begin fails with similar stretching issues:

image

On the SparkFun ESP32-S2 Thing Plus (ESP32-S2-WROOM), the example seems to work perfectly - even with clock stretching (in the middle of the read):

image

Best wishes, Paul

me-no-dev commented 3 years ago

@PaulZC I've found the clock stretching issue. Please test the linked pull request if you can: https://github.com/espressif/arduino-esp32/pull/5910

PaulZC commented 3 years ago

Hello @me-no-dev ,

Yes, confirmed, your code changes appear to have fixed the clock stretching issue completely.

Here is what I see with the SparkFun ESP32 Thing Plus and a u-blox ZED-F9P GNSS module:

image

And here is the SparkFun ESP32 Thing Plus with the SparkFun Qwiic Button:

image

Thank you!

I will leave this issue open until you have released your changes (just to make is easier for our library users to find the fix).

Have a great weekend, Paul

PaulZC commented 3 years ago

Hi Francesco (@FStefanni ),

The fix is in two parts:

1) Please upgrade your u-blox GNSS library to v2.0.18 (released yesterday)

2) Navigate to the esp32 core directory and replace two files:

esp32-hal-i2c_v2.0.1_bugfix.zip

Please leave this issue open for now. I will close it once the changes are formally released,

Very best wishes, Paul

me-no-dev commented 3 years ago

@PaulZC it's already in master :)

PaulZC commented 3 years ago

@me-no-dev - yes indeed - thank you. But most of our users use the Arduino Boards Manager...

FStefanni commented 3 years ago

Hi,

@PaulZC thank you for the fix. Ok I'll leave this issue open.

Regards.

PaulZC commented 2 years ago

Hi Francesco (@FStefanni ),

This issue has been corrected in v2.0.2 of the ESP32 core - released last week.

Closing...

Best wishes, Paul

FStefanni commented 2 years ago

Hi,

this is a good piece of news.

Thanks, regards.