Closed KjartanOli closed 1 month ago
Hi, and sorry you're having trouble with this. I have to dash away so can't reply properly right now but one thing to get you moving: you don't need to call uPortI2cOpen()
; uDeviceOpen()
does all of that for you based on the gDeviceCfg
you have passed to it. uPortI2cOpen()
is likely returning -5 because that I2C port is already open. That shouldn't cause the problem you are seeing, will see if I can figure out what's happening with that later.
Looking at the code of
uGnssPosGet()
it appears to me to only returnU_ERROR_COMMON_NOT_INITIALISED
whengUGnssPrivateMutex == NULL
Indeed: when the error occurs does it persist, i.e. is the only route to recovery through resetting the MCU? Could you share the code where you call uGnssPosGet()
in a loop?
Yes the error persists until the MCU is rebooted. The loop looks like this:
for(std::size_t i = 0;; ++i) {
ESP_LOGI(TAG.data(), "Acquiring position");
radio.transmit(reinterpret_cast<std::uint8_t*>(Frame(id,
base,
Protocol::reserved,
"Acquiring position").serialise().data()),
Frame::serialised_size);
auto clock = stopwatch();
// auto pos = position{};
// auto status = ublox_get_pos(&pos.latitude, &pos.longitude, &pos.altitude, &pos.radius);
auto pos = ublox::get_position();
auto elapsed = clock.elapsed();
if (pos)
{
auto buffer = std::array<char, Frame::payload_size>{};
auto result = std::format_to_n(buffer.begin(), buffer.size(),
"Round {}:\nI am here: {:.3},{:.3}\nTime {}\n",
i,
((double) pos->latitude) / 10000000,
((double) pos->longitude) / 10000000,
elapsed
);
auto frame = Frame(id,
base,
Protocol::reserved,
std::span<const std::byte>(reinterpret_cast<const std::byte*>(buffer.data()),
result.size
)
);
std::format_to_n(buffer.begin(), buffer.size(),
"src: {}\ndest: {}\npayload: {}",
frame.source(),
frame.destination(),
std::string_view(reinterpret_cast<const char*>(frame.payload().data()), frame.payload().size())
);
ESP_LOGI(TAG.data(), "%s", buffer.data());
state = radio.transmit(reinterpret_cast<std::uint8_t*>(frame.serialise().data()), Frame::serialised_size);
if(state == RADIOLIB_ERR_NONE) {
// the packet was successfully transmitted
ESP_LOGI(TAG.data(), "success!");
} else {
ESP_LOGI(TAG.data(), "failed, code %d\n", state);
}
}
else
{
char buffer[200];
sprintf(buffer, "failed to get position: %ld", pos.error());
ESP_LOGI(TAG.data(), "%s", buffer);
state = radio.transmit(reinterpret_cast<std::uint8_t*>(Frame(id,
base,
Protocol::reserved,
buffer).serialise().data()),
Frame::serialised_size);
if(state == RADIOLIB_ERR_NONE) {
// the packet was successfully transmitted
ESP_LOGI(TAG.data(), "success!");
} else {
ESP_LOGI(TAG.data(), "failed, code %d\n", state);
}
fflush(stdout);
}
ESP_LOGI(TAG.data(), "sleeping");
std::this_thread::sleep_for(5s);
}
with the call to uGnssPosGet inside ublox::get_pos
:
namespace ublox
{
struct position
{
std::int32_t latitude;
std::int32_t longitude;
std::int32_t altitude;
std::int32_t radius;
}
auto get_position() -> std::expected<position, status_type>
{
auto pos = position{};
auto status = uGnssPosGet(gnssHandle, &pos.latitude, &pos.longitude, &pos.altitude, &pos.radius, NULL, NULL, NULL, NULL);
if (status != 0)
return std::unexpected{status};
return pos;
}
};
Everything after the call to ubxlib::get_pos
is concerned only with transmitting and logging the location so it shouldn't be relevant
for(std::size_t i = 0;; ++i) {
ESP_LOGI(TAG.data(), "Acquiring position");
auto pos = ublox::get_position();
if (pos)
{
auto buffer = std::array<char, Frame::payload_size>{};
auto result = std::format_to_n(buffer.begin(), buffer.size(),
"Round {}:\nI am here: {:.3},{:.3}\nTime {}\n",
i,
((double) pos->latitude) / 10000000,
((double) pos->longitude) / 10000000,
elapsed
);
ESP_LOGI(TAG.data(), "%s", buffer.data());
}
else
{
char buffer[200];
sprintf(buffer, "failed to get position: %ld", pos.error());
ESP_LOGI(TAG.data(), "%s", buffer);
}
ESP_LOGI(TAG.data(), "sleeping");
std::this_thread::sleep_for(5s);
}
This should be equivalent for this problem.
All seems straightforward enough. One thing you might try, if you haven't already, is leaving just the initialisation and the call to get_position()
in place but not calling any of the code that processes the results, see if the problem remains, just to "divide and conquer".
I've reduced the loop to:
for(std::size_t i = 0;; ++i) {
ESP_LOGI(TAG.data(), "Acquiring position");
auto pos = ublox::get_position();
if (!pos)
{
ESP_LOGI(TAG.data(), "failed to get position: %ld", pos.error());
}
else
{
ESP_LOGI(TAG.data(), "Got position");
}
ESP_LOGI(TAG.data(), "sleeping");
std::this_thread::sleep_for(5s);
}
I then did the following:
If you are using i2c, and just interrupt the transfer with unplugging power make the i2c bus is released when you restart next time, this can be done by resetting the gnss, or forcing few sck cycles at start.
Maybe you can measure with a logic analyser or read with goios the state of the i2c lines in the locked-up case.
@KjartanOli: your description is clear but, understanding why it does what it does, less so. There should be no difference between an MCU reset and resetting the MCU, IYSWIM. Zero-initialised, or statically initialized, memory should be set up in both of those scenarios, whether the MCU was reset via a reset pin or the MCU was started afresh from being powered off and on again.
The code does the same stuff either way.
The failure you see, which suggests that gUGnssPrivateMutex
is NULL
, would only be systematically introduced if GNSS were deinitialized, which there is no evidence of. That leaves us with something writing zeroes to memory in that area, which is equally unlikely.
Hmph.
One more question for my clarification: when the issue appears, how many times around the loop does it take for it to appear?
Something you might try: look in the .map
file for where the gUGnssPrivateMutex
is being placed in memory and see what is either side of it. Find those variables in the code and hack some checks in which print out their values, to see if they remain as expected while gUGnssPrivateMutex
becomes NULL
. That, or add some guard variables before and after gUGnssPrivateMutex
in the code, verify in the .map
file that they really are being placed either side of it in memory, then initialise those variables to known values and extern
them from your application so that you can check them.
As far as I can tell the issue appears immediately. The code either works every time or not at all.
Here is the section around gUGnssPrivateMutex
in the .map
file if it tells you anything. I'll try hacking something with the variables around it.
.bss.gDeviceI2c
0x3ffb2b44 0x50 esp-idf/ubxlib/libubxlib.a(u_device_private.c.obj)
.bss.gI2c 0x3ffb2b94 0x30 esp-idf/ubxlib/libubxlib.a(u_device_private.c.obj)
.bss.gHeapAllocCount
0x3ffb2bc4 0x4 esp-idf/ubxlib/libubxlib.a(u_port_heap.c.obj)
.bss.gUGnssPrivateMutex
0x3ffb2bc8 0x4 esp-idf/ubxlib/libubxlib.a(u_gnss_private.c.obj)
0x3ffb2bc8 gUGnssPrivateMutex
.bss.gpUGnssPrivateInstanceList
0x3ffb2bcc 0x4 esp-idf/ubxlib/libubxlib.a(u_gnss_private.c.obj)
0x3ffb2bcc gpUGnssPrivateInstanceList
.bss.gStdoutCounter
0x3ffb2bd0 0x4 esp-idf/ubxlib/libubxlib.a(u_port_debug.c.obj)
0x3ffb2bd0 gStdoutCounter
Not sure if this is a new development, or if I somehow just never noticed it before, but it looks like when the issue appears uDeviceOpen
returns -8
(U_ERROR_COMMON_PLATFORM
).
Ah, that would cause the error you are seeing and could relate to the issue that @mazgch suggested might be in play: if uDeviceOpen()
is unable to talk to the GNSS device at initialisation, e.g. because of an I2C bus problem, uGnssPosGet()
would indeed return U_ERROR_COMMON_NOT_INITIALISED
. A log of the debug output from ubxlib
would confirm this: logging should be on by default (compiled in if U_CFG_ENABLE_LOGGING is 1 and can be run-time switched off and on with uPortLogOff()/uPortLogOn(), will arrive on the normal ESP-IDF terminal output).
Do you have a spare MCU pin you could connect to the RESET_N
line of the GNSS chip and you could toggle low and then high again to get the I2C bus into the correct state at start of day?
if you detect a blocked I2C bus by reading in GPIO mode. either SDA or SCL is held low by a attached the peripherial, isc transfer returns a bus error.
One way to resolve a blocked I2C bus is to force a series of 10 clocked cycles in GPIO mode:
for (int i = 0; i < 10; i ++) { Set SCL input / resistor will pull SCL high Wait 50 us Set SCL output low Wait 50 us }
// Send Stop Set both SCL / SDA low Wait 100 us Set SCL / SDA input / resistor will pull SCL high
Restart normal i2c transations
Ah, that would cause the error you are seeing and could relate to the issue that @mazgch suggested might be in play: if
uDeviceOpen()
is unable to talk to the GNSS device at initialisation, e.g. because of an I2C bus problem,uGnssPosGet()
would indeed returnU_ERROR_COMMON_NOT_INITIALISED
. A log of the debug output fromubxlib
would confirm this: logging should be on by default (compiled in if U_CFG_ENABLE_LOGGING is 1 and can be run-time switched off and on with uPortLogOff()/uPortLogOn(), will arrive on the normal ESP-IDF terminal output).
Here is what I believe is the relevant part of the log.
I (397) main_task: Started on CPU0
I (407) main_task: Calling app_main()
I (407) ublox.cpp: Port initialised: 0
I (407) ublox.cpp: I2C Port initialised: 0
I (407) ublox.cpp: Device initialised: 0
U_GNSS: initialising with ENABLE_POWER pin not connected, transport type I2C.
U_GNSS: sent command b5 62 0a 06 00 00 10 3a.
I (10607) ublox.cpp: Device opened: -8
I (10607) ublox.cpp: GNSS driver initialised: 0
and the corresponding log when the code works:
I (397) main_task: Started on CPU0
I (407) main_task: Calling app_main()
I (407) ublox.cpp: Port initialised: 0
I (407) ublox.cpp: I2C Port initialised: 0
I (407) ublox.cpp: Device initialised: 0
U_GNSS: initialising with ENABLE_POWER pin not connected, transport type I2C.
U_GNSS: sent command b5 62 0a 06 00 00 10 3a.
U_GNSS: decoded UBX response 0x0a 0x06: 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [body 120 byte(s)].
U_GNSS: sent command b5 62 06 04 04 00 00 00 09 00 17 76.
U_GNSS: sent command b5 62 0a 06 00 00 10 3a.
U_GNSS: decoded UBX response 0x0a 0x06: 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [body 120 byte(s)].
I (2617) ublox.cpp: Device opened: 0
I (2627) ublox.cpp: GNSS driver initialised: 0
If I'm reading this correctly the device never responds to the initial command.
Do you have a spare MCU pin you could connect to the
RESET_N
line of the GNSS chip and you could toggle low and then high again to get the I2C bus into the correct state at start of day?
I do have spare MCP pins, I'll just need to connect one of them. I'm meeting with the more hardware focused people on my team tomorrow to look at this with a logic analyzer like @mazgch suggested. We'll probably connect the RESET_N line at the same time.
If I'm reading this correctly the device never responds to the initial command.
Indeed, that is a correct interpretation. Let us know how it goes after you have discussed with the HW-focused types.
I've just come back from meeting the HW people. We attached an oscilloscope to the SDA line and could see the initial command from the MCU on the wire, it received no response. However, we did some testing and found that introducing a 2 second delay (std::this_thread::sleep_for(2s)
) before calling ublox::init
fixes the issue. Could the ZED-F9P just be slow to start?
If ubxlib
is aware that the GNSS device needs to be powered-up before use (i.e. if the value of pinEnablePower
in the uDeviceCfgGnss_t structure is non-negative), rather than the GNSS device always being powered, then, after ubxlib
has set pinEnablePower
to the correct state, it will wait for U_GNSS_POWER_UP_TIME_SECONDS (i.e. 2 seconds) before it tries to talk to the GNSS device.
So yes, if the whole board, with the GNSS device on it is being powered up and the MCU on that board is able to talk to the GNSS device in less than 2 seconds from power-up it would be wise to add such a delay.
Connecting the RST pin and setting the appropriate value for pinEnablePower
sems to be working perfectly. Thank you for the help.
I'm using ESP-IDF 5.2 with a TTGO LORA32 board connected to a ublox ZED-F9P via I2C. I call the following init function at the start of
app_main
to initialise the GNSS module and then proceed to simply call uGnssPosGet in a loop. Sometimes it simply works, others it returns -2 (U_ERROR_COMMON_NOT_INITIALISED). When it fails pressing the reset button on the TTLooking at the code of uGnssPosGet it appears to me to only return U_ERROR_COMMON_NOT_INITIALISED whengUGnssPrivateMutex == NULL
GO usually makes it work again. This is is the part that really confuses me: if something is wrong with my code, why does rebooting the device make it work, if nothing is wrong with my code why does it only work some of the time?Just in case its related:
uPortI2cOpen
seems to return-5
, but that seems to happen regardless of whetheruGnssPosGet
is working or failing.