u-blox / ubxlib

Portable C libraries which provide APIs to build applications with u-blox products and services. Delivered as add-on to existing microcontroller and RTOS SDKs.
Apache License 2.0

Issues Initializing MAX-M10S GNSS on ZephyrRTOS #196

Closed AarC10 closed 4 months ago

AarC10 commented 5 months ago

Hello.

Currently trying to integrate ubxlib into my software, which runs on ZephyrRTOS on an STM32 F446RE. I've worked through a few issues, but my teammate and I are stumped on getting uDeviceOpen to work. At this point, we have run into a U_ERROR_COMMON_PLATFORM (-8) error code when trying to use either I2C or UART. The -8 error occurs when trying to power on the GNSS and do the controlled GNSS hot start. Trying to do a GNSS hot start, which runs uGnssPrivateSendReceiveUbxMessage, yields a U_ERROR_COMMON_DEVICE_ERROR (-10). We can confirm the chip works, since we can probe the UART lines and see messages being output from the GPS chip regardless of the software being run. We can also confirm that we were able to communicate with the chip and get GPS data over I2C before moving to Zephyr and ubxlib, when we were using our own custom drivers and software. We were hoping to get some support in determining what the issue could be and whether our software needs any improvements. Here is some of our relevant code, debug logs and gdb showing our system state. Thank you for your time.

Calling Code:

int init_maxm10s(gnss_dev_t *dev) {
    int ret = uPortInit();
    if (ret != 0) {
        LOG_ERR("uPortInit() returned %d\n", ret);
        return ret;
    }

    ret = uPortI2cInit();
    if (ret != 0) {
        LOG_ERR("uPortI2cInit() returned %d\n", ret);
        return ret;
    }

    ret = uDeviceInit();
    if (ret != 0) {
        LOG_ERR("uDeviceInit() returned %d\n", ret);
        return ret;
    }

    ret = uDeviceOpen(NULL, &dev->gnssHandle);
    if (ret != 0) {
        LOG_ERR("uDeviceOpen() returned %d\n", ret);
        return ret;
    }

    return 0;

}

Device Tree:

/ {
    cfg-device-gnss {
        compatible = "u-blox,ubxlib-device-gnss";
        status = "okay";
        transport-type = "i2c1";
        // transport-type = "usart2";
        module-type = "U_GNSS_MODULE_TYPE_M10";
    };

    aliases {
        ubxlib-uart2 = &usart2;
    };
};

&usart2 {
    pinctrl-0 = <&usart2_tx_pa2 &usart2_rx_pa3>;
    pinctrl-names = "default";
    current-speed = <9600>;
    status = "okay";
};

&i2c1 {
    pinctrl-0 = <&i2c1_scl_pb6 &i2c1_sda_pb7>;
    pinctrl-names = "default";
    status = "okay";
    clock-frequency = <I2C_BITRATE_FAST>;

    max0: maxm10s@42 {
        status = "okay";
        compatible = "u-blox,maxm10s";
        reg = <0x42>;
    };
};

Config:

# U-Blox
CONFIG_GNSS=y
CONFIG_GNSS_SATELLITES=y
CONFIG_GNSS_NMEA0183=y
CONFIG_UBXLIB=y
CONFIG_UBXLIB_GNSS=y

# Need the following for U-Blox as well
CONFIG_INIT_STACKS=y
CONFIG_THREAD_STACK_INFO=y
CONFIG_THREAD_NAME=y
CONFIG_KERNEL_MEM_POOL=y
CONFIG_HEAP_MEM_POOL_SIZE=20000
CONFIG_MINIMAL_LIBC_MALLOC=n

Screenshot_20240204_225315 Screenshot_20240204_230159 Screenshot_20240204_221819

RobMeades commented 5 months ago

Hi there, and sorry you're having trouble with this.

Your code and your configuration look good to me and, from the evidence of the U_PORT_BOARD_CFG prints, the configuration seems to be being adopted by this code. The failure cases for UART and I2C are different: in the UART case, this code believes it has sent b5 62 0a 06 00 00 10 3a and has timed out after waiting 10 seconds for the response, while in the I2C case the "sent command:" debug print is missing and the error is pretty much immediate.

That said, it looks as though your gdb screenshot is from the I2C case, which says that, even in that case, it is still getting all the way to trying to send the command, which implies that this line is likely returning U_ERROR_COMMON_DEVICE_ERROR, which can only happen if i2c_transfer() is returning an error.

As a next data point, at least for the I2C case, it is probably worth trying to find out what i2c_transfer() is returning.

Do you happen to be able to sniff the I2C or UART lines at all? Not necessary yet, but might turn out so. EDIT: I see you have already said that you can do this above. It would be interesting to know, in either the UART or I2C cases, if you can see the MCU sending the command to the GNSS device.

RobMeades commented 5 months ago

Looking at your device tree stuff, I notice you have:

   max0: maxm10s@42 {
        status = "okay";
        compatible = "u-blox,maxm10s";
        reg = <0x42>;
    };

...in the I2C entry, which I guess is for another driver inside Zephyr. Is it possible that driver is somehow still trying to do something, might somehow be active? Not that it should cause i2c_transfer() to return an error, of course, just thought I'd point it out.

RobMeades commented 5 months ago

Do you happen to be able to sniff the I2C or UART lines at all? Not necessary yet, but might turn out so.

I see you have already said that you can do this above. It would be interesting to know, in either the UART or I2C cases, if you can actually see the MCU sending the command b5 62 0a 06 00 00 10 3a to the GNSS device.

Naquino14 commented 5 months ago

Looking at your device tree stuff, I notice you have:

   max0: maxm10s@42 {
        status = "okay";
        compatible = "u-blox,maxm10s";
        reg = <0x42>;
    };

...in the I2C entry, which I guess is for another driver inside Zephyr. Is it possible that driver is somehow still trying to do something, might somehow be active? Not that it should cause i2c_transfer() to return an error, of course, just thought I'd point it out.

Aaron and I were working on this together. When I removed it (at the time I was also only calling uDeviceOpen in init_maxm10s per the requirements of /port/platform/zephyr/Readme.md) uDeviceOpen returned -2. It would be helpful to test again though, so I will pull Aaron's changes and delete that node in the tree and let you know what I find.

AarC10 commented 5 months ago

To follow up, we added some inits so the mutexes would get initialized so it stopped returning -2, so this no longer seems to be an issue. We also are not using any other outside driver, so I guess that the GPS definition inside i2c1 is unnecessary. We can use a Saleae to try probing UART to double check, but probing I2C will not be possible on this board. We could also try to get a breakout and write an overlay for a F446 nucleo and probe those lines and report back then if nothing else seems to be of concern.

RobMeades commented 5 months ago

Thanks: there really shouldn't be a need to probe HW lines for something this basic: there's just something misaligned somewhere between what ubxlib thinks is connected, what Zephyr thinks is connected, and what is actually connected.

For I2C, getting the return value of i2c_transfer() should tell us quite a lot I hope.

djfurie commented 4 months ago

Can you try adding i2c-address = <0x42>; to your device tree definition for the max0?

    gps {
        status = "okay";
        compatible = "u-blox,ubxlib-device-gnss";
        transport-type = "i2c2";
        i2c-already-open;
        i2c-address = <0x42>;
        module-type = "U_GNSS_MODULE_TYPE_M10";
    };

I believe that there may be a bug in the DT bindings that default the i2c address to 42 instead of 0x42
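To make the radix mix-up concrete: written without the 0x prefix, 42 is decimal 42, which is 0x2A, not 0x42 (decimal 66). A minimal sketch in plain C; the macro and helper names are made up for illustration and are not part of ubxlib or Zephyr:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 7-bit I2C address of the GNSS device, as given by reg = <0x42> above. */
#define GNSS_I2C_ADDRESS 0x42

static_assert(GNSS_I2C_ADDRESS <= 0x7f, "must fit in a 7-bit I2C address");

/* Hypothetical helper: true if the configured address is the one
 * the GNSS device actually answers on. */
static bool i2c_address_matches(uint16_t configured)
{
    return configured == GNSS_I2C_ADDRESS;
}
```

Here i2c_address_matches(42) is false and i2c_address_matches(0x42) is true. Addressing 0x2A when nothing on the bus answers there would be consistent with the NACKs mentioned below.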

Edit: fixing the address only got me a few lines further in the code (to line 2372). However, without specifying 0x42 I was seeing NACKs, now I'm seeing valid traffic on I2C. But it times out while polling for whatever it's looking for.

Edit2: Am I interpreting the packet right? Class = 0x0A ID = 0x06? I'm not seeing this message in the documentation. image

(gdb) where
#0  sendReceiveUbxMessage (pInstance=0x20008e48, messageClass=<optimized out>, messageId=<optimized out>, pMessageBody=<optimized out>, messageBodyLengthBytes=messageBodyLengthBytes@entry=0, pResponse=pResponse@entry=0x200052e0 <_k_thread_stack_mainThread_id+3424>)
    at /home/dan/projects/gps_project/firmware/deps/ubxlib/gnss/src/u_gnss_private.c:700
#1  0x0801b870 in uGnssPrivateSendReceiveUbxMessage (pInstance=pInstance@entry=0x20008e48, messageClass=messageClass@entry=10, messageId=messageId@entry=6, pMessageBody=pMessageBody@entry=0x0, messageBodyLengthBytes=messageBodyLengthBytes@entry=0, pResponseBody=<optimized out>, 
    pResponseBody@entry=0x20005310 <_k_thread_stack_mainThread_id+3472> "", maxResponseBodyLengthBytes=maxResponseBodyLengthBytes@entry=120) at /home/dan/projects/gps_project/firmware/deps/ubxlib/gnss/src/u_gnss_private.c:2555
#2  0x0801b9ea in uGnssPrivateSendOnlyCheckStreamUbxMessage (pInstance=pInstance@entry=0x20008e48, messageClass=messageClass@entry=6, messageId=messageId@entry=4, pMessageBody=pMessageBody@entry=0x200053c0 <_k_thread_stack_mainThread_id+3648> "", 
    messageBodyLengthBytes=messageBodyLengthBytes@entry=4) at /home/dan/projects/gps_project/firmware/deps/ubxlib/gnss/src/u_gnss_private.c:2358
#3  0x0800492a in uGnssPwrOn (gnssHandle=0x20008e08) at /home/dan/projects/gps_project/firmware/deps/ubxlib/gnss/src/u_gnss_pwr.c:578
#4  0x08004b16 in addDevice (gnssTransportHandle=gnssTransportHandle@entry=..., deviceTransportType=<optimized out>, pCfgGnss=pCfgGnss@entry=0x200054f0 <_k_thread_stack_mainThread_id+3952>, pDeviceHandle=pDeviceHandle@entry=0x200054e4 <_k_thread_stack_mainThread_id+3940>)
    at /home/dan/projects/gps_project/firmware/deps/ubxlib/common/device/src/u_device_private_gnss.c:181
#5  0x08004c3a in uDevicePrivateGnssAdd (pDevCfg=pDevCfg@entry=0x200054e8 <_k_thread_stack_mainThread_id+3944>, pDeviceHandle=pDeviceHandle@entry=0x200054e4 <_k_thread_stack_mainThread_id+3940>) at /home/dan/projects/gps_project/firmware/deps/ubxlib/common/device/src/u_device_private_gnss.c:303
#6  0x08002a6e in uDeviceOpen (pDeviceCfg=0x0, pDeviceHandle=pDeviceHandle@entry=0x20001ca0 <gps_device_handle>) at /home/dan/projects/gps_project/firmware/deps/ubxlib/common/device/src/u_device.c:289
#7  0x080070f2 in gps_init () at /home/dan/projects/gps_project/firmware/myprojectname/application/src/gps.c:39
#8  0x08006086 in mainThread () at /home/dan/projects/gps_project/firmware/myprojectname/application/src/main.c:58
#9  0x080078ce in z_thread_entry (entry=0x200008d4 <log_dynamic_app>, p1=0x0, p2=0x0, p3=0x0) at /home/dan/projects/gps_project/firmware/deps/zephyr/lib/os/thread_entry.c:48
#10 0xaaaaaaaa in ?? ()

So I'm very confident I'm looking at the same issue, and have confirmed that what is actually going out over the wire is wrong. I believe the intended message is a UBX-CFG-RST (0x06 0x04).
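For cross-checking a captured frame against the trace above (frame #1 shows messageClass=10, messageId=6, i.e. 0x0a 0x06), here is a minimal decoding sketch in plain C. It is not ubxlib code: ubx_header_t, ubx_checksum and ubx_decode are names made up for illustration. It assumes the usual UBX framing: sync bytes b5 62, class, ID, little-endian two-byte length, body, then a two-byte 8-bit Fletcher checksum over everything between the sync bytes and the checksum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of a decoded UBX frame header. */
typedef struct {
    uint8_t  msg_class;
    uint8_t  msg_id;
    uint16_t body_length;
} ubx_header_t;

/* 8-bit Fletcher checksum over class, ID, length and body. */
static void ubx_checksum(const uint8_t *data, size_t len,
                         uint8_t *ck_a, uint8_t *ck_b)
{
    uint8_t a = 0, b = 0;
    for (size_t i = 0; i < len; i++) {
        a = (uint8_t)(a + data[i]);
        b = (uint8_t)(b + a);
    }
    *ck_a = a;
    *ck_b = b;
}

/* Decode one complete UBX frame; returns false if the sync bytes,
 * the length field or the checksum do not match. */
static bool ubx_decode(const uint8_t *frame, size_t len, ubx_header_t *out)
{
    assert(frame != NULL && out != NULL);
    if (len < 8 || frame[0] != 0xb5 || frame[1] != 0x62) {
        return false;
    }
    uint16_t body_length = (uint16_t)(frame[4] | (frame[5] << 8));
    if (len != (size_t)body_length + 8) {
        return false;
    }
    uint8_t ck_a, ck_b;
    ubx_checksum(frame + 2, (size_t)body_length + 4, &ck_a, &ck_b);
    if (frame[len - 2] != ck_a || frame[len - 1] != ck_b) {
        return false;
    }
    out->msg_class = frame[2];
    out->msg_id = frame[3];
    out->body_length = body_length;
    return true;
}
```

Given the eight bytes b5 62 0a 06 00 00 10 3a, ubx_decode reports class 0x0a, ID 0x06 and an empty body with a matching checksum, so the frame on the wire is a well-formed poll of 0x0a 0x06 rather than a corrupted UBX-CFG-RST.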

djfurie commented 4 months ago

I see in the code that it should be a UBX-MON-MSGPP that is being sent.

That type doesn't appear in the M10 software guide. Appears that it's not supported? https://content.u-blox.com/sites/default/files/u-blox-M10-SPG-5.10_InterfaceDescription_UBX-21035062.pdf

It's a defined message for the M8 series: https://content.u-blox.com/sites/default/files/products/documents/u-blox8-M8_ReceiverDescrProtSpec_UBX-13003221.pdf

RobMeades commented 4 months ago

I believe that there may be a bug in the DT bindings that default the i2c address to 42 instead of 0x42

Woohoo! Well spotted: when I end-to-end tested this I did so on SPI (thinking that was the most complex case) not on I2C so that would have got past me.

On the opening message sequence, this is what it should look like; this is a log from talking to an M10, so it was certainly there in this FW version (FWVER=SPG 5.10, PROTVER=34.10):

U_GNSS: sent command b5 62 0a 06 00 00 10 3a.
U_GNSS: decoded UBX response 0x0a 0x06: 0a 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [body 120 byte(s)].
U_GNSS: sent command b5 62 06 04 04 00 00 00 09 00 17 76.
U_GNSS: sent command b5 62 0a 06 00 00 10 3a.
U_GNSS: decoded UBX response 0x0a 0x06: 0c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [body 120 byte(s)].

I will investigate...
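For anyone reading those two 120-byte responses: assuming the M10 uses the same UBX-MON-MSGPP body layout as the M8 protocol specification linked above (six per-port arrays of eight little-endian U2 message counters, followed by six U4 skipped-byte counters, 120 bytes in total), the first counter goes from 0x000a to 0x000c across the UBX-CFG-RST. A minimal sketch of pulling a counter out of the body; msgpp_count and the constants are made up for illustration and are not ubxlib names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSGPP_BODY_LENGTH 120 /* body size reported in the log above */
#define MSGPP_PORTS 6         /* assumed, from the M8 layout */
#define MSGPP_PROTOCOLS 8     /* assumed, from the M8 layout */

/* Read the little-endian U2 counter for a given port and protocol index.
 * Returns 0 if the indices fall outside the assumed 6 x 8 counter table. */
static uint16_t msgpp_count(const uint8_t *body, size_t port, size_t protocol)
{
    assert(body != NULL);
    if (port >= MSGPP_PORTS || protocol >= MSGPP_PROTOCOLS) {
        return 0;
    }
    size_t offset = (port * MSGPP_PROTOCOLS + protocol) * 2;
    return (uint16_t)(body[offset] | (body[offset + 1] << 8));
}
```

Applied to the log above, msgpp_count(body, 0, 0) gives 10 for the first response and 12 for the second. An increase like this between the two polls is what would show that the receiver parsed the messages sent in between.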

djfurie commented 4 months ago

Thanks for the message sequence. That will be helpful for debugging further, as I'm obviously a bit confused about what messages should be passing when =)

djfurie commented 4 months ago

Will revisit this tomorrow, but my last data point is that I don't seem to get any response after the UBX-MON-MSGPP is sent. I just observe reads of the 0xFD register, which returns 0xFF 0xFF, and then a lot of bytes read from the 0xFF register. This repeats for a while until things time out.

image

RobMeades commented 4 months ago

You're absolutely right that the MAX-M10 interface manual doesn't mention the UBX-MON-MSGPP message, which I hadn't noticed 'cos the MAX-M10 devices we have in the ubxlib test system, devices which report FWVER=SPG 5.10, PROTVER=34.10 (which are the versions the interface manual says it is for), respond to it.

EDIT: curiously the interface manual does include the configuration message to configure the rate of the UBX-MON-MSGPP message on I2C, SPI and UART (CFG-MSGOUT-UBX_MON_MSGPP_XXX).

Very odd. I will enquire with people who should know tomorrow.

RobMeades commented 4 months ago

One more thing: in your I2C trace, a read from register 0xFD produces the response 0xFF 0xFF.

If there were no data to be read, I would expect it to produce the response 0x00 0x00; see below a portion from an I2C trace I happened to take a few days ago with a MAX-M10S:

image

The response you are getting would appear to mean that there are at least 65535 bytes of data to be read, which might take a little while, possibly explaining the timeout?
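As a rough sanity check on that reasoning, here is a minimal sketch in plain C (not ubxlib code; both function names are made up for illustration). It assumes the two bytes read starting at register 0xFD are the high and low byte of the available-byte count, and that each byte read costs nine SCL cycles (eight data bits plus ACK), ignoring addressing and start/stop overhead:

```c
#include <assert.h>
#include <stdint.h>

/* Combine the two bytes read starting at register 0xFD (high byte first)
 * into the number of bytes the receiver reports as waiting. */
static uint16_t ddc_bytes_available(uint8_t high, uint8_t low)
{
    return (uint16_t)((high << 8) | low);
}

/* Rough lower bound, in milliseconds, on the time needed to clock out
 * byte_count bytes at the given SCL frequency. */
static uint32_t i2c_read_time_ms(uint32_t byte_count, uint32_t clock_hz)
{
    assert(clock_hz > 0);
    uint64_t bits = (uint64_t)byte_count * 9u;
    return (uint32_t)((bits * 1000u) / clock_hz);
}
```

With 0xFF 0xFF the count is 65535. At an assumed 100 kHz bus clock, draining that many bytes would take at least about 5.9 seconds, and about 1.5 seconds at 400 kHz, so at the slower clock it would eat a large part of a 10 second response timeout.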

I guess a fix for this would be to not do the UBX-MON-MSGPP either side of the UBX-CFG-RST, i.e. to call uGnssPrivateSendOnlyStreamUbxMessage() instead of uGnssPrivateSendOnlyCheckStreamUbxMessage() just here:

https://github.com/u-blox/ubxlib/blob/980e32707d60ab0af1116d18eef781056e52398f/gnss/src/u_gnss_pwr.c#L575-L578

That would make it more of a "best effort" reset but it would then work in situations where the GNSS chip might have been left on and accumulating messages for some time before we are attached to it.

RobMeades commented 4 months ago

A fix to apply the correct default I2C address is available in a preview branch here:

https://github.com/u-blox/ubxlib/tree/preview_fix_zephyr_dts_gnss_rmea

...should anyone need it. I will update this issue when the fix is pushed to the master branch and will delete the preview branch some time after that.

EDIT: fix now pushed to master here in commit fdc78b8d971dac3dc3304ef9b05039e809ce7371.

cturvey commented 4 months ago

UBX-MON-MSGPP is supported on SPG 5.00 and SPG 5.10

0x20910196 CFG-MSGOUT-UBX_MON_MSGPP_I2C
0x20910197 CFG-MSGOUT-UBX_MON_MSGPP_UART1
0x2091019A CFG-MSGOUT-UBX_MON_MSGPP_SPI

RobMeades commented 4 months ago

Thanks for confirming @cturvey, seems likely to be a documentation fault, I have raised this internally.

The fix for the default I2C address issue is now pushed to master here in commit fdc78b8d971dac3dc3304ef9b05039e809ce7371: I will wait to delete the preview branch until some time next week, just in case anyone has begun using it.

djfurie commented 4 months ago

In my situation, I'm asserting the hardware reset line just prior to initialization, so I wouldn't expect a full buffer:

int gps_init() {
    int rc;

    // Toggle the GPS reset pin
    gpio_pin_configure_dt(&gps_enable, GPIO_OUTPUT_ACTIVE);
    k_sleep(K_MSEC(10));
    gpio_pin_set_dt(&gps_enable, 0);
    k_sleep(K_MSEC(500));

    rc = uPortInit();
    if (rc != 0) {
        return rc;
    }

    rc = uPortI2cInit();
    if (rc != 0) {
        return rc;
    }

    rc = uDeviceInit();
    if (rc != 0) {
        return rc;
    }

    rc = uDeviceOpen(NULL, &gps_device_handle);
    if (rc != 0) {
        return rc;
    }

    rc = uGnssPosGetStreamedStart(gps_device_handle, U_GNSS_POS_STREAMED_PERIOD_DEFAULT_MS, callback);

    return rc;
}

Based on the info here, I'm going to look at this as a potential hardware issue (maybe a conflicting i2c address? I have no clue) for now. I'll be curious to know if the i2c-address fix for the DT binding fixes @AarC10's issue. I'll chime back in if I discover anything interesting.

djfurie commented 4 months ago

As another data point, I see some of the expected results when I completely power cycle my board. If I simply try to reset the module without the power cycle it's failing as above...

Output after a cold reset:

U_GNSS: sent command b5 62 0a 06 00 00 10 3a.
U_GNSS: decoded UBX response 0x0a 0x06: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [body 120 byte(s)].
U_GNSS: sent command b5 62 06 04 04 00 00 00 09 00 17 76.
U_GNSS: sent command b5 62 0a 06 00 00 10 3a.

Note that I'm not seeing a response to the second MSGPP command.

Output after a warm-reset:

U_GNSS: sent command b5 62 0a 06 00 00 10 3a.

No response even to the first command.

RobMeades commented 4 months ago

Very strange. FYI, in our ubxlib regression test environment, for consistency of testing, all GNSS devices have their RESET_N line connected to the MCU board that they are working with and that line will be pulled low for 500 ms then released and we wait for at least 2 seconds (probably a lot longer than that as we run the port tests first) before we move on with testing.

It would be interesting to know if, in the fail cases, you are getting that 0xFF 0xFF response to the read from register 0xFD and, when that happens, what does the MAX-M10S actually send when the ubxlib code tries to read the stuff that it thinks is queued. Of course, we don't monitor the I2C lines directly during a ubxlib regression test but there is also no retrying going on anywhere so if it failed we would see it, on the other hand if it said 0xFF 0xFF for a little while then switched to something sensible within the message response timeout we wouldn't notice either.

ValterMinute commented 4 months ago

I have an issue with a similar configuration (zephyr, MIA-M10Q and NRF52832 connected via I2C): https://github.com/u-blox/ubxlib/issues/199 Those may be completely different problems, but I am linking it here, in case a similar setup may help understanding the issue.

ValterMinute commented 4 months ago

@djfurie sorry, but I checked this issue before posting mine and in the meantime you seem to report exactly what I see in my own setup...

Naquino14 commented 4 months ago

Looking at your device tree stuff, I notice you have:

   max0: maxm10s@42 {
        status = "okay";
        compatible = "u-blox,maxm10s";
        reg = <0x42>;
    };

...in the I2C entry, which I guess is for another driver inside Zephyr. Is it possible that driver is somehow still trying to do something, might somehow be active? Not that it should cause i2c_transfer() to return an error, of course, just thought I'd point it out.

Aaron and I were working on this together. When I removed it (at the time I was also only calling uDeviceOpen in init_maxm10s per the requirements of /port/platform/zephyr/Readme.md) uDeviceOpen returned -2. It would be helpful to test again though, so I will pull Aaron's changes and delete that node in the tree and let you know what I find.

Hi just following up. Sorry for the radio silence, Aaron has been busy with work and I have been busy with classes. I have updated my branch in our flight software repo to start testing y'alls suggestions.

@RobMeades Deleting the max0 node in the devicetree has no effect, and uDeviceOpen still returns -8.


@djfurie I have tried your suggestion here so my max0 node now looks like:

    max0: maxm10s@42 {
        status = "okay";
        compatible = "u-blox,maxm10s";
        reg = <0x42>;
        i2c-address = <0x42>;
        transport-type = "i2c1";
        module-type = "U_GNSS_MODULE_TYPE_M10";
        i2c-already-open;
    };

Unfortunately, uDeviceOpen still returns -8. If you suspect it's a hardware issue, I can ask my team whether I can share the schematics of our custom board, and link our hardware lead to this issue.

Naquino14 commented 4 months ago

Very strange. FYI, in our ubxlib regression test environment, for consistency of testing, all GNSS devices have their RESET_N line connected to the MCU board that they are working with and that line will be pulled low for 500 ms then released and we wait for at least 2 seconds (probably a lot longer than that as we run the port tests first) before we move on with testing.

It would be interesting to know if, in the fail cases, you are getting that 0xFF 0xFF response to the read from register 0xFD and, when that happens, what does the MAX-M10S actually send when the ubxlib code tries to read the stuff that it thinks is queued. Of course, we don't monitor the I2C lines directly during a ubxlib regression test but there is also no retrying going on anywhere so if it failed we would see it, on the other hand if it said 0xFF 0xFF for a little while then switched to something sensible within the message response timeout we wouldn't notice either.

Out of curiosity, I checked in our schematic if we had this pin connected, and we do. I will attempt to make a node for this pin and reset the chip before calling uDeviceOpen.

RobMeades commented 4 months ago

Hi again: can you post the debug output printed by ubxlib with i2c-address = <0x42> set?

EDIT: assuming it is unchanged from what you originally posted above, the only place in the ubxlib code where I think U_ERROR_COMMON_PLATFORM (-8) would be returned is here:

https://github.com/u-blox/ubxlib/blob/fdc78b8d971dac3dc3304ef9b05039e809ce7371/port/platform/zephyr/src/u_port_i2c.c#L149

...if it can't find your i2c1 binding, and the only reason I can see that it would do that is if CONFIG_I2C is not defined for the build (which is what brings in the Zephyr I2C code) and, indeed, looking at what is posted originally above, it doesn't seem to be...?

Naquino14 commented 4 months ago

Hey! I haven't added the resetting code yet, and instead commented out the max0 node and edited the cfg-device-gnss node to look like this:


    cfg-device-gnss {
        compatible = "u-blox,ubxlib-device-gnss";
        status = "okay";
        transport-type = "i2c1";
        module-type = "U_GNSS_MODULE_TYPE_M10";
        i2c-already-open;         // added this
        i2c-address = <0x42>; // and this
    };

I'm currently stepping through with gdb to see what goes wrong, as it's either hanging or Zephyr is exiting the init thread.

...if it can't find your i2c1 binding, and the only reason I can see that it would do that is if CONFIG_I2C is not defined for the build

CONFIG_I2C is set to y here.

Naquino14 commented 4 months ago

@RobMeades it sends b5 62 0a 06 00 00 10 3a and then says uDeviceOpen returned -8.

AarC10 commented 4 months ago

Screenshot_20240210_121050

RobMeades commented 4 months ago

Thanks: so it seems likely we are in uGnssPwrOn() and the error code we are seeing has been set just here:

https://github.com/u-blox/ubxlib/blob/fdc78b8d971dac3dc3304ef9b05039e809ce7371/gnss/src/u_gnss_pwr.c#L569

uGnssPrivateSendOnlyCheckStreamUbxMessage() has called uGnssPrivateSendReceiveUbxMessage() to send UBX-MON-MSGPP. The send part of that has worked, or we wouldn't see the U_GNSS: sent command... debug print, so the I2C address is now correct, but it looks like the receiveUbxMessageStream() call a few lines lower down has found no response; from the timestamps we seem to have hit the 10ish second timeout.

This makes the situation look somewhat similar to @djfurie, so it would be interesting to see if resetting the GNSS device a few seconds before you begin would put it into a more receptive (transmittive, I suppose :-)) state. The odd thing is that the GNSS device is responsive, because it has I2C-acked the poll for UBX-MON-MSGPP, it just appears not to be sending back the UBX-MON-MSGPP response for some reason.

At this point a trace of the I2C lines with something like a Saleae probe or equivalent, if you can do so, probably would be the right thing to obtain. That, or you could put in a temporary hack to print out the value of errorCodeOrReceiveSize just here:

https://github.com/u-blox/ubxlib/blob/fdc78b8d971dac3dc3304ef9b05039e809ce7371/gnss/src/u_gnss_private.c#L2007

RobMeades commented 4 months ago

EDIT: sorry, not there, one line lower down, after it has been populated with the contents of buffer[]:

https://github.com/u-blox/ubxlib/blob/fdc78b8d971dac3dc3304ef9b05039e809ce7371/gnss/src/u_gnss_private.c#L2008

Naquino14 commented 4 months ago

Is it possible that the GNSS device is faulty?

RobMeades commented 4 months ago

Anything is possible, just not likely since, from the debug prints, the GNSS device's I2C interface is working (i.e. it has acknowledged b5 62 0a 06 00 00 10 3a).

Naquino14 commented 4 months ago

I wish it was possible at this time to probe the I2C lines... Our hardware team didn't include test points for it. I will carve out some time today and/or Tuesday to debug and find the contents of that buffer on line 2008. I also still haven't tried to reset the chip beforehand, so I will try my best to do that as well. By the way, thank you very much for your patience!

RobMeades commented 4 months ago

will carve out some time Today and or Tuesday to debug and find the contents of that buffer on https://github.com/u-blox/ubxlib/issues/196#issuecomment-1937116977

If, when you do that, it turns out to contain either 0xFFFF or 0, you could try leaving the GNSS device for, say 10 seconds, after you have reset it, to see if there is some kind of start-up latency that needs to be accounted for.

cturvey commented 4 months ago

It is possible to remap the I2C SCL/SDA to the UART1 RX/TX, either temporarily, or permanently. This can be helpful with some of the off-the-shelf UAV modules which only export the UART1, and a lot of systems want to use I2C due to lack of UARTs on their MCU, or simply preferring the multi-drop, simple cabling of I2C

I2C on UART pins PIO (TX=SCL, RX=SDA)
 0x10510003 CFG-I2C-ENABLED = 1
 0x10510004 CFG-I2C-REMAP = 1
 0x10520005 CFG-UART1-ENABLED = 0

Or with UART Enabled but UART/I2C pins swapped (SWAP TX/SCL RX/SDA)
 0x10520005 CFG-UART1-ENABLED = 1  
 0x10520004 CFG-UART1-REMAP = 1
 0x10510003 CFG-I2C-ENABLED = 1
 0x10510004 CFG-I2C-REMAP = 1

https://portal.u-blox.com/s/question/0D52p0000E8WgehCQC/how-to-talk-to-maxm10s-over-i2c https://portal.u-blox.com/s/question/0D52p0000DcUjwwCQC/how-to-check-if-ubxm10050kb-device-contains-flash

RobMeades commented 4 months ago

Thanks @cturvey: that's quite a cool feature; whether we could make use of it in this case would depend upon how the HW happens to look in respect of access to the UART HW.

cturvey commented 4 months ago

Yes was more of a response to "not wired to test-points" situations. But yes would require UART1 access to effect.

Naquino14 commented 4 months ago

It is possible to remap the I2C SCL/SDA to the UART1 RX/TX, either temporarily, or permanently. This can be helpful with some of the off-the-shelf UAV modules which only export the UART1, and a lot of systems want to use I2C due to lack of UARTs on their MCU, or simply preferring the multi-drop, simple cabling of I2C

I2C on UART pins PIO (TX=SCL, RX=SDA)
 0x10510003 CFG-I2C-ENABLED = 1
 0x10510004 CFG-I2C-REMAP = 1
 0x10520005 CFG-UART1-ENABLED = 0

Or with UART Enabled but UART/I2C pins swapped (SWAP TX/SCL RX/SDA)
 0x10520005 CFG-UART1-ENABLED = 1  
 0x10520004 CFG-UART1-REMAP = 1
 0x10510003 CFG-I2C-ENABLED = 1
 0x10510004 CFG-I2C-REMAP = 1

https://portal.u-blox.com/s/question/0D52p0000E8WgehCQC/how-to-talk-to-maxm10s-over-i2c https://portal.u-blox.com/s/question/0D52p0000DcUjwwCQC/how-to-check-if-ubxm10050kb-device-contains-flash

@AarC10

AarC10 commented 4 months ago

If, when you do that, it turns out to contain either 0xFFFF or 0, you could try leaving the GNSS device for, say 10 seconds, after you have reset it, to see if there is some kind of start-up latency that needs to be accounted for.

Thank you for the suggestions. I implemented code for resetting the chip. It managed to initialize once, but every other time I reset the entire board, it fails. I've been playing with different sleep values. Currently, we hold the reset pin low for 100 ms, with around a 10 ms delay before calling our inits. In the screenshot below, you can see it work on the first try, but when I reset the board it goes into a loop trying to reset the GPS chip and re-run initialization (which eventually manages to initialize). However, looping until it works is probably not ideal since it's non-deterministic. Any other suggestions?

Screenshot_20240217_094633

RobMeades commented 4 months ago

Thanks: I think maybe we're focusing on the resetting too much. I only suggested it while we are trying to debug what appears to be an I2C issue without having access to the I2C lines themselves, just to eliminate any strangeness related to there being a load of data inside the MAX-M10S to be got out of the way before the wanted data turns up.

For the purposes of our experiment, I would recommend:

(a) resetting the GNSS device, as you have done, waiting a couple of seconds to be quite sure the GNSS device is booted etc., then see whether you get responses to the I2C commands that ubxlib sends at device open.

(b) printing out the value of errorCodeOrReceiveSize just after this line:

https://github.com/u-blox/ubxlib/blob/fdc78b8d971dac3dc3304ef9b05039e809ce7371/gnss/src/u_gnss_private.c#L2008

(c) printing out the value returned by i2c_transfer() just here:

https://github.com/u-blox/ubxlib/blob/9ce2d92bb594f7d4010964a57ca2a28d9cf902f9/port/platform/zephyr/src/u_port_i2c.c#L453

It might be that the return values will tell us something.

If not, then I can only conclude that there is something going wrong on the I2C HW lines - adequate pull ups, grounding, line length, that kind of thing. We happily use GNSS devices with 10 cm of flying lead to an MCU with (I think) 1k pull-up resistors but we do know that some of the STM32 MCUs are less reliable than others: in particular, under some circumstances some STM32 MCUs don't work well with I2C at 100 kHz clock:

https://www.st.com/resource/en/errata_sheet/es0206-stm32f427437-and-stm32f429439-line-limitations-stmicroelectronics.pdf

...so we usually run at 400 kHz, just in case (of course I can't say that this is in any way related to your issue).

Naquino14 commented 4 months ago

I can confirm we are running I2C at 400 kHz, and have 10k Ω pullups on our lines. cc @bch2857

RobMeades commented 4 months ago

10 k might not be enough: I'm pretty sure we use 1.2k.
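To put rough numbers on the pull-up question, here is a minimal sketch in plain C using the usual RC estimate for the 30%-70% rise time of an open-drain line, t_r = 0.8473 x R x C. The function names are made up for illustration, and the 100 pF bus capacitance used in the note after the code is an assumption, not a measurement from either board:

```c
#include <assert.h>
#include <stdbool.h>

/* Approximate 30%-70% rise time, in nanoseconds, of an open-drain line
 * pulled up through pullup_ohms with bus_pf picofarads of capacitance.
 * 0.8473 is ln(0.7 / 0.3); ohms * picofarads = 1e-12 s = 1e-3 ns. */
static double i2c_rise_time_ns(double pullup_ohms, double bus_pf)
{
    assert(pullup_ohms > 0.0 && bus_pf >= 0.0);
    return 0.8473 * pullup_ohms * bus_pf * 1e-3;
}

/* True if the estimated rise time is within the given limit. */
static bool i2c_rise_time_ok(double pullup_ohms, double bus_pf,
                             double max_rise_ns)
{
    return i2c_rise_time_ns(pullup_ohms, bus_pf) <= max_rise_ns;
}
```

With an assumed 100 pF of bus capacitance, 10k gives roughly 847 ns, which is inside the 1000 ns standard-mode (100 kHz) limit of the I2C specification but well outside the 300 ns fast-mode (400 kHz) limit, while 1.2k gives roughly 102 ns. The real figure depends on the actual capacitance of the board and wiring.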

RobMeades commented 4 months ago

...it says that the I2C clock is 100 kHz in the debug print out above. Remember that you get the default values applied by ubxlib, not the values from the &i2cx entry, so to get 400 kHz you would need to have an i2c-clock-hertz entry with value 400000.

Naquino14 commented 4 months ago

Yikes, let me change it really quick. If that doesn't work I'll attempt this experiment.

Naquino14 commented 4 months ago

I modified the i2c-clock-hertz entry to 400 kHz and looks like that might have done the trick! image

RobMeades commented 4 months ago

Interesting! Don't want to hex it but how many times have you tried?

Naquino14 commented 4 months ago

I'm testing it more right now. Please hold...

Naquino14 commented 4 months ago

I guess I got lucky that time... Getting error -8 from uDeviceOpen() again

RobMeades commented 4 months ago

Is it possible to swap the 10k resistors for something smaller?

RobMeades commented 4 months ago

Actually, if you could do that, you could probably solder some test leads to the non-ground end...?

Naquino14 commented 4 months ago

Unfortunately everything is SMD and I'm not familiar with the board layout. Our hardware team would have to make that modification for us.