u-blox / ubxlib

Portable C libraries which provide APIs to build applications with u-blox products and services. Delivered as add-on to existing microcontroller and RTOS SDKs.
Apache License 2.0
287 stars 82 forks source link

Error settings socket options for ICMP during Ping #189

Closed elektron314 closed 4 months ago

elektron314 commented 5 months ago

Hello, The question is very simple, how can I get the ping error number 106, Error settings socket options for ICMP doing Ping if the Ping itself doesn't need any socket? Ping command, which uses the Internet Control Message Protocol (ICMP), does not directly involve the creation of a socket in the same way that other network communication methods might. ICMP operates at a lower level than TCP or UDP and is used for network diagnostics rather than establishing a connection. But it's not so simple. I'm receiving one positive answer: +UUPING: 1,32,"www.google.com","172.217.20.196",109,160 and then right after that +UUPINGER: 106

RobMeades commented 5 months ago

Hi there: for our information, can you indicate which WiFi module this is with?

elektron314 commented 5 months ago

Oh, sorry :) It's not Wi-Fi. I'm using SARA-R510M8S-01B with nRF+Zephyr. Ping through LTE-M1 network.

RobMeades commented 5 months ago

Oh! Not so much a ubxlib question then (since ubxlib does not send this), more a module behaviour question.

The AT command manual uses phrases like "If the first +UUPING URC reports...", so I'd guess that you may receive multiple ping responses and the second one is indicating that the PDP context has been gone (which appears to be what 10 indicates)...?

elektron314 commented 5 months ago

omg, I'm so sorry, it's very late night already and I needed to go to sleep instead of posting. There's a typo, not 10 but 106 Updated that post.

RobMeades commented 5 months ago

Ah, now I see, 106 == Error creating socket for ICMP. ICMP is a protocol and so would likely be opened as a socket type? Not really sure I'm afraid. I will ask internally what it should be interpreted as.

RobMeades commented 5 months ago

It seems that AT+UPING, by default, performs four pings, so likely the second one has failed because of a connectivity issue.

The full command: AT+UPING=<remote_host>[,<retry_num>,<p_size>,<timeout>,<ttl>] allows you to set <retry_num> to some other value (1 to 64); it also allows you to set the ping size (<p_size>, range 4 to 1460, default 32), the response timeout in milliseconds (<timeout>, range 10 to 60,000, default 5000) and the time to live (<ttl>, range 1 to 255, default 32).

elektron314 commented 5 months ago

You are right, yes, exactly four.

I use default settings for the ping command but it doesn't matter, the result is the same.

If it is a connectivity issue why is it always happening the same? only 1 result.

The main question is why is it even possible to get errors about the sockets if the ICMP doesn't use them?

RobMeades commented 5 months ago

The main question is why is it even possible to get errors about the sockets if the ICMP doesn't use them?

Well, error code 106 is, as written in the AT manual, Error creating socket for ICMP, so either (a) the wording is somehow incorrect or (b) a socket is being used for ICMP. A search on the internet shows examples of (b) (e.g. this one), it would seem to be an implementation matter, down to how the code inside the module that handles ICMP happens to be implemented in relation to the IP stack within the module.

EDIT: having just looked at the code, (b) is what the module is doing, for instance:

socket(bind_if_addr->ss_family, SOCK_RAW, IPPROTO_ICMP);

If it is a connectivity issue why is it always happening the same? only 1 result.

I suspect the only way we will find that out is to set you up for logging the internal behaviour of the module but that will be quite involved, so before we do that, some questions:

elektron314 commented 5 months ago

Let's do that

elektron314 commented 5 months ago

I can't find this socket(bind_if_addr->ss_family, SOCK_RAW, IPPROTO_ICMP); Where is that and how can I use this knowledge in resolving my problem? :)

RobMeades commented 5 months ago

I can't find this socket(bind_if_addr->ss_family, SOCK_RAW, IPPROTO_ICMP); Where is that and how can I use this knowledge in resolving my problem? :)

No, sorry, that is the module code, I was simply posting it to confirm that the module is opening a raw ICMP socket to perform the ping; a socket is being used.

RobMeades commented 5 months ago

Please make contact with me at rob.meades@u-blox.com and we can go through the instructions for setting up logging that way rather than trying to do it here.

elektron314 commented 5 months ago

I will do my test with the HW tomorrow as I need to go to the office and do the physical reconnection as you described in your email.

But in the meantime, by the way, I have to mention that if I turn off uSockGetHostByName just right before my ping command, I don't even get any positive response from the ping, only +UUPINGER: 106. It seems like uSockGetHostByName opens somehow a required socket that ping uses and after it closes it is not immediately so that 1 ping could be served before the required socket closes.

RobMeades commented 5 months ago

Understood: obviously none of this makes any sense, we need to see what the module is seeing.

RobMeades commented 5 months ago

Assuming you're using the ubxlib uAtCient to implement your AT+UPING function, could you post the code for that also, in case there's something up with the sequencing?

elektron314 commented 5 months ago

Sure, after the connection I execute ping function, here it is

static void UUPING_urc(uAtClientHandle_t atHandle, void* pParameter) {
    char ipStr[64];
    uPingContext_t* pPingCtx = (uPingContext_t*)pParameter;

    // default to error
    pPingCtx->status = U_PING_STATUS_ERROR;
    // retry_num
    uAtClientReadInt(atHandle);
    // p_size
    uAtClientReadInt(atHandle);
    // remote_hostname
    uAtClientReadString(atHandle, NULL, 256, false);
    // remote_ip
    if (uAtClientReadString(atHandle, ipStr, sizeof(ipStr), false) > 0) {
        int32_t tmp;
        // Use a temporary output variable to avoid "Attempt to cast away " lint issue
        uSockAddress_t tmpAddr;
        tmp = uSockStringToAddress(ipStr, &tmpAddr);
        pPingCtx->resultSockAddress = tmpAddr;
        if (tmp == (int32_t)U_ERROR_COMMON_SUCCESS) {
            pPingCtx->status = U_PING_STATUS_IP_RECEIVED;
        }
    }
    // ttl
    uAtClientReadInt(atHandle);
    // rtt
    uAtClientReadInt(atHandle);
}

static void UUPINGER_urc(uAtClientHandle_t atHandle, void* pParameter) {
    (void)atHandle;
    uPingContext_t* pPingCtx = (uPingContext_t*)pParameter;

    // we received an error
    pPingCtx->status = U_PING_STATUS_ERROR;
}

void UbloxManager::ping(uDeviceHandle_t devHandle, char const* pHostName) {
    int32_t tmp;
    uAtClientHandle_t atHandle;
    uCellPrivateInstance_t* pInstance;
    static uPingContext_t gPingContext;

    // Check parameters
    if (pHostName != NULL) {
        pInstance = pUCellPrivateGetInstance(devHandle);
        atHandle = pInstance->atHandle;

        // Register the ping URCs
        uAtClientSetUrcHandler(pInstance->atHandle, "+UUPING:", UUPING_urc, (void*)&gPingContext);
        uAtClientSetUrcHandler(pInstance->atHandle, "+UUPINGER:", UUPINGER_urc, (void*)&gPingContext);

        // Send UPING AT command
        gPingContext.status = U_PING_STATUS_WAITING;

        uAtClientLock(atHandle);

        // AT command starts with
        uAtClientCommandStart(atHandle, "AT+UPING=");

        // remote_host
        uAtClientWriteString(atHandle, pHostName, true);

        // retry_num
        uAtClientWriteInt(atHandle, 10);

        // p_size
        uAtClientWriteInt(atHandle, 64);

        // timeout
        uAtClientWriteInt(atHandle, 10);

        // ttl
        uAtClientWriteInt(atHandle, 64);

        uAtClientCommandStopReadResponse(atHandle);
        uAtClientUnlock(atHandle);

        // Regardless if there is an error or not we unregister the URC
        if (pInstance) {
            uAtClientRemoveUrcHandler(pInstance->atHandle, "+UUPING:");
            uAtClientRemoveUrcHandler(pInstance->atHandle, "+UUPINGER:");
        }
    }
}
RobMeades commented 5 months ago

Thanks: the only thing I'd say is that you are removing the URC handler here:

        // Regardless if there is an error or not we unregister the URC
        if (pInstance) {
            uAtClientRemoveUrcHandler(pInstance->atHandle, "+UUPING:");
            uAtClientRemoveUrcHandler(pInstance->atHandle, "+UUPINGER:");
        }

...pretty much immediately after getting the OK back from the AT+UPING command; you probably only want to do that at the end of your application, or at least after a timeout when you are sure you have received all of the expected responses.

elektron314 commented 5 months ago

Attachment available until 29. Feb 2024 No, but now I put a delay of 30 seconds before those handlers' removers and the result is better! But anyway, ping gives strange results like this:

+UUPING: 1,64,"www.google.com","172.217.20.196",105,124

+UUPING: 2,64,"www.google.com","172.217.20.196",105,-1

+UUPING: 3,64,"www.google.com","172.217.20.196",0,-1

+UUPING: 4,64,"www.google.com","172.217.20.196",0,-1

+UUPING: 5,64,"www.google.com","172.217.20.196",0,-1

+UUPING: 6,64,"www.google.com","172.217.20.196",0,-1

+UUPING: 7,64,"www.google.com","172.217.20.196",0,-1

+UUPING: 8,64,"www.google.com","172.217.20.196",0,-1

+UUPING: 9,64,"www.google.com","172.217.20.196",0,-1

+UUPING: 10,64,"www.google.com","172.217.20.196",0,-1

TTL and RTT are 0 and -1

and I'm sending you this trace as well.

RobMeades commented 5 months ago

Thanks: that looks much better, I have downloaded those traces and deleted the links from your post.

Could the UUPINGER: be occurring because your application has moved on and closed down the link before there has been a chance for the responses to arrive?

elektron314 commented 5 months ago

I set a 30s delay before the deletion of handlers (uAtClientRemoveUrcHandler) and now there is no more UUPINGER #106, so yes, the "socket" problem was because of the immediate deletion of those handlers.

But why do I see these results of 10 pings, I don't understand.

RobMeades commented 5 months ago

Understood.

But why do I see these results of 10 pings, I don't understand.

We will need to wait for the relevant expert to look at the log, I will update this issue when I know more.

elektron314 commented 4 months ago

The problem is solved by adding more time to wait for ping results

RobMeades commented 4 months ago

Understood: I've not heard anything back from the person who was looking into this internally; I will prod them and update this issue when they get back to me.