AVSystem / Anjay

C implementation of the client-side OMA LwM2M protocol
Other
189 stars 70 forks source link

Demo client not working: getaddrinfo() error: Name or service not known #66

Closed eabase closed 1 year ago

eabase commented 1 year ago

I'm trying to connect the demo client using:

./demo.exe --endpoint-name "blahblah" --server-uri coaps://lwm2m.blahblah.com:5684 --security-mode psk --identity-as-string "blahblah" --key-as-string "keykeykey" --lifetime 60

But nothing happens, I only get some cryptic errors in the log (just showing mos relevant):

...
2023-08-03 15:59:10.436657 WARNING [demo] [/cygdrive/c/mydev/pr_anjay/Anjay/demo/objects/device.c:435]: empty manufacturer part of endpoint name
2023-08-03 15:59:10.436686 DEBUG [demo] [/cygdrive/c/mydev/pr_anjay/Anjay/demo/objects/device.c:459]: manufacturer: Anjay; serial number: xxxxxxx
...
2023-08-03 15:59:10.437816 WARNING [demo] [/cygdrive/c/mydev/pr_anjay/Anjay/demo/firmware_update.c:519]: Invalid data in the firmware state persistence file
...
2023-08-03 15:59:10.440549 TRACE [anjay] [/cygdrive/c/mydev/pr_anjay/Anjay/src/core/servers/anjay_server_connections.c:244]: no SNI for /0/1, using defaults
2023-08-03 15:59:10.440587 INFO [anjay] [/cygdrive/c/mydev/pr_anjay/Anjay/src/core/servers/anjay_connections.c:647]: server /0/1: transport change: (none) -> U (uri: coaps://lwm2m.blahblah.com:5684)
2023-08-03 15:59:10.440676 DEBUG [anjay] [/cygdrive/c/mydev/pr_anjay/Anjay/src/core/servers/anjay_security_generic.c:437]: no ciphers configured for security IID 1, using TLS backend defaults
2023-08-03 15:59:10.440713 DEBUG [anjay] [/cygdrive/c/mydev/pr_anjay/Anjay/src/core/servers/anjay_security_generic.c:456]: server /0/1: security mode = 0
2023-08-03 15:59:10.499473 DEBUG [avs_net] [/cygdrive/c/mydev/pr_anjay/Anjay/deps/avs_commons/src/net/compat/posix/avs_compat_addrinfo.c:242]: getaddrinfo() error: Name or service not known; family == (avs_net_af_t) 2
2023-08-03 15:59:10.502845 WARNING [avs_net] [/cygdrive/c/mydev/pr_anjay/Anjay/deps/avs_commons/src/net/mbedtls/avs_mbedtls_socket.c:1187]: Could not restore session; performing full handshake
2023-08-03 16:01:13.543408 ERROR [avs_net] [/cygdrive/c/mydev/pr_anjay/Anjay/deps/avs_commons/src/net/mbedtls/avs_mbedtls_socket.c:1311]: handshake failed: -26624
2023-08-03 16:01:13.543574 ERROR [anjay] [/cygdrive/c/mydev/pr_anjay/Anjay/src/core/servers/anjay_connection_ip.c:126]: could not connect to lwm2m.blahblah.com:5684
2023-08-03 16:01:13.543654 TRACE [anjay] [/cygdrive/c/mydev/pr_anjay/Anjay/src/core/servers/anjay_activate.c:279]: could not initialize sockets for SSID 1
2023-08-03 16:01:13.543879 DEBUG [anjay] [/cygdrive/c/mydev/pr_anjay/Anjay/src/core/servers/anjay_activate.c:197]: Non-Bootstrap Server 1: not reachable.

The most serious error seem to be this one:

/Anjay/deps/avs_commons/src/net/compat/posix/avs_compat_addrinfo.c:242]: 
getaddrinfo() error: Name or service not known; family == (avs_net_af_t) 2

Also not sure why it is logged as DEBUG when it's clearly an ERROR.

Related dependency code here:

https://github.com/AVSystem/avs_commons/blob/a227958d88cd9fc689a90d2c336ee0eaa1daf17f/src/net/compat/posix/avs_compat_addrinfo.c#L239-L270

Click to see code snippet From: `avs_compat_addrinfo.c` ```c avs_net_addrinfo_t *avs_net_addrinfo_resolve_ex( avs_net_socket_type_t socket_type, avs_net_af_t family, const char *host, const char *port_str, int flags, const avs_net_resolved_endpoint_t *preferred_endpoint) { if (avs_is_err(_avs_net_ensure_global_state())) { LOG(ERROR, _("avs_net global state initialization error")); return NULL; } struct addrinfo hint; memset((void *) &hint, 0, sizeof(hint)); hint.ai_family = get_native_af(family); if (family != AVS_NET_AF_UNSPEC && hint.ai_family == AF_UNSPEC) { LOG(DEBUG, _("Unsupported avs_net_af_t: ") "%d", (int) family); return NULL; } if (!(flags & AVS_NET_ADDRINFO_RESOLVE_F_NOADDRCONFIG)) { hint.ai_flags |= AI_ADDRCONFIG; } if (flags & AVS_NET_ADDRINFO_RESOLVE_F_PASSIVE) { hint.ai_flags |= AI_PASSIVE; } // some getaddrinfo() implementations interpret port 0 as invalid, // so we use our own port parsing uint16_t port; if (port_from_string(&port, port_str)) { LOG(ERROR, _("Invalid port: ") "%s", port_str); return NULL; } avs_net_addrinfo_t *ctx = (avs_net_addrinfo_t *) avs_calloc(1, sizeof(avs_net_addrinfo_t)); if (!ctx) { LOG(ERROR, _("Out of memory")); return NULL; } # ifdef WITH_AVS_V4MAPPED if (flags & AVS_NET_ADDRINFO_RESOLVE_F_V4MAPPED) { ctx->v4mapped = true; if (family == AVS_NET_AF_INET6) { hint.ai_family = AF_UNSPEC; } } # endif hint.ai_socktype = _avs_net_get_socket_type(socket_type); if (!host || !*host) { switch (family) { case AVS_NET_AF_INET4: host = "0.0.0.0"; break; case AVS_NET_AF_INET6: host = "::"; break; default: host = ""; } } int error = getaddrinfo(host, NULL, &hint, &ctx->results); if (error) { # ifdef AVS_COMMONS_NET_POSIX_AVS_SOCKET_HAVE_GAI_STRERROR LOG(DEBUG, _("getaddrinfo() error: ") "%s" _( "; family == (avs_net_af_t) ") "%d", gai_strerror(error), (int) family); # else LOG(DEBUG, _("getaddrinfo() error: ") "%d" _( "; family == (avs_net_af_t) ") "%d", error, (int) family); # endif avs_net_addrinfo_delete(&ctx); return NULL; } else { update_ports(ctx->results, port); unsigned seed = (unsigned) avs_time_real_now().since_real_epoch.seconds; struct addrinfo *preferred = NULL; if (preferred_endpoint) { preferred = detach_preferred(&ctx->results, preferred_endpoint->data.buf, preferred_endpoint->size); } randomize_addrinfo_list(&ctx->results, &seed); if (preferred) { preferred->ai_next = ctx->results; ctx->results = preferred; } ctx->to_send = ctx->results; return ctx; } } ```

Any idea what's going on here?

kFYatek commented 1 year ago

@eabase

The most serious error seem to be this one:

/Anjay/deps/avs_commons/src/net/compat/posix/avs_compat_addrinfo.c:242]: 
getaddrinfo() error: Name or service not known; family == (avs_net_af_t) 2

Also not sure why it is logged as DEBUG when it's clearly an ERROR.

It's an error only on a very low-level layer, but not really an error in the big picture perspective. Anjay is attempting to resolve the hostname as an IPv6 address (that's what (avs_net_af_t) 2 is referring to), and this fails. It then proceeds to resolve the same hostname as IPv4, and that apparently succeeds. That's also why it's logged on the DEBUG level.

The most relevant log message seems to be this one:

2023-08-03 16:01:13.543408 ERROR [avs_net] [/cygdrive/c/mydev/pr_anjay/Anjay/deps/avs_commons/src/net/mbedtls/avs_mbedtls_socket.c:1311]: handshake failed: -26624

I agree that those handshake errors can be cryptic, but the -26624 code refers to a timeout. You can find that by grepping Mbed TLS sources:

kfyatek@kfyatek-pc:/usr/include/mbedtls$ printf '0x%x\n' 26624
0x6800
kfyatek@kfyatek-pc:/usr/include/mbedtls$ grep -r 0x6800
ssl.h:#define MBEDTLS_ERR_SSL_TIMEOUT                           -0x6800

Handshake timeout may mean, well, literally anything ;) but in my experience, the most probable cause is that the DTLS credentials are not recognized by the server. Please double-check that the identity and key are matching on both sides. Please also check that they are in the same format on both sides - since you're using --key-as-string, please make sure that on the server side, the key is also treated as a literal string, as opposed to e.g. encoded as hex or base64.

What kind of server are you trying to connect to? Is it an installation of Coiote DM, or maybe Leshan? We may be able to provide hints on configuring the server in that case.

If you're sure that everything matches and you're still getting this error, then please provide a packet capture (PCAP). It might be helpful in further debugging the issue.

eabase commented 1 year ago

Hi @kFYatek

Thank you for that explanation. 💯 Indeed the client was also compiled with IPv6.

From that it seem that some corporate firewall is blocking connections to the coap/s ports, as we're using a CoioteDM provided server.