emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler

getaddrinfo behaves differently with loopback address #22633

Open 5d-jh opened 1 day ago

5d-jh commented 1 day ago

While building librdkafka, I encountered a "Name does not resolve" error. After some research, I found that getaddrinfo behaves differently from the native build.

When a loopback address is given, I expect the loopback address as-is instead of a local IP address.

Version of emscripten/emsdk:

emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.67 (2c4461731b8c0bffbdecebc0030e0489712f9886)
clang version 20.0.0git (https:/github.com/llvm/llvm-project b9198a17315757dc0c2e831c9df0498dcab55285)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /Users/jeonghyun.lee/emsdk/upstream/bin

Failing command line in full:

emcc addrinfo.c -o addrinfo.js -sEXIT_RUNTIME=1 -v && node ./addrinfo.js

outputs

IPv4 address: 172.29.1.0 ((null))

gcc addrinfo.c -o addrinfo && ./addrinfo

outputs

IPv6 address: ::1 ((null))
IPv4 address: 127.0.0.1 ((null))

Full link command and output with -v appended:

 /Users/jeonghyun.lee/emsdk/upstream/bin/clang -target wasm32-unknown-emscripten -fignore-exceptions -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --sysroot=/Users/jeonghyun.lee/emsdk/upstream/emscripten/cache/sysroot -DEMSCRIPTEN -Werror=implicit-function-declaration -Xclang -iwithsysroot/include/fakesdl -Xclang -iwithsysroot/include/compat -v addrinfo.c -c -o /var/folders/9l/btg_dvc90w76gpcfdrvdyj8w0000gp/T/emscripten_temp_abek1lax/addrinfo_0.o
clang version 20.0.0git (https:/github.com/llvm/llvm-project b9198a17315757dc0c2e831c9df0498dcab55285)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /Users/jeonghyun.lee/emsdk/upstream/bin
 (in-process)
 "/Users/jeonghyun.lee/emsdk/upstream/bin/clang-20" -cc1 -triple wasm32-unknown-emscripten -emit-obj -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name addrinfo.c -mrelocation-model static -mframe-pointer=none -ffp-contract=on -fno-rounding-math -mconstructor-aliases -target-cpu generic -fvisibility=hidden -debugger-tuning=gdb -fdebug-compilation-dir=/Users/jeonghyun.lee/librdkafka -v -fcoverage-compilation-dir=/Users/jeonghyun.lee/librdkafka -resource-dir /Users/jeonghyun.lee/emsdk/upstream/lib/clang/20 -D EMSCRIPTEN -isysroot /Users/jeonghyun.lee/emsdk/upstream/emscripten/cache/sysroot -I/opt/homebrew/include -cxx-isystem /opt/homebrew/Cellar/librdkafka/2.3.0/include/librdkafka -internal-isystem /Users/jeonghyun.lee/emsdk/upstream/lib/clang/20/include -internal-isystem /Users/jeonghyun.lee/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten -internal-isystem /Users/jeonghyun.lee/emsdk/upstream/emscripten/cache/sysroot/include -Werror=implicit-function-declaration -ferror-limit 19 -fgnuc-version=4.2.1 -fskip-odr-check-in-gmf -fignore-exceptions -iwithsysroot/include/fakesdl -iwithsysroot/include/compat -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr -o /var/folders/9l/btg_dvc90w76gpcfdrvdyj8w0000gp/T/emscripten_temp_abek1lax/addrinfo_0.o -x c addrinfo.c
clang -cc1 version 20.0.0git based upon LLVM 20.0.0git default target x86_64-apple-darwin23.6.0
ignoring nonexistent directory "/Users/jeonghyun.lee/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten"
#include "..." search starts here:
#include <...> search starts here:
 /opt/homebrew/include
 /Users/jeonghyun.lee/emsdk/upstream/emscripten/cache/sysroot/include/fakesdl
 /Users/jeonghyun.lee/emsdk/upstream/emscripten/cache/sysroot/include/compat
 /Users/jeonghyun.lee/emsdk/upstream/lib/clang/20/include
 /Users/jeonghyun.lee/emsdk/upstream/emscripten/cache/sysroot/include
End of search list.
 /Users/jeonghyun.lee/emsdk/upstream/bin/clang --version
 /Users/jeonghyun.lee/emsdk/upstream/bin/wasm-ld -o addrinfo.wasm /var/folders/9l/btg_dvc90w76gpcfdrvdyj8w0000gp/T/emscripten_temp_abek1lax/addrinfo_0.o -L/Users/jeonghyun.lee/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten -lGL-getprocaddr -lal -lhtml5 -lstubs-debug -lc-debug -ldlmalloc -lcompiler_rt -lc++-noexcept -lc++abi-debug-noexcept -lsockets -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr /var/folders/9l/btg_dvc90w76gpcfdrvdyj8w0000gp/T/tmphv7gq7j5libemscripten_js_symbols.so --strip-debug --export=emscripten_stack_get_end --export=emscripten_stack_get_free --export=emscripten_stack_get_base --export=emscripten_stack_get_current --export=emscripten_stack_init --export=_emscripten_stack_alloc --export=__get_temp_ret --export=__set_temp_ret --export=__funcs_on_exit --export=__wasm_call_ctors --export=_emscripten_stack_restore --export-if-defined=__start_em_asm --export-if-defined=__stop_em_asm --export-if-defined=__start_em_lib_deps --export-if-defined=__stop_em_lib_deps --export-if-defined=__start_em_js --export-if-defined=__stop_em_js --export-if-defined=main --export-if-defined=__main_argc_argv --export-if-defined=fflush --export-table -z stack-size=65536 --no-growable-memory --initial-heap=16777216 --no-entry --stack-first --table-base=1
 /Users/jeonghyun.lee/emsdk/upstream/bin/llvm-objcopy addrinfo.wasm addrinfo.wasm --remove-section=.debug* --remove-section=producers
 /Users/jeonghyun.lee/emsdk/upstream/bin/wasm-emscripten-finalize --dyncalls-i64 --pass-arg=legalize-js-interface-exported-helpers addrinfo.wasm -o addrinfo.wasm --detect-features
 /Users/jeonghyun.lee/emsdk/node/18.20.3_64bit/bin/node /Users/jeonghyun.lee/emsdk/upstream/emscripten/src/compiler.mjs /var/folders/9l/btg_dvc90w76gpcfdrvdyj8w0000gp/T/tmp8du07sbd.json
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

void get_addr(void);

int main(void) {
    get_addr();
    return 0;
}

/* Resolve "localhost" and print every address that getaddrinfo returns. */
void get_addr(void) {
    struct addrinfo hints, *res, *result = NULL;
    char addrstr[100];
    void *ptr;

    memset(&hints, 0, sizeof(hints));
    hints.ai_flags    = 1024;  /* raw value from the report; compare with AI_ADDRCONFIG printed below */
    hints.ai_family   = 0;     /* AF_UNSPEC: accept both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_protocol = IPPROTO_TCP;

    printf("ADDRCONFIG %d\n", AI_ADDRCONFIG);

    const char *host = "localhost";

    int status = getaddrinfo(host, NULL, &hints, &result);

    printf("Host: %s\n", host);
    printf("Status? %d\n", status);
    if (status != 0) {
        return;  /* result is not valid when getaddrinfo fails */
    }

    for (res = result; res; res = res->ai_next) {
        switch (res->ai_family) {
        case AF_INET:
            ptr = &((struct sockaddr_in *) res->ai_addr)->sin_addr;
            break;
        case AF_INET6:
            ptr = &((struct sockaddr_in6 *) res->ai_addr)->sin6_addr;
            break;
        default:
            continue;  /* skip families inet_ntop cannot format */
        }
        inet_ntop(res->ai_family, ptr, addrstr, sizeof(addrstr));
        printf("IPv%d address: %s (%s)\n", res->ai_family == PF_INET6 ? 6 : 4,
               addrstr, res->ai_canonname ? res->ai_canonname : "(null)");
    }

    freeaddrinfo(result);
}
sbc100 commented 1 day ago

On the web we cannot actually do DNS, so we have a kind of fake implementation here: https://github.com/emscripten-core/emscripten/blob/ef0efd234a728f5b1dc1131e118566cad97bcdf1/src/library.js#L908-L920

We could add a special case for "localhost" here, and maybe we should. Would you have time to send a PR for that, perhaps?

However, I'm not sure the program in question would get much further in that case, since we don't have any actual TCP/UDP networking support either. What is your program hoping to do with the 127.0.0.1 address?
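
To make the "localhost" special case concrete, here is a minimal sketch in C of a fake name table with such a shortcut. It is only an illustrative model, not Emscripten's code (the real logic is JavaScript in library.js): `struct fake_dns`, `fake_dns_lookup`, and `FAKE_DNS_MAX` are hypothetical names, and the `172.29.x.y` numbering is an assumption chosen to match the `172.29.1.0` seen in the report.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical model of a "fake DNS" table: each new hostname is handed the
 * next address in a private range instead of being resolved for real. */
#define FAKE_DNS_MAX 16

struct fake_dns {
    char names[FAKE_DNS_MAX][64];
    unsigned count;  /* number of names registered so far */
};

/* Writes a dotted-quad string for `name` into `out`.
 * Returns 0 on success, -1 if the table is full or the name is too long. */
int fake_dns_lookup(struct fake_dns *dns, const char *name,
                    char *out, size_t out_len) {
    /* Proposed special case: the loopback name maps to the loopback address. */
    if (strcmp(name, "localhost") == 0) {
        snprintf(out, out_len, "127.0.0.1");
        return 0;
    }

    /* Reuse the id already assigned to this name, if any (ids start at 1). */
    unsigned id = 0;
    for (unsigned i = 0; i < dns->count; i++) {
        if (strcmp(dns->names[i], name) == 0) {
            id = i + 1;
            break;
        }
    }

    /* Otherwise register the name and give it the next id. */
    if (id == 0) {
        if (dns->count == FAKE_DNS_MAX ||
            strlen(name) >= sizeof(dns->names[0])) {
            return -1;
        }
        strcpy(dns->names[dns->count], name);
        id = ++dns->count;
    }

    /* Assumed numbering scheme: the id is spread over the last two octets. */
    snprintf(out, out_len, "172.29.%u.%u", id & 0xffu, (id >> 8) & 0xffu);
    return 0;
}
```

With a zero-initialized `struct fake_dns`, looking up "localhost" yields 127.0.0.1 without touching the table, while any other name gets a stable fake address.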

sbc100 commented 1 day ago

Fix is in #22641
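
For builds that do not yet include that fix, one possible way to sidestep name resolution is to pass a numeric literal with `AI_NUMERICHOST`, which tells getaddrinfo to parse the string rather than look it up. The sketch below uses only standard POSIX calls; `resolve_numeric` is a hypothetical helper name, and the expectation that Emscripten returns numeric literals unchanged is an assumption that has not been verified here.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: parse a numeric IPv4/IPv6 literal through getaddrinfo
 * without any name lookup, and write its canonical text form into `out`.
 * Returns 0 on success or a non-zero getaddrinfo error code. */
int resolve_numeric(const char *literal, char *out, socklen_t out_len) {
    struct addrinfo hints, *result = NULL;

    memset(&hints, 0, sizeof(hints));
    hints.ai_flags    = AI_NUMERICHOST;  /* reject anything that is not a literal */
    hints.ai_family   = AF_UNSPEC;       /* accept IPv4 and IPv6 literals */
    hints.ai_socktype = SOCK_STREAM;

    int status = getaddrinfo(literal, NULL, &hints, &result);
    if (status != 0) {
        return status;
    }

    /* Pick the address field that matches the returned family. */
    const void *addr = NULL;
    if (result->ai_family == AF_INET) {
        addr = &((struct sockaddr_in *) result->ai_addr)->sin_addr;
    } else if (result->ai_family == AF_INET6) {
        addr = &((struct sockaddr_in6 *) result->ai_addr)->sin6_addr;
    }

    int rc = (addr && inet_ntop(result->ai_family, addr, out, out_len))
                 ? 0 : EAI_FAIL;
    freeaddrinfo(result);
    return rc;
}
```

Calling `resolve_numeric("127.0.0.1", buf, sizeof buf)` with a buffer of at least `INET6_ADDRSTRLEN` bytes fills `buf` with the same literal on a native build, whereas a hostname such as "localhost" is rejected with an error code instead of being resolved.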

5d-jh commented 1 day ago

However, I'm not sure the program in question would get much further in that case, since we don't have any actual TCP/UDP networking support either. What is your program hoping to do with the 127.0.0.1 address?

I was trying to use librdkafka on Node.js. For portability, I tried compiling it to WebAssembly instead of building it with node-gyp. I needed 127.0.0.1 partway through testing whether the librdkafka wasm binary sends data to Kafka as expected.

But after hearing that TCP/UDP capabilities are not available, compiling it to wasm seems to be a no-go after all. (Proxying is supported, but it is a huge disadvantage performance-wise.)

Aside from DNS resolution, is my understanding correct?

I also tried the -sPURE_WASI option, but it seems very experimental (I hit a bunch of runtime errors).

sbc100 commented 18 hours ago

Emscripten does not currently support any kind of TCP/UDP sockets on node. However, we do have a feature called NODERAWFS that exposes the raw node filesystem. I imagine this could be extended fairly easily to include real TCP/UDP support under node. Would you be interested in adding such support, perhaps?