emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler

WasmFS: Original file case is not preserved when using with CASE_INSENSITIVE_FS #17993

Closed mere-human closed 1 year ago

mere-human commented 1 year ago

When creating a file or a directory, its letter case is not preserved for subsequent calls such as readdir() or getcwd(). This works as expected when WASMFS is not used.

Version of emscripten/emsdk:

emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.23 (5ae63ce7dd955df449cb419bfe5afc51d1bd57f2)
clang version 16.0.0 (https://github.com/llvm/llvm-project 8b587113b746f31b63fd6473083df78cef30a72e)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /Users/user/src/emsdk/upstream/bin

Failing code:

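#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

// Print the directories and regular files found in `dname`.
// Named list_dir so that it does not overload the libc readdir().
void list_dir(const char *dname)
{
    printf("Files in '%s': ", dname);
    DIR *d = opendir(dname);
    if (d)
    {
        struct dirent *dir;
        while ((dir = readdir(d)) != NULL)
        {
            // d_type holds a single DT_* value, not a set of bit flags.
            if (dir->d_type == DT_DIR || dir->d_type == DT_REG)
                printf("%s ", dir->d_name);
        }
        closedir(d);
    }
    printf("\n");
}

int main()
{
    // Create a file with a mixed-case name, then list the directory.
    FILE *fp = fopen("Test.txt", "wt");
    assert(fp);
    fclose(fp);
    list_dir(".");
    // Create and enter a mixed-case directory, then print the cwd.
    assert(mkdir("Subdir", 0777) == 0);
    assert(chdir("Subdir") == 0);
    char buf[200] = {};
    assert(getcwd(buf, sizeof(buf)) != nullptr);
    printf("cwd: %s\n", buf);
    return 0;
}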

Full link command and output with -v appended:

rm -rf build
mkdir -p build
export FLAG_WASMFS=-sWASMFS=1
em++ -v -sCASE_INSENSITIVE_FS=1 $FLAG_WASMFS -std=c++11 main.cpp -o build/common.html
ret=$?
if [ $ret -eq 0 ]; then
    echo "ok"
    emrun build/common.html --browser chrome
fi;

Build output:

```
"/Users/user/src/emsdk/upstream/bin/clang" --version
"/Users/user/src/emsdk/upstream/bin/clang++" -target wasm32-unknown-emscripten -fignore-exceptions -fvisibility=default -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr -DEMSCRIPTEN -I/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/SDL --sysroot=/Users/user/src/emsdk/upstream/emscripten/cache/sysroot -Xclang -iwithsysroot/include/compat -v -std=c++11 main.cpp -c -o /var/folders/c_/szdjrjm12g5_9vh8ybcndhmm0000gp/T/emscripten_temp_62rwmxdh/main_0.o
clang version 16.0.0 (https://github.com/llvm/llvm-project 8b587113b746f31b63fd6473083df78cef30a72e)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /Users/user/src/emsdk/upstream/bin
(in-process)
"/Users/user/src/emsdk/upstream/bin/clang-16" -cc1 -triple wasm32-unknown-emscripten -emit-obj -mrelax-all --mrelax-relocations -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name main.cpp -mrelocation-model static -mframe-pointer=none -ffp-contract=on -fno-rounding-math -mconstructor-aliases -target-cpu generic -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -v -fcoverage-compilation-dir=/Users/user/src/emdemo/common -resource-dir /Users/user/src/emsdk/upstream/lib/clang/16.0.0 -D EMSCRIPTEN -I /Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/SDL -isysroot /Users/user/src/emsdk/upstream/emscripten/cache/sysroot -internal-isystem /Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten/c++/v1 -internal-isystem /Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/c++/v1 -internal-isystem /Users/user/src/emsdk/upstream/lib/clang/16.0.0/include -internal-isystem /Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten -internal-isystem /Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include -std=c++11 -fdeprecated-macro -fdebug-compilation-dir=/Users/user/src/emdemo/common -ferror-limit 19 -fvisibility=default -fgnuc-version=4.2.1 -fcxx-exceptions -fignore-exceptions -fexceptions -fcolor-diagnostics -iwithsysroot/include/compat -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr -o /var/folders/c_/szdjrjm12g5_9vh8ybcndhmm0000gp/T/emscripten_temp_62rwmxdh/main_0.o -x c++ main.cpp
clang -cc1 version 16.0.0 based upon LLVM 16.0.0git default target x86_64-apple-darwin21.6.0
ignoring nonexistent directory "/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten/c++/v1"
ignoring nonexistent directory "/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten"
#include "..." search starts here:
#include <...> search starts here:
/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/SDL
/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/compat
/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/c++/v1
/Users/user/src/emsdk/upstream/lib/clang/16.0.0/include
/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include
End of search list.
"/Users/user/src/emsdk/upstream/bin/wasm-ld" -o build/common.wasm /var/folders/c_/szdjrjm12g5_9vh8ybcndhmm0000gp/T/emscripten_temp_62rwmxdh/main_0.o -L/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten --whole-archive -lwasmfs-debug-icase --no-whole-archive -lGL -lal -lhtml5 -lstubs-debug -lnoexit -lc-debug -ldlmalloc -lcompiler_rt -lc++-noexcept -lc++abi-noexcept -lsockets -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --import-undefined --strip-debug --export-if-defined=main --export-if-defined=__start_em_asm --export-if-defined=__stop_em_asm --export-if-defined=__start_em_js --export-if-defined=__stop_em_js --export-if-defined=__main_argc_argv --export-if-defined=fflush --export=emscripten_stack_get_end --export=emscripten_stack_get_free --export=emscripten_stack_get_base --export=emscripten_stack_init --export=_wasmfs_read_file --export=stackSave --export=stackRestore --export=stackAlloc --export=__wasm_call_ctors --export=__errno_location --export=__get_temp_ret --export=__set_temp_ret --export=malloc --export=free --export-table -z stack-size=5242880 --initial-memory=16777216 --no-entry --max-memory=16777216 --global-base=1024
"/Users/user/src/emsdk/upstream/bin/wasm-emscripten-finalize" --dyncalls-i64 --pass-arg=legalize-js-interface-exported-helpers build/common.wasm -o build/common.wasm --detect-features
"/Users/user/src/emsdk/node/14.18.2_64bit/bin/node" /Users/user/src/emsdk/upstream/emscripten/src/compiler.js /var/folders/c_/szdjrjm12g5_9vh8ybcndhmm0000gp/T/tmp97ua1gat.json
user@C02F71SCMD6R common % source build.sh
"/Users/user/src/emsdk/upstream/bin/clang" --version
"/Users/user/src/emsdk/upstream/bin/clang++" -target wasm32-unknown-emscripten -fignore-exceptions -fvisibility=default -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr -DEMSCRIPTEN -I/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/SDL --sysroot=/Users/user/src/emsdk/upstream/emscripten/cache/sysroot -Xclang -iwithsysroot/include/compat -v -std=c++11 main.cpp -c -o /var/folders/c_/szdjrjm12g5_9vh8ybcndhmm0000gp/T/emscripten_temp_ogj1duyy/main_0.o
clang version 16.0.0 (https://github.com/llvm/llvm-project 8b587113b746f31b63fd6473083df78cef30a72e)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /Users/user/src/emsdk/upstream/bin
(in-process)
"/Users/user/src/emsdk/upstream/bin/clang-16" -cc1 -triple wasm32-unknown-emscripten -emit-obj -mrelax-all --mrelax-relocations -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name main.cpp -mrelocation-model static -mframe-pointer=none -ffp-contract=on -fno-rounding-math -mconstructor-aliases -target-cpu generic -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -v -fcoverage-compilation-dir=/Users/user/src/emdemo/common -resource-dir /Users/user/src/emsdk/upstream/lib/clang/16.0.0 -D EMSCRIPTEN -I /Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/SDL -isysroot /Users/user/src/emsdk/upstream/emscripten/cache/sysroot -internal-isystem /Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten/c++/v1 -internal-isystem /Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/c++/v1 -internal-isystem /Users/user/src/emsdk/upstream/lib/clang/16.0.0/include -internal-isystem /Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten -internal-isystem /Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include -std=c++11 -fdeprecated-macro -fdebug-compilation-dir=/Users/user/src/emdemo/common -ferror-limit 19 -fvisibility=default -fgnuc-version=4.2.1 -fcxx-exceptions -fignore-exceptions -fexceptions -fcolor-diagnostics -iwithsysroot/include/compat -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr -o /var/folders/c_/szdjrjm12g5_9vh8ybcndhmm0000gp/T/emscripten_temp_ogj1duyy/main_0.o -x c++ main.cpp
clang -cc1 version 16.0.0 based upon LLVM 16.0.0git default target x86_64-apple-darwin21.6.0
ignoring nonexistent directory "/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten/c++/v1"
ignoring nonexistent directory "/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/wasm32-emscripten"
#include "..." search starts here:
#include <...> search starts here:
/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/SDL
/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/compat
/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include/c++/v1
/Users/user/src/emsdk/upstream/lib/clang/16.0.0/include
/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/include
End of search list.
"/Users/user/src/emsdk/upstream/bin/wasm-ld" -o build/common.wasm /var/folders/c_/szdjrjm12g5_9vh8ybcndhmm0000gp/T/emscripten_temp_ogj1duyy/main_0.o -L/Users/user/src/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten --whole-archive -lwasmfs-debug-icase --no-whole-archive -lGL -lal -lhtml5 -lstubs-debug -lnoexit -lc-debug -ldlmalloc -lcompiler_rt -lc++-noexcept -lc++abi-noexcept -lsockets -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --import-undefined --strip-debug --export-if-defined=main --export-if-defined=__start_em_asm --export-if-defined=__stop_em_asm --export-if-defined=__start_em_js --export-if-defined=__stop_em_js --export-if-defined=__main_argc_argv --export-if-defined=fflush --export=emscripten_stack_get_end --export=emscripten_stack_get_free --export=emscripten_stack_get_base --export=emscripten_stack_init --export=_wasmfs_read_file --export=stackSave --export=stackRestore --export=stackAlloc --export=__wasm_call_ctors --export=__errno_location --export=__get_temp_ret --export=__set_temp_ret --export=malloc --export=free --export-table -z stack-size=5242880 --initial-memory=16777216 --no-entry --max-memory=16777216 --global-base=1024
"/Users/user/src/emsdk/upstream/bin/wasm-emscripten-finalize" --dyncalls-i64 --pass-arg=legalize-js-interface-exported-helpers build/common.wasm -o build/common.wasm --detect-features
"/Users/user/src/emsdk/node/14.18.2_64bit/bin/node" /Users/user/src/emsdk/upstream/emscripten/src/compiler.js /var/folders/c_/szdjrjm12g5_9vh8ybcndhmm0000gp/T/tmptsr6qoom.json
"/Users/user/src/emsdk/upstream/bin/llvm-objcopy" build/common.wasm build/common.wasm --remove-section=.debug* --remove-section=producers
"/Users/user/src/emsdk/node/14.18.2_64bit/bin/node" /Users/user/src/emsdk/upstream/emscripten/tools/preprocessor.js /var/folders/c_/szdjrjm12g5_9vh8ybcndhmm0000gp/T/emscripten_temp_ogj1duyy/settings.js shell.html
ok
```

Program output:

Files in '.': . .. dev tmp test.txt 
cwd: /subdir

Expected output (when building without WASMFS):

Files in '.': . .. tmp home dev proc Test.txt 
cwd: /Subdir
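
The difference between the two outputs can also be expressed as a self-checking helper instead of a visual comparison. The following is only a sketch: `listingPreservesCase` and `checkCurrentDirectory` are hypothetical names introduced here for illustration, and only standard POSIX and C library calls are used. It assumes no other file with a name differing only by case already exists in the directory.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>
#include <dirent.h>

// Hypothetical helper (not part of Emscripten): creates `name` inside
// `dirPath` and reports whether a directory listing returns it with
// exactly the same spelling, byte for byte.
inline bool listingPreservesCase(const std::string& dirPath,
                                 const std::string& name) {
  const std::string path = dirPath + "/" + name;
  FILE* fp = std::fopen(path.c_str(), "w");
  if (!fp) return false;
  std::fclose(fp);

  bool found = false;
  if (DIR* d = opendir(dirPath.c_str())) {
    while (struct dirent* entry = readdir(d)) {
      // strcmp is case-sensitive, so "test.txt" does not match "Test.txt".
      if (std::strcmp(entry->d_name, name.c_str()) == 0) {
        found = true;
        break;
      }
    }
    closedir(d);
  }
  std::remove(path.c_str());  // clean up the temporary file
  return found;
}

// Usage example mirroring the report: check the current directory.
inline void checkCurrentDirectory() {
  assert(listingPreservesCase(".", "Test.txt"));
}
```

With the program output shown above (`test.txt` listed instead of `Test.txt`), the assertion in `checkCurrentDirectory` would be expected to fail.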
mere-human commented 1 year ago

Related: https://github.com/emscripten-core/emscripten/issues/17079

mere-human commented 1 year ago

Hi @tlively,

Which approach do you think is best for solving this?

  1. Don't inherit IgnoreCaseDirectory from MemoryDirectory. Use similar logic, but with a ChildEntry that has an extra member for the original name. Problem: the MemoryDirectory code is not reused. This could be partially solved by extracting some methods.
  2. Add a map to IgnoreCaseDirectory that maps each normalized name to its original name. Problem: an extra 48 bytes for each directory. Draft is here: https://github.com/emscripten-core/emscripten/pull/18005
  3. Reuse MemoryDirectory::entries (MemoryDirectory being the base class of IgnoreCaseDirectory) to store the original names. Problems:
    1. Fragile code. Every Directory function needs to be overridden, and there is no way to enforce an override for methods added later.
    2. Every operation on MemoryDirectory::entries will require a case-insensitive comparison, which may be slower.
  4. Add an extra std::vector<std::string> member to IgnoreCaseDirectory with the same size and indexing as entries, and use it to store the original names. Problem: complicated logic. Draft is here: https://github.com/emscripten-core/emscripten/compare/main...mere-human:emscripten:wasmfs_icase_preserve2

Any ideas?
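
For illustration, option (1) could look roughly like the sketch below. This is not the WasmFS code: `IgnoreCaseDirectorySketch`, `normalize`, `selfCheck`, and the field names are hypothetical, the entry type only borrows the ChildEntry name from the list above, and the case folding is assumed to be ASCII-only.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical entry type: the lookup key is the case-folded name, while
// originalName keeps the spelling that was used at creation time.
struct ChildEntry {
  std::string normalizedName;
  std::string originalName;
};

// ASCII-only case folding (an assumption made for this sketch).
inline std::string normalize(const std::string& name) {
  std::string out = name;
  std::transform(out.begin(), out.end(), out.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return out;
}

// Illustrative standalone directory that does not derive from MemoryDirectory.
class IgnoreCaseDirectorySketch {
  std::vector<ChildEntry> entries;

  std::vector<ChildEntry>::const_iterator find(const std::string& name) const {
    const std::string key = normalize(name);
    return std::find_if(
      entries.begin(), entries.end(),
      [&](const ChildEntry& e) { return e.normalizedName == key; });
  }

public:
  // Returns false if a name differing only by case already exists.
  bool insert(const std::string& name) {
    if (find(name) != entries.end()) return false;
    entries.push_back({normalize(name), name});
    return true;
  }

  // Case-insensitive lookup; on success, reports the original spelling.
  bool lookup(const std::string& name, std::string& original) const {
    auto it = find(name);
    if (it == entries.end()) return false;
    original = it->originalName;
    return true;
  }

  // What a readdir()-style listing would report: names as created.
  std::vector<std::string> list() const {
    std::vector<std::string> names;
    for (const auto& e : entries) names.push_back(e.originalName);
    return names;
  }
};

// Usage example: create "Test.txt", then look it up with different casing.
inline void selfCheck() {
  IgnoreCaseDirectorySketch dir;
  assert(dir.insert("Test.txt"));
  assert(!dir.insert("TEST.TXT"));  // same entry, different case
  std::string original;
  assert(dir.lookup("test.TXT", original));
  assert(original == "Test.txt");
}
```

The trade-off named in option (1) is visible here: the lookup and insert logic is duplicated rather than inherited, but every entry carries its original name without a second container to keep in sync.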

tlively commented 1 year ago

My initial thought is that option (1) seems the simplest. There is definitely some duplicated code with that option, but it has the fewest interdependencies with the rest of the code, so I think it will be the easiest to understand and maintain in the end. WDYT?

mere-human commented 1 year ago

Ok, thanks. I thought the same. That's why I put it first. I'll prepare a new PR then.