Closed by ngxson 6 months ago
Another idea would be to patch the `mmap` function so that it no longer copies the memory.
TODO: also have a look at the wasmfs implementation (although it is unusable for now, because it does not support `MAP_SHARED`): https://github.com/emscripten-core/emscripten/blob/2bc5e3156f07e603bc4f3580cf84c038ea99b2df/system/lib/libc/emscripten_mmap.c#L105
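To make the difference concrete, here is a minimal sketch of a copying versus a non-copying mapping over a file whose bytes already live in the heap. `HeapFile`, `map_with_copy` and `map_in_place` are hypothetical names made up for illustration; this is not Emscripten's actual `mmap` code, only the general shape of the idea.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical in-memory file: its bytes already sit somewhere in the heap.
struct HeapFile {
    std::uint8_t* data;
    std::size_t size;
};

// Copying behaviour: allocate a fresh region and duplicate the bytes.
// While the mapping is alive, the mapped range exists twice in memory.
// The caller releases the result with std::free.
std::uint8_t* map_with_copy(const HeapFile& f, std::size_t offset, std::size_t length) {
    if (offset > f.size || length > f.size - offset) return nullptr;
    auto* out = static_cast<std::uint8_t*>(std::malloc(length));
    if (out == nullptr) return nullptr;
    std::memcpy(out, f.data + offset, length);
    return out;
}

// Non-copying behaviour: hand back a pointer into the existing bytes.
// No extra memory is used, but the file must outlive the mapping and
// writes through the pointer are visible in the file (shared semantics).
std::uint8_t* map_in_place(const HeapFile& f, std::size_t offset, std::size_t length) {
    if (offset > f.size || length > f.size - offset) return nullptr;
    return f.data + offset;
}
```

The trade-off is lifetime: the in-place variant is only safe if the file's backing bytes are never moved or freed while the mapping is in use.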
Seems like heapfs is the best that we can do.
Another idea (a more native implementation) is to mmap directly to a file on disk, but this is not supported by browsers, presumably because it would be a security risk.
The MemFS implementation uses `std::vector`, which is quite memory-intensive: https://github.com/emscripten-core/emscripten/blob/799a1cb35b3c6065ba8b2e519e589944c0057f6d/system/lib/wasmfs/backends/memory_backend.cpp#L16-L27
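One plausible reason a `std::vector`-backed file is memory-hungry is reallocation: each time the vector outgrows its capacity, the old and the new buffer briefly coexist while the contents are copied. The sketch below counts those reallocations when a buffer is filled chunk by chunk. `GrowthStats` and `append_in_chunks` are illustrative names, not part of wasmfs, and the exact growth factor depends on the standard library.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct GrowthStats {
    std::size_t reallocations;   // how many times the backing buffer changed address
    std::size_t final_size;
    std::size_t final_capacity;
};

// Appends `total` zero bytes in pieces of at most `chunk` bytes and records
// how often the vector had to move to a new allocation.
GrowthStats append_in_chunks(std::size_t total, std::size_t chunk, bool reserve_first) {
    GrowthStats stats{0, 0, 0};
    if (chunk == 0) return stats;  // avoid an endless loop on a zero chunk size

    std::vector<std::uint8_t> buf;
    if (reserve_first) buf.reserve(total);  // one allocation up front

    const std::uint8_t* last = buf.data();
    for (std::size_t written = 0; written < total;) {
        const std::size_t n = std::min(chunk, total - written);
        buf.insert(buf.end(), n, std::uint8_t{0});
        written += n;
        if (buf.data() != last) {  // the buffer moved: old and new copies coexisted
            ++stats.reallocations;
            last = buf.data();
        }
    }
    stats.final_size = buf.size();
    stats.final_capacity = buf.capacity();
    return stats;
}
```

Calling `append_in_chunks(1 << 20, 4096, false)` and then the same with `true` shows the difference: without `reserve` the buffer moves several times, with it the buffer never moves. For multi-gigabyte model files, each move means a transient peak of roughly the old size plus the new size.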
A better way to load the model into llama.cpp is to pass the buffer directly from JS to llama.cpp. This likely requires modifying code inside llama.cpp and ggml.
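As a rough sketch of what "passing the buffer directly" could look like on the C++ side, the snippet below accepts a pointer and a length (as JS would supply after writing the bytes into the wasm heap) and inspects the GGUF header in place, without going through a file. `BufferView`, `peek_gguf_header` and `load_model_from_buffer` are hypothetical names; llama.cpp is not assumed to expose such an entry point, and a real loader would do far more than read the header.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Non-owning view over bytes that JS has already placed in the wasm heap.
struct BufferView {
    const std::uint8_t* data;
    std::size_t size;
};

struct GgufHeader {
    bool valid;
    std::uint32_t version;
};

// Reads the first 8 bytes in place: the 4-byte magic "GGUF" followed by a
// little-endian 32-bit version. Nothing is copied out of the buffer.
GgufHeader peek_gguf_header(BufferView buf) {
    GgufHeader h{false, 0};
    if (buf.data == nullptr || buf.size < 8) return h;
    if (std::memcmp(buf.data, "GGUF", 4) != 0) return h;
    h.version = static_cast<std::uint32_t>(buf.data[4]) |
                (static_cast<std::uint32_t>(buf.data[5]) << 8) |
                (static_cast<std::uint32_t>(buf.data[6]) << 16) |
                (static_cast<std::uint32_t>(buf.data[7]) << 24);
    h.valid = true;
    return h;
}

// Hypothetical exported entry point. In an Emscripten build it would be
// exported to JS (for example with EMSCRIPTEN_KEEPALIVE) and called with a
// heap pointer. Returns the GGUF version, or -1 if the buffer is not GGUF.
extern "C" int load_model_from_buffer(const std::uint8_t* data, std::size_t size) {
    const GgufHeader h = peek_gguf_header(BufferView{data, size});
    return h.valid ? static_cast<int>(h.version) : -1;
}
```

The point of the view type is that the loader never owns or duplicates the bytes; JS keeps the single copy alive for as long as the model is in use.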