mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[Question] Cannot compile custom model to work on web browser #2485

Closed lawofcycles closed 3 months ago

lawofcycles commented 5 months ago

❓ General Questions

Hello, I am trying to convert elyza/ELYZA-japanese-Llama-2-13b-fast-instruct, which is based on Llama 2, and compile it to run on wasm, but I cannot get it to work no matter what I do.

I followed the instructions here (https://llm.mlc.ai/docs/deploy/javascript.html#bring-your-own-model-library).

Reproduction Steps

Environment

cat /etc/os-release 
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
python --version
Python 3.11.9
nvidia-smi
Sat Jun  1 16:05:19 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.08             Driver Version: 535.161.08   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A10G                    On  | 00000000:00:1E.0 Off |                    0 |
|  0%   24C    P8              16W / 300W |      0MiB / 23028MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
(mlc-chat-venv) ubuntu@ip-10-0-4-106:~$ 
tvm
```shell
python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))"
USE_NVTX: OFF
USE_GTEST: AUTO
SUMMARIZE: OFF
TVM_DEBUG_WITH_ABI_CHANGE: OFF
USE_IOS_RPC: OFF
USE_MSC: OFF
USE_ETHOSU:
CUDA_VERSION: 12.2
USE_LIBBACKTRACE: AUTO
DLPACK_PATH: 3rdparty/dlpack/include
USE_TENSORRT_CODEGEN: OFF
USE_THRUST: ON
USE_TARGET_ONNX: OFF
USE_AOT_EXECUTOR: ON
BUILD_DUMMY_LIBTVM: OFF
USE_CUDNN: OFF
USE_TENSORRT_RUNTIME: OFF
USE_ARM_COMPUTE_LIB_GRAPH_EXECUTOR: OFF
USE_CCACHE: AUTO
USE_ARM_COMPUTE_LIB: OFF
USE_CPP_RTVM:
USE_OPENCL_GTEST: /path/to/opencl/gtest
TVM_LOG_BEFORE_THROW: OFF
USE_MKL: OFF
USE_PT_TVMDSOOP: OFF
MLIR_VERSION: NOT-FOUND
USE_CLML: OFF
USE_STACKVM_RUNTIME: OFF
USE_GRAPH_EXECUTOR_CUDA_GRAPH: OFF
ROCM_PATH: /opt/rocm
USE_DNNL: OFF
USE_MSCCL: OFF
USE_VITIS_AI: OFF
USE_MLIR: OFF
USE_RCCL: OFF
USE_LLVM: llvm-config --ignore-libllvm --link-static
USE_VERILATOR: OFF
USE_TF_TVMDSOOP: OFF
USE_THREADS: ON
USE_MSVC_MT: OFF
BACKTRACE_ON_SEGFAULT: OFF
USE_GRAPH_EXECUTOR: ON
USE_NCCL: ON
USE_ROCBLAS: OFF
GIT_COMMIT_HASH: 759c916dc845ccf32750c9e585b5ad99d6e2f4e4
USE_VULKAN: ON
USE_RUST_EXT: OFF
USE_CUTLASS: ON
USE_CPP_RPC: OFF
USE_HEXAGON: OFF
USE_CUSTOM_LOGGING: OFF
USE_UMA: OFF
USE_FALLBACK_STL_MAP: OFF
USE_SORT: ON
USE_RTTI: ON
GIT_COMMIT_TIME: 2024-05-24 13:56:06 -0400
USE_HEXAGON_SDK: /path/to/sdk
USE_BLAS: none
USE_ETHOSN: OFF
USE_LIBTORCH: OFF
USE_RANDOM: ON
USE_CUDA: ON
USE_COREML: OFF
USE_AMX: OFF
BUILD_STATIC_RUNTIME: OFF
USE_CMSISNN: OFF
USE_KHRONOS_SPIRV: OFF
USE_CLML_GRAPH_EXECUTOR: OFF
USE_TFLITE: OFF
USE_HEXAGON_GTEST: /path/to/hexagon/gtest
PICOJSON_PATH: 3rdparty/picojson
USE_OPENCL_ENABLE_HOST_PTR: OFF
INSTALL_DEV: OFF
USE_PROFILER: ON
USE_NNPACK: OFF
LLVM_VERSION: 15.0.7
USE_MRVL: OFF
USE_OPENCL: OFF
COMPILER_RT_PATH: 3rdparty/compiler-rt
RANG_PATH: 3rdparty/rang/include
USE_SPIRV_KHR_INTEGER_DOT_PRODUCT: OFF
USE_OPENMP: OFF
USE_BNNS: OFF
USE_FLASHINFER: ON
USE_CUBLAS: ON
USE_METAL: OFF
USE_MICRO_STANDALONE_RUNTIME: OFF
USE_HEXAGON_EXTERNAL_LIBS: OFF
USE_ALTERNATIVE_LINKER: AUTO
USE_BYODT_POSIT: OFF
USE_HEXAGON_RPC: OFF
USE_MICRO: OFF
DMLC_PATH: 3rdparty/dmlc-core/include
INDEX_DEFAULT_I64: ON
USE_RELAY_DEBUG: OFF
USE_RPC: ON
USE_TENSORFLOW_PATH: none
TVM_CLML_VERSION:
USE_MIOPEN: OFF
USE_ROCM: OFF
USE_PAPI: OFF
USE_CURAND: OFF
TVM_CXX_COMPILER_PATH: /opt/rh/gcc-toolset-11/root/usr/bin/c++
HIDE_PRIVATE_SYMBOLS: ON
```
conda
```shell
(mlc-chat-venv) ubuntu@ip-10-0-4-106:~/mlc-llm$ conda list -n mlc-chat-venv
# packages in environment at /home/ubuntu/anaconda3/envs/mlc-chat-venv:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
annotated-types 0.7.0 pypi_0 pypi
anyio 4.3.0 pypi_0 pypi
attrs 23.2.0 pypi_0 pypi
binutils_impl_linux-64 2.40 ha1999f0_1 conda-forge
bzip2 1.0.8 hd590300_5 conda-forge
c-ares 1.28.1 hd590300_0 conda-forge
ca-certificates 2024.2.2 hbcca054_0 conda-forge
certifi 2024.2.2 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
click 8.1.7 pypi_0 pypi
cloudpickle 3.0.0 pypi_0 pypi
cmake 3.29.3 h91dbaaa_0 conda-forge
decorator 5.1.1 pypi_0 pypi
dnspython 2.6.1 pypi_0 pypi
email-validator 2.1.1 pypi_0 pypi
fastapi 0.111.0 pypi_0 pypi
fastapi-cli 0.0.4 pypi_0 pypi
filelock 3.14.0 pypi_0 pypi
fsspec 2024.5.0 pypi_0 pypi
gcc_impl_linux-64 13.2.0 h9eb54c0_7 conda-forge
git 2.45.1 pl5321hef9f9f3_0 conda-forge
h11 0.14.0 pypi_0 pypi
httpcore 1.0.5 pypi_0 pypi
httptools 0.6.1 pypi_0 pypi
httpx 0.27.0 pypi_0 pypi
idna 3.7 pypi_0 pypi
jinja2 3.1.4 pypi_0 pypi
kernel-headers_linux-64 2.6.32 he073ed8_17 conda-forge
keyutils 1.6.1 h166bdaf_0 conda-forge
krb5 1.21.2 h659d440_0 conda-forge
ld_impl_linux-64 2.40 hf3520f5_1 conda-forge
libcurl 8.8.0 hca28451_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 hd590300_2 conda-forge
libexpat 2.6.2 h59595ed_0 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-devel_linux-64 13.2.0 hceb6213_107 conda-forge
libgcc-ng 13.2.0 h77fa898_7 conda-forge
libgomp 13.2.0 h77fa898_7 conda-forge
libiconv 1.17 hd590300_2 conda-forge
libnghttp2 1.58.0 h47da74e_1 conda-forge
libnsl 2.0.1 hd590300_0 conda-forge
libsanitizer 13.2.0 h6ddb7a1_7 conda-forge
libsqlite 3.45.3 h2797004_0 conda-forge
libssh2 1.11.0 h0841786_0 conda-forge
libstdcxx-ng 13.2.0 hc0a3c3a_7 conda-forge
libuuid 2.38.1 h0b41bf4_0 conda-forge
libuv 1.48.0 hd590300_0 conda-forge
libxcrypt 4.4.36 hd590300_1 conda-forge
libzlib 1.2.13 hd590300_5 conda-forge
markdown-it-py 3.0.0 pypi_0 pypi
markupsafe 2.1.5 pypi_0 pypi
mdurl 0.1.2 pypi_0 pypi
ml-dtypes 0.4.0 pypi_0 pypi
mlc-ai-nightly-cu122 0.15.dev380 pypi_0 pypi
mpmath 1.3.0 pypi_0 pypi
ncurses 6.5 h59595ed_0 conda-forge
networkx 3.3 pypi_0 pypi
numpy 2.0.0rc2 pypi_0 pypi
nvidia-cublas-cu12 12.1.3.1 pypi_0 pypi
nvidia-cuda-cupti-cu12 12.1.105 pypi_0 pypi
nvidia-cuda-nvrtc-cu12 12.1.105 pypi_0 pypi
nvidia-cuda-runtime-cu12 12.1.105 pypi_0 pypi
nvidia-cudnn-cu12 8.9.2.26 pypi_0 pypi
nvidia-cufft-cu12 11.0.2.54 pypi_0 pypi
nvidia-curand-cu12 10.3.2.106 pypi_0 pypi
nvidia-cusolver-cu12 11.4.5.107 pypi_0 pypi
nvidia-cusparse-cu12 12.1.0.106 pypi_0 pypi
nvidia-nccl-cu12 2.20.5 pypi_0 pypi
nvidia-nvjitlink-cu12 12.5.40 pypi_0 pypi
nvidia-nvtx-cu12 12.1.105 pypi_0 pypi
openssl 3.3.0 h4ab18f5_3 conda-forge
orjson 3.10.3 pypi_0 pypi
pcre2 10.43 hcad00b1_0 conda-forge
perl 5.32.1 7_hd590300_perl5 conda-forge
pip 24.0 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.43 pypi_0 pypi
psutil 5.9.8 pypi_0 pypi
pydantic 2.7.1 pypi_0 pypi
pydantic-core 2.18.2 pypi_0 pypi
pygments 2.18.0 pypi_0 pypi
python 3.11.9 hb806964_0_cpython conda-forge
python-dotenv 1.0.1 pypi_0 pypi
python-multipart 0.0.9 pypi_0 pypi
pyyaml 6.0.1 pypi_0 pypi
readline 8.2 h8228510_1 conda-forge
requests 2.32.2 pypi_0 pypi
rhash 1.4.4 hd590300_0 conda-forge
rich 13.7.1 pypi_0 pypi
rust 1.77.2 h70c747d_1 conda-forge
rust-std-x86_64-unknown-linux-gnu 1.77.2 h2c6d0dc_1 conda-forge
scipy 1.13.1 pypi_0 pypi
setuptools 70.0.0 pyhd8ed1ab_0 conda-forge
shellingham 1.5.4 pypi_0 pypi
shortuuid 1.0.13 pypi_0 pypi
sniffio 1.3.1 pypi_0 pypi
starlette 0.37.2 pypi_0 pypi
sympy 1.12 pypi_0 pypi
sysroot_linux-64 2.12 he073ed8_17 conda-forge
tk 8.6.13 noxft_h4845f30_101 conda-forge
torch 2.3.0 pypi_0 pypi
tornado 6.4 pypi_0 pypi
tqdm 4.66.4 pypi_0 pypi
triton 2.3.0 pypi_0 pypi
typer 0.12.3 pypi_0 pypi
typing-extensions 4.12.0 pypi_0 pypi
tzdata 2024a h0c530f3_0 conda-forge
ujson 5.10.0 pypi_0 pypi
urllib3 2.2.1 pypi_0 pypi
uvicorn 0.29.0 pypi_0 pypi
uvloop 0.19.0 pypi_0 pypi
watchfiles 0.21.0 pypi_0 pypi
wcwidth 0.2.13 pypi_0 pypi
websockets 12.0 pypi_0 pypi
wheel 0.43.0 pyhd8ed1ab_1 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
zstd 1.5.6 ha6fb4c9_0 conda-forge
```
emcc --version
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.60 (42a6ea2052f19f70d7d994e8c324bcad2f1f8939)
Copyright (C) 2014 the Emscripten authors (see AUTHORS.txt)
This is free and open source software under the MIT license.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Steps to reproduce the behavior:

This is roughly the script I used to compile.

(mlc-chat-venv) ubuntu@ip-10-0-4-106:~/mlc-llm$ pwd
/home/ubuntu/mlc-llm
# Create directory
mkdir -p dist/models && cd dist/models
# Clone HF weights
git lfs install
git clone https://huggingface.co/elyza/ELYZA-japanese-Llama-2-13b-fast-instruct
cd ../..
# Convert weight
mlc_llm convert_weight ./dist/models/ELYZA-japanese-Llama-2-13b-fast-instruct/ \
--quantization q4f16_1 \
-o dist/ELYZA-japanese-Llama-2-13b-fast-instruct-q4f16_1-MLC

# 1. gen_config: generate mlc-chat-config.json and process tokenizers
mlc_llm gen_config ./dist/models/ELYZA-japanese-Llama-2-13b-fast-instruct/ \
--quantization q4f16_1 --conv-template llama-2 \
-o dist/ELYZA-japanese-Llama-2-13b-fast-instruct-q4f16_1-MLC/

# 2. compile: compile model library with specification in mlc-chat-config.json
mlc_llm compile ./dist/ELYZA-japanese-Llama-2-13b-fast-instruct-q4f16_1-MLC/mlc-chat-config.json \
--device webgpu -o dist/libs/ELYZA-japanese-Llama-2-13b-fast-instruct-q4f16_1-webgpu.wasm

These commands complete without errors.

After all is done, the output directory looks as follows.
```shell
(mlc-chat-venv) ubuntu@ip-10-0-4-106:~/mlc-llm$ ls -alF dist/ELYZA-japanese-Llama-2-13b-fast-instruct-q4f16_1-MLC/
total 7224436
drwxrwxr-x 2 ubuntu ubuntu 12288 May 26 21:21 ./
drwxrwxr-x 6 ubuntu ubuntu 4096 Jun 1 13:17 ../
-rw-rw-r-- 1 ubuntu ubuntu 1961 May 26 21:21 mlc-chat-config.json
-rw-rw-r-- 1 ubuntu ubuntu 191846 May 26 21:09 ndarray-cache.json
-rw-rw-r-- 1 ubuntu ubuntu 114127360 May 26 21:06 params_shard_0.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:06 params_shard_1.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:06 params_shard_10.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:07 params_shard_100.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:09 params_shard_101.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:09 params_shard_102.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_103.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:09 params_shard_104.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:09 params_shard_105.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:09 params_shard_106.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_107.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:09 params_shard_108.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:09 params_shard_109.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:06 params_shard_11.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:09 params_shard_110.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_111.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:09 params_shard_112.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:09 params_shard_113.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:09 params_shard_114.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_115.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:09 params_shard_116.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:09 params_shard_117.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:09 params_shard_118.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_119.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:06 params_shard_12.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:09 params_shard_120.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:09 params_shard_121.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:09 params_shard_122.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_123.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:09 params_shard_124.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:09 params_shard_125.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:09 params_shard_126.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_127.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:09 params_shard_128.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:09 params_shard_129.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:06 params_shard_13.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:09 params_shard_130.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_131.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:09 params_shard_132.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:09 params_shard_133.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:09 params_shard_134.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_135.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:09 params_shard_136.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:09 params_shard_137.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:09 params_shard_138.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_139.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:06 params_shard_14.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:09 params_shard_140.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:09 params_shard_141.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:09 params_shard_142.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_143.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:09 params_shard_144.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:09 params_shard_145.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:09 params_shard_146.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_147.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:09 params_shard_148.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:09 params_shard_149.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:06 params_shard_15.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:09 params_shard_150.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_151.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:09 params_shard_152.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:09 params_shard_153.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:09 params_shard_154.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_155.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:09 params_shard_156.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:09 params_shard_157.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:09 params_shard_158.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_159.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:06 params_shard_16.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:09 params_shard_160.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:09 params_shard_161.bin
-rw-rw-r-- 1 ubuntu ubuntu 32768000 May 26 21:09 params_shard_162.bin
-rw-rw-r-- 1 ubuntu ubuntu 1638400 May 26 21:09 params_shard_163.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:06 params_shard_17.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:06 params_shard_18.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:06 params_shard_19.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:06 params_shard_2.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:06 params_shard_20.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:06 params_shard_21.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:06 params_shard_22.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:06 params_shard_23.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:06 params_shard_24.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:06 params_shard_25.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:06 params_shard_26.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:06 params_shard_27.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:06 params_shard_28.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:06 params_shard_29.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:06 params_shard_3.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:06 params_shard_30.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:06 params_shard_31.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:06 params_shard_32.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:06 params_shard_33.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:06 params_shard_34.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:06 params_shard_35.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:06 params_shard_36.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:06 params_shard_37.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:06 params_shard_38.bin
-rw-rw-r-- 1 ubuntu ubuntu 114127360 May 26 21:07 params_shard_39.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:06 params_shard_4.bin
-rw-rw-r-- 1 ubuntu ubuntu 28528640 May 26 21:07 params_shard_40.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:07 params_shard_41.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:07 params_shard_42.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:07 params_shard_43.bin
-rw-rw-r-- 1 ubuntu ubuntu 32472640 May 26 21:07 params_shard_44.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:07 params_shard_45.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:07 params_shard_46.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:07 params_shard_47.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:07 params_shard_48.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:07 params_shard_49.bin
-rw-rw-r-- 1 ubuntu ubuntu 31991360 May 26 21:06 params_shard_5.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:07 params_shard_50.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:07 params_shard_51.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:07 params_shard_52.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:07 params_shard_53.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:07 params_shard_54.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:07 params_shard_55.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:07 params_shard_56.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:07 params_shard_57.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:07 params_shard_58.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:07 params_shard_59.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:06 params_shard_6.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:07 params_shard_60.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:07 params_shard_61.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:07 params_shard_62.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:07 params_shard_63.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:07 params_shard_64.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:07 params_shard_65.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:07 params_shard_66.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:07 params_shard_67.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:07 params_shard_68.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:07 params_shard_69.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:06 params_shard_7.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:07 params_shard_70.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:07 params_shard_71.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:07 params_shard_72.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:07 params_shard_73.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:07 params_shard_74.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:07 params_shard_75.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:07 params_shard_76.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:07 params_shard_77.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:07 params_shard_78.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:07 params_shard_79.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:06 params_shard_8.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:07 params_shard_80.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:07 params_shard_81.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:07 params_shard_82.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:07 params_shard_83.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:07 params_shard_84.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:07 params_shard_85.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:07 params_shard_86.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:07 params_shard_87.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:07 params_shard_88.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:07 params_shard_89.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:06 params_shard_9.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:07 params_shard_90.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:07 params_shard_91.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:07 params_shard_92.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:07 params_shard_93.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:07 params_shard_94.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:07 params_shard_95.bin
-rw-rw-r-- 1 ubuntu ubuntu 32952320 May 26 21:07 params_shard_96.bin
-rw-rw-r-- 1 ubuntu ubuntu 35389440 May 26 21:07 params_shard_97.bin
-rw-rw-r-- 1 ubuntu ubuntu 70778880 May 26 21:07 params_shard_98.bin
-rw-rw-r-- 1 ubuntu ubuntu 39321600 May 26 21:07 params_shard_99.bin
-rw-rw-r-- 1 ubuntu ubuntu 2398780 May 26 21:12 tokenizer.json
-rw-rw-r-- 1 ubuntu ubuntu 705214 May 26 21:12 tokenizer.model
-rw-rw-r-- 1 ubuntu ubuntu 983 May 26 21:12 tokenizer_config.json
(mlc-chat-venv) ubuntu@ip-10-0-4-106:~/mlc-llm$ ls -alF dist/libs/ELYZA-japanese-Llama-2-13b-fast-instruct-q4f16_1-webgpu.wasm
-rwxrwxr-x 1 ubuntu ubuntu 5370482 May 26 21:31 dist/libs/ELYZA-japanese-Llama-2-13b-fast-instruct-q4f16_1-webgpu.wasm*
```

Then I tried to run the model in WebLLM’s get-started example:

import * as webllm from "@mlc-ai/web-llm";

function setLabel(id: string, text: string) {
  const label = document.getElementById(id);
  if (label == null) {
    throw Error("Cannot find label " + id);
  }
  label.innerText = text;
}

async function main() {
  const initProgressCallback = (report: webllm.InitProgressReport) => {
    setLabel("init-label", report.text);
  };
  const appConfig: webllm.AppConfig = {
    model_list: [
      {
        "model_url": "https://huggingface.co/bassari/ELYZA-japanese-Llama-2-13b-fast-instruct-q4f16_1-MLC/resolve/main/",
        "model_id": "ELYZA-japanese-Llama-2-13b-fast-instruct-q4f16_1-MLC",
        "model_lib_url": "https://raw.githubusercontent.com/lawofcycles/web-llm-original-models/main/ELYZA-japanese-Llama-2-13b-fast-instruct-q4f16_1-webgpu.wasm",
        "required_features": ["shader-f16"],
      },
    ]
  };
  const selectedModel = "ELYZA-japanese-Llama-2-13b-fast-instruct-q4f16_1-MLC";
  const engine: webllm.MLCEngineInterface = await webllm.CreateMLCEngine(
    selectedModel,
    { appConfig: appConfig, initProgressCallback: initProgressCallback }
  );

  const reply0 = await engine.chat.completions.create({
    messages: [
      { "role": "user", "content": "List three US states." },
    ],
    // below configurations are all optional
    n: 3,
    temperature: 1.5,
    max_gen_len: 256,
    // 46510 and 7188 are "California", and 8421 and 51325 are "Texas" in Llama-3-8B-Instruct
    // So we would have a higher chance of seeing the latter two, but never the first in the answer
    logit_bias: {
      "46510": -100,
      "7188": -100,
      "8421": 5,
      "51325": 5,
    },
    logprobs: true,
    top_logprobs: 2,
  });
  console.log(reply0);
  console.log(await engine.runtimeStatsText());

  // To change model, either create a new engine via `CreateMLCEngine()`, or call `engine.reload(modelId)`
}

main();
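
Before loading the engine, it can help to sanity-check the custom model entry, since a mismatch between `model_id` and the selected model or a mistyped library URL is easy to miss. The following is a minimal sketch under stated assumptions: `ModelRecordLike` and `checkModelRecord` are hypothetical names introduced here, and the checks are illustrative choices, not rules documented or enforced by WebLLM.

```typescript
// Hypothetical shape mirroring the fields used in the model_list entry above.
// This is NOT WebLLM's own type definition.
interface ModelRecordLike {
  model_url: string;
  model_id: string;
  model_lib_url: string;
  required_features?: string[];
}

// Returns a list of human-readable problems; an empty list means the
// illustrative checks passed. The checks are assumptions made for this sketch.
function checkModelRecord(record: ModelRecordLike, selectedModel: string): string[] {
  const problems: string[] = [];
  if (record.model_id !== selectedModel) {
    problems.push(`model_id "${record.model_id}" does not match "${selectedModel}"`);
  }
  if (!record.model_url.startsWith("https://")) {
    problems.push("model_url is not an https URL");
  }
  if (!record.model_lib_url.startsWith("https://")) {
    problems.push("model_lib_url is not an https URL");
  }
  if (!record.model_lib_url.endsWith(".wasm")) {
    problems.push("model_lib_url does not point at a .wasm file");
  }
  return problems;
}

// The entry from the post, checked against the selected model id.
const elyzaRecord: ModelRecordLike = {
  model_url: "https://huggingface.co/bassari/ELYZA-japanese-Llama-2-13b-fast-instruct-q4f16_1-MLC/resolve/main/",
  model_id: "ELYZA-japanese-Llama-2-13b-fast-instruct-q4f16_1-MLC",
  model_lib_url: "https://raw.githubusercontent.com/lawofcycles/web-llm-original-models/main/ELYZA-japanese-Llama-2-13b-fast-instruct-q4f16_1-webgpu.wasm",
  required_features: ["shader-f16"],
};
const found = checkModelRecord(elyzaRecord, "ELYZA-japanese-Llama-2-13b-fast-instruct-q4f16_1-MLC");
console.log(found.length === 0 ? "model record looks consistent" : found.join("\n"));
```

These checks only cover the shape of the entry; they say nothing about whether the compiled library matches the runtime version.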

After launch, an error related to TVM occurs:

[Screenshot 2024-06-02 1 02 01]
/home/ubuntu/anaconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/tvm/web/..//include/tvm/runtime/packed_func.h:1908: Function runtime.TVMArrayCreateView(0: runtime.NDArray, 1: runtime.ShapeTuple, 2: DLDataType, 3: uint64_t) -> runtime.NDArray expects 4 arguments, but 2 were provided.
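
The message reports an arity mismatch: `runtime.TVMArrayCreateView` expects 4 arguments but received 2, which is the kind of symptom one might see when the compiled model library and the runtime calling it come from different TVM revisions. As a rough sketch, an app could detect this pattern and print a more actionable hint. `parseArityMismatch` and `ArityMismatch` are hypothetical helpers written for this illustration, not part of WebLLM or TVM, and the regular expression is only known to match the message format shown above.

```typescript
// Hypothetical helper types and functions; not part of WebLLM or TVM.
interface ArityMismatch {
  functionName: string;
  expected: number;
  provided: number;
}

// Matches messages of the form
// "Function <name>(<signature>) -> <ret> expects N arguments, but M were provided."
const ARITY_PATTERN =
  /Function\s+([\w.]+)\(.*\)\s*->\s*[\w.]+\s+expects\s+(\d+)\s+arguments,\s+but\s+(\d+)\s+were\s+provided/;

function parseArityMismatch(message: string): ArityMismatch | null {
  const match = ARITY_PATTERN.exec(message);
  if (match === null) {
    return null;
  }
  return {
    functionName: match[1],
    expected: Number(match[2]),
    provided: Number(match[3]),
  };
}

// The tail of the error message from the post.
const sample =
  "packed_func.h:1908: Function runtime.TVMArrayCreateView(0: runtime.NDArray, " +
  "1: runtime.ShapeTuple, 2: DLDataType, 3: uint64_t) -> runtime.NDArray " +
  "expects 4 arguments, but 2 were provided.";

const parsed = parseArityMismatch(sample);
if (parsed !== null) {
  console.warn(
    `${parsed.functionName} expects ${parsed.expected} arguments but got ${parsed.provided}; ` +
      "check that the model library and the runtime were built from matching TVM versions.",
  );
}
```

In an application, the same function could be called on the message of an error caught around engine creation.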

Could someone please take a look and let me know if I'm overlooking anything or if there's something else I should do?

tqchen commented 3 months ago

This error is mainly due to the TVM version. Please try updating TVM to the latest nightly build; that should resolve it.