mlc-ai / llm-perf-bench

[bug] - crashes when doing build following standard ROCM instructions, related to batching code #32

Closed Sing-Li closed 10 months ago

Sing-Li commented 10 months ago

To reproduce:

On a supported AMD machine, follow the ROCm instructions up to the step that builds the "lib". The following crash occurs:

(python311) root@rumlinux:/workspace# rm -rf $PATH_TEST && mkdir $PATH_TEST && rm -rf $PATH_COMPILE && mkdir $PATH_COMPILE && ln -s ${WEIGHT_PATH} ${PATH_TEST}/params && cp $MODEL_CONFIG $PATH_COMPILE/config.json
(python311) root@rumlinux:/workspace# python -m mlc_llm.build \
        --model $PATH_COMPILE \
        --artifact-path $PATH_COMPILE \
        --quantization $QUANTIZATION \
        --max-seq-len 2048 \
        --num-shards $NUM_SHARDS \
        --target rocm --build-model-only
Traceback (most recent call last):
  File "<frozen runpy>", line 189, in _run_module_as_main
  File "<frozen runpy>", line 112, in _get_module_details
  File "/mlc_llm/mlc_llm/__init__.py", line 6, in <module>
    from . import core
  File "/mlc_llm/mlc_llm/core.py", line 19, in <module>
    from mlc_llm.relax_model import (
  File "/mlc_llm/mlc_llm/relax_model/llama_batched_vllm.py", line 7, in <module>
    from tvm.relax.op.nn import attention_var_len
ImportError: cannot import name 'attention_var_len' from 'tvm.relax.op.nn' (/root/micromamba/envs/python311/lib/python3.11/site-packages/tvm/relax/op/nn/__init__.py)

junrushao commented 10 months ago

https://github.com/mlc-ai/mlc-llm/pull/1134

masahi commented 10 months ago

attention_var_len is supposed to be imported from https://github.com/apache/tvm/blob/unity/python/tvm/relax/op/nn/__init__.py#L21

Perhaps the TVM version on docker is not the latest?
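
One way to test that hypothesis is to check whether the installed TVM build actually exports the symbol before running the build. The sketch below is only an illustration: `has_symbol` is a hypothetical helper name, not part of TVM or MLC, and the module and symbol names are taken from the traceback above.

```python
import importlib


def has_symbol(module_name: str, symbol: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `symbol`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package (or one of its parents) is not importable at all.
        return False
    return hasattr(module, symbol)


if __name__ == "__main__":
    # Module and symbol names copied from the ImportError in this issue.
    available = has_symbol("tvm.relax.op.nn", "attention_var_len")
    print("attention_var_len available:", available)
```

If this prints `False` on a fresh container, the installed nightly most likely predates the change that added the operator.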

Sing-Li commented 10 months ago

The Docker image has no "embedded" TVM. Every time it runs, it just pip-installs the latest nightly, so the problem is likely in the nightly. See this line:

https://github.com/mlc-ai/llm-perf-bench/blob/8df4c5f8d6c0010e706412dfdbc9e5ccbc9a4fab/docker/Dockerfile.rocm57.mlc#L20

mjp0 commented 10 months ago

For me, this crash happens when I try to verify the installation with python3 -m mlc_llm.build --help. In fact, I can't seem to run anything.

junrushao commented 10 months ago

Thanks @masahi for the quick response! I will do a sync between apache/tvm and mlc-ai/relax tonight.

junrushao commented 10 months ago

Sync complete. It should go into the nightly today. Please check back tomorrow :)

Sing-Li commented 10 months ago

Verified that the original problem is fixed by the nightly. The build completed successfully.

However, when running the actual MLC benchmark, it crashes:

# python -m mlc_chat.cli.benchmark \
        --model ${PATH_TEST}/params \
        --device "rocm:0" \
        --prompt "What is the meaning of life?" \
        --generate-length 256
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/root/micromamba/envs/python311/lib/python3.11/site-packages/mlc_chat/cli/benchmark.py", line 72, in <module>
    main()
  File "/root/micromamba/envs/python311/lib/python3.11/site-packages/mlc_chat/cli/benchmark.py", line 58, in main
    chat_module = ChatModule(
                  ^^^^^^^^^^^
  File "/root/micromamba/envs/python311/lib/python3.11/site-packages/mlc_chat/chat_module.py", line 737, in __init__
    fcreate_chat_mod = tvm.get_global_func("mlc.llm_chat_create")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/micromamba/envs/python311/lib/python3.11/site-packages/tvm/_ffi/registry.py", line 234, in get_global_func
    return _get_global_func(name, allow_missing)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "tvm/_ffi/_cython/./packed_func.pxi", line 345, in tvm._ffi._cy3.core._get_global_func
ValueError: Cannot find global function mlc.llm_chat_create

junrushao commented 10 months ago

I got a similar report from Prakalp. Looking into it now.

junrushao commented 10 months ago

Bugfix: https://github.com/mlc-ai/mlc-llm/pull/1167

This is because libmlc_llm.so is not yet loaded into memory when the benchmark runs.
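
The failure mode can be illustrated with a toy model. This is not TVM's implementation, and every name in it (`FunctionRegistry`, `load_library`) is a hypothetical stand-in. It only shows the ordering problem: a function registered by a library's initializers cannot be found until that library has been loaded.

```python
class FunctionRegistry:
    """Toy process-wide registry of named functions (NOT TVM's actual code)."""

    def __init__(self):
        self._funcs = {}

    def register(self, name, fn):
        """Called by a library's initializers when that library is loaded."""
        self._funcs[name] = fn

    def get(self, name):
        """Look up a function by name, failing like the traceback above."""
        if name not in self._funcs:
            raise ValueError(f"Cannot find global function {name}")
        return self._funcs[name]


def load_library(registry):
    """Stand-in for loading libmlc_llm.so: loading is what registers functions."""
    registry.register("mlc.llm_chat_create", lambda: "chat module created")


if __name__ == "__main__":
    registry = FunctionRegistry()
    try:
        registry.get("mlc.llm_chat_create")  # lookup before the library is loaded
    except ValueError as err:
        print("before load:", err)
    load_library(registry)
    print("after load:", registry.get("mlc.llm_chat_create")())
```

In this model the fix is simply to make sure the load step runs before the first lookup.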

Sing-Li commented 10 months ago

Thanks @junrushao! Who knew? And I thought only C++ static initializers had this sort of loading race condition 😀

Anyways, patched manually and it all ran fine.

And the results are very interesting. On a system with the AMD 7940HS (mobile or mini PC) chip, using the 780M iGPU over ROCm, I got some very respectable numbers. (The BIOS setup lets you specify how much dedicated memory the GPU gets as the UMA frame buffer; I allocated 6 GB to run Llama 7B.)

# vi /root/micromamba/envs/python311/lib/python3.11/site-packages/mlc_chat/chat_module.py
(python311) root@minisforumlinux:/workspace# python -m mlc_chat.cli.benchmark   --model ${PATH_TEST}/params     --device "rocm:0"       --prompt "What is the meaning of life?"         --generate-length 256
Generated text:
The question has puzzled philosophers, theologians, and scientists for centuries. државите to the concept of meaning, which is a fundamental aspect of human existence, and is closely related to the concept of purpose. Meaning is what gives our lives significance, and is the reason why we do what we do. It is the reason why we get up in the morning, why we work, why we love, and why we die. Without meaning, life is empty and lacking in purpose.
The concept of meaning is complex and multifaceted, and can be understood in different ways depending on one's cultural, philosophical, and religious beliefs. Some may argue that meaning is derived from religious or spiritual beliefs, while others may see it as a product of personal fulfillment, social connections, or individual achievements.
In this essay, I will explore the concept of meaning and its significance in human life, drawing on philosophical, psychological, and religious perspectives. I will argue that meaning is not a fixed or static concept, but rather a dynamic and evolving one that is shaped by our experiences, beliefs, and values. I will also examine the various sources of meaning in life, including

Statistics:
----------- prefill -----------
throughput: 5.4 tok/s
total tokens: 7 tok
total time: 1.3 s
------------ decode ------------
throughput: 17.9 tok/s
total tokens: 256 tok
total time: 14.3 s

junrushao commented 10 months ago

Wow 🤩 this is amazing to see 18 tok/sec using a simple iGPU! Is this even real???

Sing-Li commented 10 months ago

@junrushao Yeah, I had to do a sanity check on my notebook with an RTX 3060 Mobile chip, and it produced the following score with the latest CUDA drivers: less than 3x the new AMD iGPU's performance for Llama 7B!

I've gotta stop telling people that a GPU runs LLMs 1000s of times better than a CPU alone 🙄

# vi /root/micromamba/envs/python311/lib/python3.11/site-packages/mlc_chat/chat_module.py
(python311) root@docker-desktop:/workspace# python -m mlc_chat.cli.benchmark    --model ${PATH_TEST}/params     --device "cuda:0"       --prompt "What is the meaning of life?"         --generate-length 256
Generated text:
These are some of the big questions that have puzzled philosophers, theologians, and scientists for centuries. nobody has a definitive answer, but here are some possible approaches to understanding the meaning of life: 1. Religious or spiritual beliefs: Many people believe that the meaning of life is to fulfill a divine or spiritual purpose. According to this view, life has a higher purpose that is connected to a deity or a higher power. The purpose of life is to fulfill this divine plan or to follow the teachings of a particular religion. 2. Personal growth and development: Some people believe that the meaning of life is to grow and develop as individuals. According to this view, the purpose of life is to learn, to acquire knowledge, to develop skills, and to become the best version of oneself. 3. Social connections and relationships: Many people believe that the meaning of life is to form and maintain meaningful relationships with others. According to this view, the purpose of life is to love and be loved, to form connections with others, and to build a supportive community. 4. Contribution and legacy: Some people believe that the meaning of life is to make a positive impact on the world and to

Statistics:
----------- prefill -----------
throughput: 42.2 tok/s
total tokens: 7 tok
total time: 0.2 s
------------ decode ------------
throughput: 45.1 tok/s
total tokens: 256 tok
total time: 5.7 s
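
The "less than 3x" comparison can be checked directly from the decode statistics in the two runs above. The short sketch below uses only the reported figures; the helper names are illustrative. Because the printed times are rounded to one decimal place, throughput recomputed from tokens and time may differ slightly from the reported value.

```python
def throughput(tokens: int, seconds: float) -> float:
    """Tokens per second for a run of `tokens` tokens taking `seconds`."""
    return tokens / seconds


def speedup(fast_tok_s: float, slow_tok_s: float) -> float:
    """How many times faster the first throughput is than the second."""
    return fast_tok_s / slow_tok_s


# Decode throughput as reported in this thread.
IGPU_DECODE_TOK_S = 17.9  # AMD 780M iGPU over ROCm
RTX_DECODE_TOK_S = 45.1   # RTX 3060 Mobile over CUDA

if __name__ == "__main__":
    ratio = speedup(RTX_DECODE_TOK_S, IGPU_DECODE_TOK_S)
    print(f"RTX 3060 Mobile vs 780M iGPU (decode): {ratio:.2f}x")
    # Cross-check the iGPU figure from its token count and rounded time.
    print(f"iGPU decode recomputed: {throughput(256, 14.3):.1f} tok/s")
```

The ratio comes out at roughly 2.5x, consistent with the "less than 3x" remark.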