mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0
19.26k
stars
1.58k
forks
issues
[Bug] Issue while compiling FP8 datatype of mlc-ai/Llama-3.1-8B-Instruct-fp8-MLC
#3047
Vinaysukhesh98
opened
17 minutes ago
0
[Model] Add support for OLMo architecture
#3046
Lanssi
opened
2 days ago
2
[Model] Add support for Olmo architecture
#3045
tlopex
closed
1 day ago
0
[Bug][iOS/Swift SDK] Multiple image input to vision models will throw error from TVM
#3044
Neet-Nestor
opened
4 days ago
0
[Bug] internlm2_5-20b-q0f16-MLC model produces gibberish in chat
#3043
l241025097
opened
4 days ago
2
[C++] Invoke storage allocation for CUDA Graph explicitly
#3042
MasterJH5574
closed
4 days ago
0
Update copyright to 2023-2024
#3041
MasterJH5574
closed
4 days ago
0
[Python] Skip termination when engine is not initialized
#3040
MasterJH5574
closed
4 days ago
0
[Bug] Inference with llava throws an error
#3039
HoiM
opened
4 days ago
2
[Model] Add support for Xverse Model
#3038
tlopex
closed
5 days ago
0
[Bench] Add support for multiple backend
#3037
cyx-6
opened
5 days ago
0
[Bug] macbook pro m4 max apple silicon mlc_llm compile qwen2.5 q4f32 mlc .so error
#3036
l241025097
closed
4 days ago
6
[Bug] When initializing MLCEngine, getting AttributeError: 'MLCEngine' object has no attribute '_ffi'
#3035
lifelongeeek
closed
3 days ago
5
[Question] Does MLC_LLM MLCEngine have an equivalent API for `llm.generate` in VLLM or SGLang?
#3034
pjyi2147
opened
1 week ago
0
KV cache offloading to CPU RAM
#3033
shahizat
opened
1 week ago
0
[JIT] Support overriding optimization flags in JIT
#3032
MasterJH5574
closed
1 week ago
0
[Feature Request] Add vision model flag to model record
#3031
Neet-Nestor
opened
1 week ago
1
[Python] Add sentencepiece as installation requirement
#3030
MasterJH5574
closed
1 week ago
0
[3rdparty] Bump tokenizers-cpp for HuggingFace tokenizer update
#3029
MasterJH5574
closed
1 week ago
0
[Model] Keep vision encoder weights unquantized to maintain accuracy
#3028
mengshyu
closed
1 week ago
0
[Question] Cannot serve model using multi GPU
#3027
ro99
closed
1 week ago
2
[Fix] Disable FlashInfer when sliding window is enabled
#3026
MasterJH5574
closed
1 week ago
0
[3rdparty] Bump tokenizer-cpp to enable SentencePiece by default
#3025
MasterJH5574
closed
1 week ago
0
[Question] Could not use multi GPU in chat
#3024
BobH233
closed
1 week ago
3
[Bug] Cannot run finetuned model of Mistral 7B with `mlc_llm convert_weights` with "data did not match any variant of untagged enum ModelWrapper"
#3023
pjyi2147
closed
1 week ago
12
Speculative mode for the LLaMA 3.1 70B model
#3022
shahizat
closed
3 days ago
2
[Docs] Fix typo export command with compile.
#3021
pongib
closed
2 weeks ago
0
[docs] Updated conversation template doc
#3020
Kaneki-x
closed
1 week ago
0
[Docs] Update conversion template link address
#3019
Kaneki-x
closed
2 weeks ago
0
Support of heterogeneous devices
#3018
musram
closed
1 week ago
1
[Bug] Flutter interop with native Android: calling engine.chatCompletion causes an ANR
#3017
tdd102
opened
2 weeks ago
0
[Bug] internlm2_5 model fails when running mlc_llm serve
#3016
l241025097
closed
1 week ago
9
[Grammar] Migrate to XGrammar
#3015
Ubospica
closed
2 weeks ago
0
[Question] How to show model progress download on WebLLM Javascript SDK?
#3014
DenisSergeevitch
closed
1 week ago
1
[Question] How to export MLCChat .apk with weight bundled/included?
#3013
lifelongeeek
closed
1 week ago
1
[Model] Add support for GPTJ architecture
#3012
tlopex
opened
3 weeks ago
4
[Bug] Speculative decoding doesn't work on Vulkan (AMD iGPU)
#3011
SkyHeroesS
opened
3 weeks ago
0
[Question] Android app issue
#3010
j0h0k0i0m
opened
3 weeks ago
2
[Model] Update default prefill chunk size of Deepseek
#3009
tlopex
closed
1 week ago
1
Fix relative path comment
#3008
NWuensche
closed
3 weeks ago
0
[Docs] Update model template link address
#3007
Kaneki-x
closed
2 weeks ago
1
Simplify obvious choices in gen_cmake_config.py
#3006
jeethu
closed
3 weeks ago
0
[Bug] large concurrency service broken
#3005
fan-niu
opened
3 weeks ago
4
[Bug] Llama-3.1-70B-Instruct-q3f16_1-MLC model running across two GPUs with tensor_parallel_shards=2
#3004
shahizat
opened
3 weeks ago
2
[Bug] DLL error
#3003
ArpanDhot
opened
3 weeks ago
0
[Bug] Misalignment of Llama3.2 chat template
#3002
Hzfengsy
opened
3 weeks ago
0
[Question] Error running prep_emcc_deps.sh - 'tvm/runtime/object.h' file not found
#3001
Big-Boy-420
opened
3 weeks ago
7
[Fix] Typo in serve/engine.py
#3000
shyeonn
closed
3 weeks ago
0
[Question] Which models do you recommend for compiling on Mac Intel chip, metal gpu?
#2999
RINO-GAELICO
opened
4 weeks ago
0
[Feature Request] openai v1/completion api support
#2998
lpb1
closed
4 weeks ago
1