imtuyethan opened 1 month ago
Off topic:
These descriptions are not helpful:
114 (windows-dev-tensorRT-llm): OS: Windows 11 Pro (Version 23H2, build 22631.4037), CPU: AMD Ryzen Threadripper PRO 5955WX (16 cores), RAM: 32 GB, GPU: NVIDIA GeForce RTX 3090, Storage: 599 GB local disk (C:)
Codestral 22B Q4:
The response is weird:
https://github.com/user-attachments/assets/5380f2b7-d137-423d-beaa-21d41e33d67f
https://github.com/user-attachments/assets/3d2785d9-5abe-42e2-8dae-0582c883d1c1
Operating System: macOS Sonoma 14.2, Processor: Apple M2, RAM: 16 GB
Seems like the wrong prompt template?
With the same prompt, Llama 3.2 1B Instruct Q8 gave me a correct/thorough answer.
Gemma 1.1 2B Q4
Wrong prompt template?
Llama 2 Chat 7B Q4
Wrong prompt template?
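Most of the failures above look like prompt-template mismatches. As a rough illustration (not Jan's actual implementation), Llama 2 Chat was fine-tuned on the `[INST]`/`<<SYS>>` wrapper; a hub entry that ships another family's template (or none) feeds the model markers it never saw in training, which produces exactly this kind of garbled reply. A minimal sketch of the expected single-turn format:

```python
def llama2_chat_prompt(system: str, user: str) -> str:
    """Format one user turn in the Llama 2 Chat style.

    Illustrative only: shows the [INST]/<<SYS>> wrapper the model
    expects; a hub entry with a different template would feed the
    model text it never saw during fine-tuning.
    """
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

print(llama2_chat_prompt("You are a helpful assistant.", "Hello!"))
```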
CodeNinja 7B Q4
Wrong prompt template?
LlaVa 7B
Weird responses:
Reported by user: https://zoom.us/clips/share/riUumJZ0uuzb5vQvZ2eZbMkmOq1nvU7O8VTD5FuBNtxRaO89rp9xA7CibJFCLlGju3nfyLsB_19iPegc0nSM4qxV.POPOcY7WXml_Ab8P
https://github.com/user-attachments/assets/e44274e7-725d-4927-aeb1-8cb03e6831b9
https://github.com/user-attachments/assets/33f2afa1-8f87-4356-bd6f-855a33124eb8
Operating System: macOS Sonoma 14.2, Processor: Apple M2, RAM: 16 GB
Missing stop word?
Operating System: macOS Sonoma 14.2, Processor: Apple M2, RAM: 16 GB
Missing stop word?
Operating System: macOS Sonoma 14.2, Processor: Apple M2, RAM: 16 GB
Weird format:
Operating System: macOS Sonoma 14.2, Processor: Apple M2, RAM: 16 GB
Issue similar to LlaVa 7B
Operating System: macOS Sonoma 14.2, Processor: Apple M2, RAM: 16 GB
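The "missing stop word" symptom (the model rambling past the end of its answer, often role-playing the next user turn) happens when the runtime is not configured to truncate generation at the model's end-of-turn marker. A hedged sketch of what that truncation does; the stop strings here are generic examples, not the affected models' actual ones:

```python
def truncate_at_stop(text: str, stop_words: list[str]) -> str:
    """Cut generated text at the earliest stop sequence, if any appears."""
    cut = len(text)
    for stop in stop_words:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

# Without the stop word configured, raw output leaks past the answer:
raw = "Paris is the capital of France.<|im_end|>User: and Germany?"
print(truncate_at_stop(raw, ["<|im_end|>", "</s>"]))
# -> Paris is the capital of France.
```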
Gemma 1.1 7B Q4
Wrong prompt template?
LlaVa 13B Q4
Wrong prompt template?
Wizard Coder Python 13B Q4
Wrong prompt template?
114 (windows-dev-tensorRT-llm): OS: Windows 11 Pro (Version 23H2, build 22631.4037), CPU: AMD Ryzen Threadripper PRO 5955WX (16 cores), RAM: 32 GB, GPU: NVIDIA GeForce RTX 3090, Storage: 599 GB local disk (C:)
Command-R v01 34B Q4
Pretty sure wrong prompt template:
@imtuyethan I recommend converting the checklist you have above into a table so we can track the fix status.
Please work with @hahuyhoang411: it may be that certain models are unsalvageable, and we should just remove them from the library.
Device: windows-dev-tensorrt-llm, Status: Running, Node: 3x-3090s, CPU: 1.26% of 16, RAM: 6.06/96 GiB, Disk: 600 GiB
Mixtral 8x7B Instruct Q4
Phind 34B Q4
Llama 2 Chat 70B Q4
I have QA'd all models; please check the ticket description for the latest update:
The current hub contains a lot of outdated models, and some new models have a prompt template bug. Here is my suggestion based on @imtuyethan's QA list:
The rationale for the delete list is that any model released more than 6 months ago will be removed. Delete list:
Keep list:
LLM:
VLM: VLMs are a bit more tricky.
- LLava 1.6 (new)
- Qwen2-VL-7B-Instruct (new)
- Pixtral-12B-2409 (new)
- Llama-3.2-11B-Vision-Instruct (new)
- GOT-OCR2_0 (new)
- Molmo-7B-D-0924 (new)
- MiniCPM-V-2_6 (new)
@hahuyhoang411 Should we add more new/trending models? The list seems short for a whole model hub.
We can delete them from the Hub, but they will still show up on users' devices if those users have already downloaded these legacy models. How do we inform them that these models don't work?
Problem
I have encountered many issues with wrong default model settings (incorrect prompt templates, missing stop words, etc.); see, e.g., the comments in Jan 0.5.7 Release Sign Off janhq/jan#3818.
Model Testing Results
I have tested 45 models from the Jan Hub; here are the results.
Next step
cc @hahuyhoang411
On one note
We will need to develop model.yaml so that model capabilities (e.g., function calling, vision) can be defined easily. Users are facing an issue with imported LlaVa: https://github.com/janhq/jan/issues/3855
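As a starting point for that discussion, here is a hypothetical sketch of what such a model.yaml could declare. All field names and values are assumptions for illustration, not an agreed schema:

```yaml
# Hypothetical model.yaml sketch: every field name here is illustrative.
id: llava-1.6-7b
capabilities:
  vision: true            # lets the UI gate image attachments
  function_calling: false
prompt_template: "USER: <image>\n{prompt} ASSISTANT:"
stop_words:
  - "</s>"
```

Declaring capabilities in the model definition would let the app hide or disable features (like image upload) for models that don't support them, instead of failing at inference time as in the LlaVa issue above.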