Closed · execveat closed this issue 1 year ago
Thank you for your input. Indeed there are a lot of things we can improve. This is the beginning of the release, so there are many things that can be added on top. We will release follow-up materials on guides and local builds.
The project has two components.
The MLC (machine learning compilation) part is the overall productive flow for adding new models and backend optimizations; it is built on top of the TVM Unity pipeline. The pipeline itself is generic enough to adapt to and add new models.
mlc_chat_cli is a runtime component that runs the compiled code, so the memory consumption and cost profile depend on the model being run. This module will work with any of the compiled models. It is a great suggestion to document the overall requirements of some of the prebuilt settings as we expand model support in the community.
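As a rough illustration of why the cost profile depends on the model (our own back-of-envelope arithmetic, not numbers from the project): weight memory is roughly parameter count × bits per weight / 8, with activations and the KV cache on top.

```python
# Back-of-envelope estimate of weight memory for a quantized model.
# Note: real memory use also includes activations and the KV cache.
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    return n_params * bits_per_weight / 8 / 2**30

# e.g. a 7B-parameter model quantized to 3 bits per weight:
print(round(weight_memory_gib(7e9, 3), 2))  # ≈ 2.44 GiB (weights only)
```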
Additionally, there is a libmlc_llm module (which mlc_chat_cli depends on) that can be embedded into any application (e.g. a game engine) that would like to leverage MLC-LLM.
One thing to mention is that the overall MLC flow is in Python and highly customizable. For example, we could easily add 3-bit int, or new formats like 4-bit floating point, to the Python flow, and that may or may not sit in this repo. It took us on the order of a few days to explore a few different quantization formats and use ML compilation to optimize and generate high-performing code.
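For concreteness, here is a rough, hypothetical sketch in plain NumPy (not MLC's actual flow or API) of what group-wise symmetric 3-bit integer quantization looks like:

```python
# Hypothetical sketch of group-wise symmetric 3-bit quantization;
# not MLC-LLM code, just an illustration of the format itself.
import numpy as np

def quantize_3bit(weights: np.ndarray, group_size: int = 32):
    """Quantize a flat float array to signed 3-bit ints (-4..3), per group."""
    w = weights.reshape(-1, group_size)
    # One scale per group: map the group's max magnitude onto the int range.
    scale = np.abs(w).max(axis=1, keepdims=True) / 4.0
    scale[scale == 0] = 1.0                      # avoid division by zero
    q = np.clip(np.round(w / scale), -4, 3).astype(np.int8)
    return q, scale

def dequantize_3bit(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return (q.astype(np.float32) * scale).reshape(-1)

w = np.random.randn(64).astype(np.float32)
q, s = quantize_3bit(w)
w_hat = dequantize_3bit(q, s)
print(np.abs(w - w_hat).max())  # bounded reconstruction error
```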
And yes, as an open source community, we love contributions and pull requests.
Thank you for the explanation! I see that there is support for more models in mlc_llm/conversation.py, but the list in cpp/cli_main.cc is more limited. I guess this is just work in progress?
I would strongly suggest adding an option to override profile selection via a command-line argument instead of always inferring it from the path name. Moving the profile/template definitions into a user-editable config file would be amazing as well (e.g. to customize the prompt and temperature).
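For illustration only — the field names below are entirely hypothetical, not an existing MLC-LLM config format — a user-editable profile file might look like this:

```python
# Hypothetical sketch of a user-editable chat-profile config.
# None of these field names exist in MLC-LLM; they only illustrate
# the kind of file being suggested here.
import json

profile_json = """
{
  "conv_template": "vicuna_v1.1",
  "system_prompt": "A chat between a curious user and an AI assistant.",
  "temperature": 0.7,
  "top_p": 0.95
}
"""

profile = json.loads(profile_json)
print(profile["conv_template"], profile["temperature"])
```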
Are there instructions for how to convert existing models for use with mlc-llm? Reading this current thread, and this one, it seems possible, but I haven't found any hints on how to start.
Thank you for your suggestion; we will work on the instructions in the coming weeks. The current build.py pipeline supports the Llama class of models, and support for other model classes is work in progress.
Hi, is this project mainly working on LLMs? I wonder if the MLC flow works for image generation models (e.g., Stable Diffusion).
@yx-chan131 yes, check out https://github.com/mlc-ai/web-stable-diffusion
Looking forward to the instructions; I am waiting to integrate it into my chat bot. https://github.com/Poordeveloper/chatgpt-app
Yup, any start would be fine! Looking forward to this.
(Let me know if you need any help! I've worked with LLMs in production on big GPUs.)
Closing this issue for now due to inactivity. Feel free to reopen or open another issue if there are other questions!
Hey there, congratulations on a great release! The app works great on a Mac and the installation was very straightforward.
Do you have plans for growing mlc_chat_cli into a standalone tool, or is it meant to be a proof of concept? The Readme claims the project can be used to run 'any language model', but there are no instructions for how to do that. Furthermore, the code seems to indicate that only three models are supported right now; is that right?

Unless mlc_chat_cli is supposed to be a toy demo, could you please add instructions for:

- Downloading model weights: I assume something like git clone https://huggingface.co/mlc-ai/demo-vicuna-v1-7b-int3 dist/vicuna-v1-7b would work, right?
- Tweaking settings: mlc_chat_cli doesn't expose that info/settings to the user. How do we tweak those?

Also, it would be very neat if you mentioned in the Readme what kind of community interactions you are aiming for. Would you prefer that people build their own tools that use mlc-llm as a backend, or send PRs improving mlc_chat_cli?