This PR pulls in the latest changes from upstream.
One of the key changes is the renaming of `mlc_chat` to `mlc_llm`, as upstream is deprecating the legacy flow from the pre-SLM era.
Once this is merged, I will follow up with the ollm changes.
While resolving the merge conflicts, I found a couple of subtle differences in the quantization flow and presharding.
I believe I have kept the existing behavior, but it would be great if @csullivan @vinx13 @JosephTheOctonaut could confirm.
For the review, it is safe to ignore directories outside of `python/`.