I think a useful feature for LLM-VM to support, ideally as a parameter, is the ability to change the dtype of a model. This HuggingFace doc mentions that this can be done with something like:
model.to(dtype=torch.bfloat16)
I am doing something like this in llm-speed-benchmark, so I do think this is possible.
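As a rough sketch of what this could look like: a string parameter could be mapped to a torch dtype and then applied with `model.to`. The helper names `normalize_dtype_name` and `cast_model` and the alias table are hypothetical and only for illustration; none of them exist in LLM-VM today.

```python
# Hypothetical sketch: these names are assumptions, not existing LLM-VM API.
_DTYPE_ALIASES = {
    "fp32": "float32",
    "float32": "float32",
    "fp16": "float16",
    "half": "float16",
    "float16": "float16",
    "bf16": "bfloat16",
    "bfloat16": "bfloat16",
}


def normalize_dtype_name(name):
    """Map a user-supplied dtype string to the canonical torch attribute name."""
    key = name.strip().lower()
    if key not in _DTYPE_ALIASES:
        raise ValueError(f"unsupported dtype: {name!r}")
    return _DTYPE_ALIASES[key]


def cast_model(model, dtype_name):
    """Cast a torch.nn.Module (e.g. a loaded HuggingFace model) to the requested dtype."""
    import torch  # imported lazily so the name handling works without torch installed

    dtype = getattr(torch, normalize_dtype_name(dtype_name))
    return model.to(dtype=dtype)
```

With this, `cast_model(model, "bf16")` would do the same thing as the `model.to(dtype=torch.bfloat16)` call above. HuggingFace's `from_pretrained` also accepts a `torch_dtype` argument, which might be the better place to hook this in, since it sets the dtype at load time.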
I would love to get people's opinions on this as well.