Open cha-noong opened 2 months ago
@cha-noong yes, sorry, I need to update the other HF-based APIs (it's on my TODO list). Alas, I know from benchmarking them that none are as fast as MLC, which is why I use that, and it's more likely that I'll add a TensorRT-LLM backend when that becomes available for Jetson.
Thank you for the quick response.
As far as I know, TensorRT-LLM is not yet supported. Do you know roughly when this will be possible?
When I looked at the README for local_llm, it said that mlc and awq are supported as backends.
However, when I run it, awq is commented out (./models/init), and the related installation steps and dependencies appear to be missing from the Dockerfile.
I wonder if it's still in development.