Open cha-noong opened 2 months ago
@cha-noong yes, sorry, I need to update the other HF-based APIs (it's on my TODO list). Alas, I know from benchmarking them that none are as fast as MLC, which is why I use that, and it's more likely that I'll add a TensorRT-LLM backend when that becomes available for Jetson.
Thank you for the quick response.
As far as I know, TensorRT-LLM is not yet supported. Do you know roughly when this will be possible?
When I looked at the README for local_llm, it said that mlc and awq are supported as backends.
However, when I run it, awq is commented out (./models/init), and the related installation steps and dependencies appear to be missing from the Dockerfile.
I wonder if it's still in development.