mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
MIT License

Version of Nvidia Jetson Orin used for TinyChat benchmarks #57

Open retunelars opened 1 year ago

retunelars commented 1 year ago

You report benchmark numbers for TinyChat running on an Nvidia Jetson Orin device, but it is not clear which version of the device you are using. Is it a Nano, NX, or AGX, and with how much memory? Please update the TinyChat benchmarks with this information.

malbu commented 4 months ago

Also would love to find this out, ty!