hahnyuan / LLM-Viewer

Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.
MIT License
275 stars 32 forks

A40 MAC number #7

Closed sunshinemyson closed 4 months ago

sunshinemyson commented 4 months ago

Hi hahnyuan,

Good to know you added the A40 to this tool. I'm wondering why you divided the FP16 OPS by 2 in the hardware configuration? According to official information, it should be 149 TFLOPS.

Thanks

hahnyuan commented 4 months ago

Thank you for bringing the discrepancy in the hardware configuration to our attention. We apologize for the mistake regarding the division of FP16 OPS by 2. The correct value for the A40 is indeed 149 TFLOPS. We have updated the tool to reflect the accurate information.
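For readers wondering why a factor of 2 matters here: a minimal sketch of how the peak compute figure feeds a roofline estimate. One common source of a halved peak is counting multiply-accumulates (MACs) rather than FLOPs, since one MAC is two floating-point operations; the thread does not say whether that is what happened in this case. The function names and the memory bandwidth value below are assumptions for illustration and are not taken from the LLM-Viewer code; only the 149 TFLOPS figure comes from this issue.

```python
# Illustrative sketch only: helper names are hypothetical, not LLM-Viewer's API.

def macs_to_flops(macs_per_second: float) -> float:
    """Convert a MAC rate to a FLOP rate (1 MAC = 1 multiply + 1 add = 2 FLOPs)."""
    return 2.0 * macs_per_second


def attainable_flops(peak_flops: float,
                     bandwidth_bytes_per_s: float,
                     arithmetic_intensity: float) -> float:
    """Roofline bound: the lower of the compute roof and the memory roof.

    arithmetic_intensity is in FLOPs per byte moved to or from memory.
    """
    return min(peak_flops, bandwidth_bytes_per_s * arithmetic_intensity)


def ridge_point(peak_flops: float, bandwidth_bytes_per_s: float) -> float:
    """Arithmetic intensity at which a kernel stops being memory-bound."""
    return peak_flops / bandwidth_bytes_per_s


PEAK_FP16_FLOPS = 149e12           # value quoted in this issue
HALVED_PEAK = PEAK_FP16_FLOPS / 2  # the value before the fix
ASSUMED_BANDWIDTH = 696e9          # bytes/s; assumption, check the datasheet

if __name__ == "__main__":
    for intensity in (10, 150, 1000):
        correct = attainable_flops(PEAK_FP16_FLOPS, ASSUMED_BANDWIDTH, intensity)
        halved = attainable_flops(HALVED_PEAK, ASSUMED_BANDWIDTH, intensity)
        print(f"intensity={intensity:>5} FLOP/B  "
              f"correct={correct / 1e12:.2f} TFLOPS  "
              f"halved={halved / 1e12:.2f} TFLOPS")
```

Under these assumptions, memory-bound operations (low arithmetic intensity) get the same estimate either way, while compute-bound operations are reported at half their attainable throughput when the peak is halved, and the ridge point shifts to half its correct value.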