turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs

Does NVLink improve tensor parallelism? #603

Open bryanhpchiang opened 2 months ago

bryanhpchiang commented 2 months ago

With 2x 3090, does the recently added tensor parallelism use NVLink in any way? Thanks!

turboderp commented 2 months ago

It does not, no. Not yet anyway.