Closed Noir-Lime closed 1 year ago
I noticed that nvcc isn't being passed any optimization flags when the CUDA code is compiled. I've added -O3 as a start.
I measured a gain from ~42 T/s to ~47 T/s running this model on an RTX 3090: https://huggingface.co/TheBloke/wizard-vicuna-13B-GPTQ
I invite others to test different nvcc optimization options to see if better performance can be achieved.
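
For anyone who wants to compare flags, here is a minimal sketch of how the numbers above could be put on a common footing. The helper names `tokens_per_second` and `relative_gain` are made up for illustration and are not part of this repo; the only figures taken from this PR are the ~42 T/s and ~47 T/s readings.

```python
def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Throughput for one generation run (hypothetical helper)."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return n_tokens / elapsed_s


def relative_gain(baseline_tps: float, candidate_tps: float) -> float:
    """Fractional speedup of candidate over baseline (0.10 means 10% faster)."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return (candidate_tps - baseline_tps) / baseline_tps


if __name__ == "__main__":
    # Figures reported in this PR: ~42 T/s before, ~47 T/s after.
    gain = relative_gain(42.0, 47.0)
    print(f"relative gain: {gain:.1%}")
```

With the figures reported here the gain works out to roughly 12%. Averaging several runs per flag set would help tell a real gain from run-to-run noise.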
Supposedly this is a no-op, as -O3 is the default.