jrudolph / llama2.scala

Inference Llama 2 in Scala with AVX2 kernels in C (A port of llama2.c from Andrej Karpathy)

Setup GgufLoader and preliminary Q6_K dequantization to allow running Q4_0 models #13

Open jrudolph opened 1 year ago