jrudolph / llama2.scala
Inference Llama 2 in Scala with AVX2 kernels in C (a port of llama2.c by Andrej Karpathy)
67 stars · 3 forks
Issues
#13 Setup GgufLoader and preliminary Q6_K dequantization to allow running Q4_0 models (open, jrudolph, 1 year ago, 0 comments)
#12 Q4 quantization with SIMD support (closed, jrudolph, 1 year ago, 1 comment)
#11 Type-safe Tensors (open, jrudolph, 1 year ago, 0 comments)
#10 q8 quantization (SIMD / AVX2) (closed, jrudolph, 1 year ago, 0 comments)
#9 some manual AVX2 code to optimize the slowest step (closed, jrudolph, 1 year ago, 0 comments)
#8 q8 + q4 quantization + loading Q4_0 from ggml model files (closed, jrudolph, 1 year ago, 1 comment)
#7 q8 quantization (closed, jrudolph, 1 year ago, 0 comments)
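Several of the issues above (#7, #8, #10, #12) concern q8/q4 quantization. As a rough illustration of the general idea only (a sketch of the common Q8_0-style scheme, not the repository's actual code; the names `Q8Block`, `quantize`, and `dequantize` are hypothetical), values are grouped into blocks and stored as signed 8-bit integers plus one float scale per block, with the scale chosen as max|x| / 127:

```scala
object Q8Sketch {
  // One quantized block: a single f32 scale plus signed 8-bit values.
  final case class Q8Block(scale: Float, qs: Array[Byte])

  // Quantize a block of floats: scale = max|x| / 127, then round each
  // value to the nearest representable int8.
  def quantize(xs: Array[Float]): Q8Block = {
    val amax  = xs.foldLeft(0f)((m, x) => math.max(m, math.abs(x)))
    val scale = if (amax == 0f) 1f else amax / 127f
    val qs    = xs.map(x => math.round(x / scale).toByte)
    Q8Block(scale, qs)
  }

  // Dequantize back to f32 (lossy: only 256 distinct levels per block).
  def dequantize(b: Q8Block): Array[Float] =
    b.qs.map(q => q * b.scale)
}
```

The per-block scale keeps the quantization error proportional to the largest magnitude in that block, which is why real formats use small blocks (e.g. 32 values) rather than one scale for a whole tensor.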
#6 Try explicit Java vector API (open, jrudolph, 1 year ago, 3 comments)
#5 Try GPU-acceleration with TornadoVM (open, jrudolph, 1 year ago, 0 comments)
#4 Allow loading llama2_7b model (closed, jrudolph, 1 year ago, 1 comment)
#3 Experimental scala-native setup (closed, jrudolph, 1 year ago, 0 comments)
#2 Try with scala-native (closed, jrudolph, 1 year ago, 4 comments)
#1 Figure out how to run official llama2 models (closed, jrudolph, 1 year ago, 5 comments)