atoma-network/atoma-paged-attention
Paged attention CUDA kernels for the Atoma protocol
1 star · 5 forks
Issues
#8 · feat: paged attention llama · jorgeantonio21 · opened 1 week ago · 0 comments
#7 · feat: Integrate Llama cuda kernels with Candle · fishonamos · opened 1 week ago · 0 comments
#6 · feat: Integrate Llama cuda kernels with Candle · fishonamos · closed 1 week ago · 0 comments
#5 · feat: add paged attention kernels for the Llama model architecture · fishonamos · opened 2 weeks ago · 0 comments
#4 · Test the paged attention vs the original candle implementations · jorgeantonio21 · opened 3 weeks ago · 5 comments
#3 · Add a Llama inference main file · jorgeantonio21 · opened 3 weeks ago · 4 comments
#2 · Integrate Llama cuda kernels with Candle · jorgeantonio21 · opened 3 weeks ago · 5 comments
#1 · Add paged attention kernels for the Llama model architecture · jorgeantonio21 · opened 3 weeks ago · 7 comments