-
I want to run inference on a quantized LLAMA model (W8A16) on ARMv9 (with SVE) using oneDNN. The LLAMA weights are per-group quantized.
Based on my understanding, I need to prepack the weights to redu…
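For orientation, here is a minimal PyTorch sketch of per-group W8 quantization (symmetric, with an assumed group size of 128; this is not oneDNN code): whatever prepacked layout is chosen has to carry one scale per group alongside the int8 weights.

```python
# Minimal sketch of per-group symmetric int8 (W8) weight quantization.
# group_size=128 is an assumed hyperparameter, not from the question.
import torch

def quantize_per_group(w: torch.Tensor, group_size: int = 128):
    """One scale per (output channel, group of input channels)."""
    out_features, in_features = w.shape
    assert in_features % group_size == 0
    groups = w.reshape(out_features, in_features // group_size, group_size)
    scales = (groups.abs().amax(dim=-1, keepdim=True) / 127.0).clamp(min=1e-8)
    q = torch.clamp(torch.round(groups / scales), -128, 127).to(torch.int8)
    return q.reshape(out_features, in_features), scales.squeeze(-1)

def dequantize_per_group(q: torch.Tensor, scales: torch.Tensor, group_size: int = 128):
    out_features, in_features = q.shape
    groups = q.reshape(out_features, in_features // group_size, group_size).float()
    return (groups * scales.unsqueeze(-1)).reshape(out_features, in_features)

w = torch.randn(4096, 4096)
q, s = quantize_per_group(w)
print((w - dequantize_per_group(q, s)).abs().max())  # small quantization error
```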
-
A while ago I was assigned to a new research paper on Stable Diffusion and LLMs, so I haven't had much time to work on new features, but the model has now reached the training stage, so I come …
-
### 🚀 The feature, motivation and pitch
[vLLM](https://github.com/vllm-project/vllm) is a high-throughput and memory-efficient inference and serving engine for LLMs. We would like to use `torch.compi…
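For readers unfamiliar with it, a toy sketch of basic `torch.compile` usage follows (this is not vLLM's actual integration; `TinyMLP` is made up for illustration):

```python
# Toy illustration of torch.compile: compile a small module and run it.
import torch

class TinyMLP(torch.nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        self.fc1 = torch.nn.Linear(dim, dim)
        self.fc2 = torch.nn.Linear(dim, dim)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = TinyMLP()
compiled = torch.compile(model)      # default Inductor backend
out = compiled(torch.randn(8, 64))   # first call triggers compilation
print(out.shape)                     # torch.Size([8, 64])
```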
-
- Functorch = memory blowup due to `vmap`
- Asdl/asdfghjkl = can't backprop through the Jacobians => can't be used for continuous BO
- BackPACK = requires writing an inflexible extension
We need a Jacobian …
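For context, here is a minimal sketch of one possible workaround under these constraints (a toy function, not the project's code): `torch.autograd.functional.jacobian` with `create_graph=True` yields a Jacobian that gradients can flow through, and its default non-vectorized mode loops over outputs instead of `vmap`-ing, trading speed for memory.

```python
# Differentiable Jacobian via torch.autograd.functional.jacobian.
# Toy R^5 -> R^5 map; create_graph=True keeps J in the autograd graph.
import torch
from torch.autograd.functional import jacobian

A = torch.randn(5, 5)

def f(x):
    return torch.tanh(A @ x)

x = torch.randn(5, requires_grad=True)
J = jacobian(f, x, create_graph=True)  # 5x5 Jacobian, differentiable
loss = J.pow(2).sum()                  # scalar function of the Jacobian
loss.backward()                        # backprop *through* the Jacobian
print(x.grad.shape)                    # torch.Size([5])
```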
-
**What problem or use case are you trying to solve?**
We are trying to reduce the costs associated with using Large Language Models (LLMs) in the OpenDevin project. This involves optimizing the usa…
-
During training, I found that if I set the batch size > 1, the loss sometimes becomes NaN:
```
tensor(nan, device='cuda:0', grad_fn=)
```
and some logits are also NaN:
```
(Pdb) p llm_out.logits…
```
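When NaNs appear only at batch size > 1, padding and attention-mask handling are a common first suspect. Below is a generic, hedged debugging sketch (it assumes an HF-style model whose output has `.loss` and `.logits`; the names are illustrative, not this repo's code):

```python
# Generic NaN-debugging sketch; `model`, `batch`, `optimizer` are placeholders.
import torch

torch.autograd.set_detect_anomaly(True)  # slow, but names the op that produced the NaN

def training_step(model, batch, optimizer):
    out = model(**batch)
    if torch.isnan(out.logits).any():
        # logits: (batch, seq_len, vocab) -> find which batch rows went bad
        bad_rows = torch.isnan(out.logits).flatten(1).any(dim=1)
        raise RuntimeError(f"NaN logits in rows {bad_rows.nonzero().flatten().tolist()}")
    optimizer.zero_grad()
    out.loss.backward()
    # clip in case the instability is just exploding gradients at batch > 1
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    return out.loss.detach()
```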
-
### Prerequisites
- [X] I am running the latest code. Mention the version if possible as well.
- [X] I carefully followed the [README.md](https://github.com/ggerganov/llama.cpp/blob/master/README.md)…
-
## Better integration of LLM kernel and OS kernel
- [ ] Translate the current implementation into a more efficient one (while remaining cross-platform)
- [ ] Multi-thread/Multi-proce…
-
# URL
- https://arxiv.org/abs/2402.13598
# Affiliations
- Lin Ning, N/A
- Luyang Liu, N/A
- Jiaxing Wu, N/A
- Neo Wu, N/A
- Devora Berlowitz, N/A
- Sushant Prakash, N/A
- Bradley Green, …
-
This document outlines the long-term features in the AIOS roadmap for Q3 2024. Feel free to discuss any of the following topics, and add any other topics you'd like to talk about in this issue.
## …