-
I'm finding this repo to be a user-friendly, extensible, memory-efficient solution for training/fine-tuning models. However, when it comes to inference, there is a usability gap that could be solved b…
-
Context: With HF models, one can use [peft](https://github.com/huggingface/peft) to do parameter-efficient tuning, the most popular (and afaik most performant) method being LoRA.
Idea: It would be …
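A minimal sketch, assuming the standard peft API, of how a trained adapter is attached to a base HF model for inference (the model ID and adapter path below are placeholders):
```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the frozen base model, then wrap it with the trained LoRA adapter.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder base model
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")          # placeholder adapter path

# Optionally fold the adapter weights into the base model for plain inference.
model = model.merge_and_unload()
```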
-
https://stackoverflow.com/questions/58860448/distributed-rules-engine
First of all, from my experience with Drools, it can be applied efficiently to huge volumes of data as well (m…
-
I teach Python for the sciences and I'm trying to understand what is happening with the different projects to improve Python performance.
I tried to follow faster-cpython, but I have to admit that I feel a bit…
-
I have to compute a lot of Hungarian matchings between sets of points, using their distance as the matching criterion. So far I have tried this hybrid SciPy (CPU) and PyTorch method:
```
import numpy …
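# A minimal sketch of the hybrid approach described above: build the pairwise
# distance matrix with torch.cdist (which can run on the GPU), then solve the
# assignment on the CPU with scipy.optimize.linear_sum_assignment.
import torch
from scipy.optimize import linear_sum_assignment

def hungarian_match(a: torch.Tensor, b: torch.Tensor):
    # a, b: (N, D) point sets; returns the matched index pairs and the total cost
    cost = torch.cdist(a, b)          # (N, N) pairwise distances
    cost_np = cost.cpu().numpy()      # move to CPU for SciPy
    row_ind, col_ind = linear_sum_assignment(cost_np)
    return row_ind, col_ind, cost_np[row_ind, col_ind].sum()
```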
-
## Overview
- Memory- and parameter-efficient fine-tuning using LLM.int8() + LoRA
- Planning to train the model with BitsAndBytes + PEFT (a minimal sketch follows this list)
- Using [polyglot-ko-5.8b](https://huggingface.co/EleutherAI/polyglot-ko-5.8b) as the backbone (KoGPT is…
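A minimal sketch of that setup, assuming the standard transformers + peft APIs; the LoRA hyperparameters and target modules below are illustrative, not a fixed config:
```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "EleutherAI/polyglot-ko-5.8b"

# Load the backbone in 8-bit (LLM.int8()) via bitsandbytes.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters (illustrative hyperparameters; polyglot-ko is GPT-NeoX-style,
# so the fused attention projection is named "query_key_value").
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query_key_value"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```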
-
llama.cpp runs incredibly fast on Apple silicon: I ran a pure-CPU build, and it gets close to the memory-bandwidth limit, e.g. 28 tokens/s on an M3 Pro.
llama3.java seems to be rather slow on Apple sili…
-
**Command: tune run lora_finetune_single_device --config llama3_1/8B_lora_single_device**
**Output**:
```
INFO:torchtune.utils._logging:Running LoRAFinetuneRecipeSingleDevice with resolved config:…
-
Hi @NielsRogge
I have fine-tuned my PaliGemma on custom data for an image-to-JSON use case, but when I run inference, some key values come out wrong, e.g. 3000 is extracted as 9000, so to get the data is corr…
-
I'm trying to use QAT to quantize the Qwen2 1.5B model.
The error is raised from the function `training.load_from_full_model_state_dict(model, model_state_dict, self._device, self._is_rank_zero, strict=T…