-
## ❓ Questions and Help
#### What is your question?
Hi, I am getting "OOM during optimization, irrecoverable" when trying to fine-tune the 3.3B parameter NLLB model.
##### Stack trace:
```
…
```
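"OOM during optimization, irrecoverable" appears to be the message fairseq's trainer logs when it cannot recover from a CUDA OOM during the optimizer step. Whatever the training stack, the usual levers are a smaller per-GPU batch with gradient accumulation, mixed precision, and activation checkpointing. A minimal sketch with 🤗 Transformers (the model id is the public NLLB checkpoint; every hyperparameter below is an assumption, not the reporter's configuration):

```python
# Minimal memory-saving sketch for fine-tuning NLLB-200 3.3B.
# All numbers are illustrative assumptions.
from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-3.3B")
model.gradient_checkpointing_enable()   # trade recompute for activation memory

args = Seq2SeqTrainingArguments(
    output_dir="nllb-ft",
    per_device_train_batch_size=1,      # keep the per-GPU batch tiny ...
    gradient_accumulation_steps=16,     # ... and accumulate to an effective batch of 16
    fp16=True,                          # mixed precision halves activation memory
    optim="adafactor",                  # Adafactor keeps far less state than Adam
)
```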
-
In this issue, [comment with `@njzjz-bot `](https://github.com/njzjz/wenxian/issues/23#issuecomment-2121998880):
```
@njzjz-bot 2312.15492
```
[The GitHub Actions workflow will reply](https://github.co…
-
### Question
Great work!
1. According to the paper, the batch size is set to 1152. How many GPUs were used during training?
2. Is the training full fine-tuning or efficient para…
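For context on question 1: the effective batch size factors as GPUs × per-GPU batch × gradient-accumulation steps, so 1152 alone does not pin down a GPU count. The decompositions below are purely illustrative, not claims from the paper:

```python
# Illustrative factorizations of an effective batch of 1152; none of
# these splits are stated in the paper.
effective = 1152
for n_gpus, per_gpu, accum in [(8, 18, 8), (16, 9, 8), (32, 36, 1)]:
    assert n_gpus * per_gpu * accum == effective
    print(f"{n_gpus} GPUs x {per_gpu} per GPU x {accum} accumulation steps")
```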
-
Getting the following error when running the mosaic-bert recipe. It only occurs with bf16; it works with fp32.
```
Traceback (most recent call last):
  File "<string>", line 21, in _bwd_kernel
KeyError: ('2-.-0-.-0-d82…
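If bf16 autocast is what trips the Triton kernel cache, one workaround to try is keeping most of the model in bf16 but forcing fp32 around the failing region. This is a sketch under assumptions: `encoder`, `attn_block`, and the split point are hypothetical stand-ins, not the recipe's actual modules.

```python
import torch

# Stand-ins for the recipe's modules; names are hypothetical.
encoder = torch.nn.Linear(16, 16).cuda()
attn_block = torch.nn.Linear(16, 16).cuda()
batch = torch.randn(4, 16, device="cuda")

# Run most of the forward pass under bf16 autocast, but disable autocast
# around the block whose Triton backward kernel raises the KeyError, so
# that region computes in fp32 (the configuration reported to work).
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    hidden = encoder(batch)
    with torch.autocast(device_type="cuda", enabled=False):
        out = attn_block(hidden.float())
out.sum().backward()
```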
-
Hi @keyvank,
Let's say I want to add a new decoder layer (the one that gets constructed in the 0..num_layers loop) at run time, after the gpt::new() call. How do I go about it? As I understand y…
-
https://virtual2023.aclweb.org/paper_P5680.html
-
kvstore_device allocates a buffer in memory for each GPU; see https://github.com/dmlc/mxnet/blob/master/src/kvstore/kvstore_device.h#L87
It is problematic for big networks such as VGG/AlexNet, and there ar…
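If those per-GPU buffers are the bottleneck, one mitigation (a sketch only, trading aggregation speed for GPU memory) is to aggregate gradients on the CPU by choosing the `local` kvstore instead:

```python
import mxnet as mx

# 'device' aggregates gradients in extra per-GPU buffers (fast, memory-hungry);
# 'local' aggregates on the CPU and avoids allocating those buffers, which can
# matter for large nets such as VGG/AlexNet.
kv = mx.kvstore.create("local")   # instead of mx.kvstore.create("device")
```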
-
This is a long post, and I apologize in advance. I also want to make clear that none of this should be read as a criticism of BERTopic or of the choices about where the product has focused its attention. …
-
## Resources
- The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relating to PyTorch. [[github]](https://github.com/ritchieng/the-incredible-pytorch)
-
I found that with 2d5-7b, the checkpoint saved from LoRA tuning with finetune.py on one GPU is correct, while with multiple GPUs the saved model is incorrect.
Has anyone met a similar problem?
For example…
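A common culprit in this situation (an assumption here, since the report is truncated) is that under data parallelism every rank writes the checkpoint at once, or the DDP-wrapped module is saved instead of the underlying model. A minimal rank-0-only saving sketch with 🤗 Accelerate; the model below is a stand-in, not the 2d5-7b setup:

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()
model = torch.nn.Linear(8, 8)        # stand-in for the LoRA-wrapped model
model = accelerator.prepare(model)

# ... training loop ...

accelerator.wait_for_everyone()               # let every rank finish its last step
unwrapped = accelerator.unwrap_model(model)   # strip the DDP wrapper before saving
if accelerator.is_main_process:               # only rank 0 touches the filesystem
    torch.save(unwrapped.state_dict(), "lora-checkpoint.pt")
    # a PEFT model would call unwrapped.save_pretrained(...) here instead
```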