artidoro qlora issues - Githubissues

artidoro / qlora

QLoRA: Efficient Finetuning of Quantized LLMs

https://arxiv.org/abs/2305.14314

MIT License

10.06k stars 822 forks

issues

How to implement normal float NF4?

#299 XA23i opened 4 days ago
0
llama 3 8b results

#298 dorsa-zeinali opened 2 months ago
0
How are gradients computed during backpropagation?

#297 21-10-4 opened 4 months ago
0
llama 3 -support?

#296 LuoyaoChen opened 5 months ago
0
additional load_in_4bit removed

#295 shirinyamani opened 5 months ago
0
Qlora with flan-t5 issue - ValueError: Trying to set a tensor of shape torch.Size([4096, 4096])

#294 JhonDan1999 opened 5 months ago
0
Paged optimizer vs gradient checkpointing?

#293 LeoPerelli opened 6 months ago
0
Error when loading model

#292 m000lie opened 6 months ago
3
Llama 1 7b MMLU results largely diverges from reported

#291 Edenzzzz opened 7 months ago
0
A critical loss drop happens at the end of each epoch

#290 Coco58323 opened 7 months ago
0
Question about deployment of fine tuned model

#289 Brandon371 opened 8 months ago
0
Fuyu-8B qLora

#288 SinanAkkoyun opened 9 months ago
0
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::BFloat16

#287 andeyeluguo closed 9 months ago
1
[Questions]: How to implement NF4/NF2 matmul kernel function?

#286 llCurious opened 10 months ago
2
Table 4 and Table 5 have different results

#285 lemyx opened 10 months ago
0
[Bug] large CUDA memory usage in the evaluation phase

#284 ChenMnZ opened 10 months ago
1
How to support FLAN v2 dataset.

#283 ChenMnZ opened 11 months ago
0
How do you use oasst1 dataset in qlora.py - why only the 'text' field is used?

#282 Huxwell opened 11 months ago
0
Using QLORA for Multi Modal Vision Foundation Models Optimization - google/owlv2-base-patch16-ensemble

#281 solomonmanuelraj opened 11 months ago
0
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`

#280 Juanerx opened 11 months ago
1
Issue with Yi 34B Training EOS token not working

#279 mrmuke closed 11 months ago
1
Saving/Loading qlora adapters

#278 chrisi2045 opened 12 months ago
1
Merge issue

#277 qburst-fidha opened 1 year ago
0
Multi-GPU Training Giving Different Loss

#276 nikhil-ghosh-berkeley opened 1 year ago
1
Unable to generate predictions

#275 SamarthMM opened 1 year ago
0
Garbage output of Llama-2-13B-chat model after qlora finetuning

#274 cywsg opened 1 year ago
0
Quantization aware finetuning?

#273 SinanAkkoyun opened 1 year ago
0
Qlora Read me fix

#272 Vezora-Corp opened 1 year ago
0
Training on logits rather than tokens?

#271 SinanAkkoyun opened 1 year ago
0
adding lba support for qlora

#270 itayhubara closed 1 year ago
0
extra memory usage for loading the model

#269 XintianHan opened 1 year ago
0
TypeError: 'NoneType' object is not iterable

#268 reilgun opened 1 year ago
3
DDP Training fails

#267 AntoineBlanot closed 1 year ago
1
[XPU] CUDA error when running on arc770 with Intel extension for pytorch

#266 delock opened 1 year ago
1
Can we evaluate only the mmlu_dataset when running sh scripts/finetune_guanaco_7b.sh?

#265 LiZhangMing opened 1 year ago
1
Could not reproduce the results listed in your paper using a single 3090 card.

#264 LiZhangMing opened 1 year ago
6
[Bug Fix] Add importing `warnings`

#263 tongyx361 opened 1 year ago
0
uneven distribution of GPU workload

#262 liatamax opened 1 year ago
1
Question: CUDA memory usage in the evaluation phase

#261 LimboWK opened 1 year ago
2
Why do we print just half of `trainable_params` when using 4-bits?

#260 HanGuo97 opened 1 year ago
0
[Question] Why can we set `model_parallel` and `is_parallelizable` to `True` for whichever `model`?

#259 tongyx361 opened 1 year ago
0
Why do we need the Dequantization process?

#258 nthehai01 opened 1 year ago
0
Fix outdated description of HF arguments in README.md

#257 tongyx361 opened 1 year ago
0
Error invalid device ordinal at line 393

#255 matt-seb-ho opened 1 year ago
0
Should base model be dequantized when merging LoRA weights with base model?

#254 jinyongyoo opened 1 year ago
6
Getting error dataclasses.FrozenInstanceError: cannot assign to field generation_config when executing any of the scripts in the scripts folder with default parameters.

#253 vasuems opened 1 year ago
2
epoch presented does not match the calculation

#252 lijierui opened 1 year ago
0
[Bug] Test set is taken from training set

#251 Peter-Devine opened 1 year ago
1
can be used in stable diffusion?

#250 henbucuoshanghai opened 1 year ago
0
curious about the train speed

#249 JustQJ opened 1 year ago
0