-
Hi, I encountered an issue while using Triton for LoRA finetuning of mpt-storywriter-4bit. The problem occurs when the program reaches the following line of code:
```python
self.fn.run(*args, num_…
```
-
If pad tokens are used and `model.eval(); model.train()` is called, the Unsloth backward pass becomes non-differentiable, resulting in `nan`.
Reproduction script:
```python
import torch
from transf…
```
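A small, generic helper (an illustration of my own, not part of the repro above) that can be dropped in right after `loss.backward()` to see which parameters end up with `nan` gradients:
```python
import torch

def find_nan_grads(model: torch.nn.Module) -> list[str]:
    # Return the names of parameters whose gradients contain NaNs;
    # call this right after loss.backward().
    return [
        name
        for name, p in model.named_parameters()
        if p.grad is not None and torch.isnan(p.grad).any()
    ]
```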
-
Hello,
Following the issue at https://github.com/evo-design/evo/issues/11, which discusses finetuning code for Evo, I am specifically looking for information on which frameworks could be used to opti…
-
Model export: model = AutoModel.from_pretrained(xxx)
model = llm.from_hf(model, tokenizer, dtype = "float16")
model.save(xxx)
Model loading: llm.model(xxx)
Error…
-
Running the tasks with `BAAI/bge-visualized-base-base/m3`, I get errors like the one below:
```
ERROR:mteb.evaluation.MTEB:Error while evaluating InfoSeekIT2TRetrieval: The size of tensor a (516) must m…
```
-
Hi authors,
In the SFTTrainer we set `seed = 3407`, but I find the training procedure is still non-deterministic: the test-set performance and the loss curve differ across runs with the same config.
…
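For reference, a minimal sketch of the determinism settings that usually matter beyond the trainer's `seed` argument (an illustration of the usual knobs, not a confirmed fix for this issue):
```python
import torch
from transformers import set_seed

set_seed(3407)  # seeds python, numpy, and torch (CPU and CUDA) in one call

# Prefer deterministic kernels; warn_only avoids hard errors for ops
# that have no deterministic implementation.
torch.use_deterministic_algorithms(True, warn_only=True)
torch.backends.cudnn.benchmark = False
```
Data-loader shuffling, dropout, and some CUDA matmul kernels can still introduce run-to-run variation even with these set.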
-
# LoRA: Low-Rank Adaptation of Large Language Models
[https://real-science.vercel.app/lora-low-rank-adaption-of-large-language-models](https://real-science.vercel.app/lora-low-rank-adaption-of-larg…
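A minimal sketch of the core idea: the pretrained weight stays frozen and only a low-rank update `B @ A`, scaled by `alpha / r`, is trained (a toy `LoRALinear` wrapper written here for illustration, not the reference implementation):
```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank update: y = Wx + (alpha/r) * B(Ax)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze the pretrained weight
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)   # random init, as in the paper
        self.B = nn.Parameter(torch.zeros(base.out_features, r))          # zero init, so the update starts at 0
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling
```
Wrapping the attention projections of a transformer with such a module and training only `A` and `B` is what makes the method parameter-efficient.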
-
I'm trying to fine-tune BGE-M3 based on the README here: https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune
I originally started with the latest transformers version a few week…
-
## PaddleMIX 2.0 Officially Released
https://github.com/PaddlePaddle/PaddleMIX/tree/v2.0.0
* Multimodal understanding: added the LLaVA series, Qwen-VL, and more; added an Auto module to unify the SFT training pipeline; added the mixtoken training strategy, raising SFT throughput by 5.6x.
* Multimodal generation: released [PPDiffusers 0.24.1](./ppdiffusers…
-
https://github.com/huggingface/peft/issues/286
This issue, which describes a problem with how Alpaca LoRA saved models, is hopefully what is causing the problems. I also see the final adapter.bin f…
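For what it's worth, a small sketch of how I'd sanity-check the saved adapter with the standard peft API (the model name and target modules below are placeholders, not the ones from the linked issue):
```python
import os
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder base model
model = get_peft_model(base, LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"]))

# save_pretrained() writes only the adapter weights (adapter_model.bin or
# adapter_model.safetensors depending on the peft version) plus the config.
model.save_pretrained("adapter_out")

# A near-empty adapter file is the symptom described in the linked issue,
# so check the file sizes after saving.
for name in os.listdir("adapter_out"):
    print(name, os.path.getsize(os.path.join("adapter_out", name)), "bytes")
```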