-
Thanks for your excellent work and code. I have some questions about the code.
1. What is the expected behavior when ``stacking == True`` and ``zero_padding == True`` at the same time?
https://github.com/ATP-1010/FederatedLLM/blob/8efe72d1a7c7…
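For context: two common ways to combine LoRA factors of different ranks across clients are zero-padding every factor to a common rank before averaging, and stacking factors along the rank dimension. Below is a minimal NumPy sketch of both; the function names, shapes, and semantics are my own illustration and may not match what these flags actually do in this repo.

```python
import numpy as np

def zero_pad_aggregate(client_As, client_Bs, weights):
    """Pad heterogeneous-rank LoRA factors to the max rank, then average.
    client_As[i]: (r_i, in_features); client_Bs[i]: (out_features, r_i).
    Caveat: averaging A and B separately makes B_agg @ A_agg contain
    cross-client terms, a known drawback of the padding approach."""
    r_max = max(A.shape[0] for A in client_As)
    A_agg = np.zeros((r_max, client_As[0].shape[1]))
    B_agg = np.zeros((client_Bs[0].shape[0], r_max))
    for A, B, w in zip(client_As, client_Bs, weights):
        r = A.shape[0]
        A_agg[:r] += w * A        # rows r..r_max stay zero
        B_agg[:, :r] += w * B     # cols r..r_max stay zero
    return A_agg, B_agg

def stack_aggregate(client_As, client_Bs, weights):
    """Stack factors along the rank axis instead; here
    B_stacked @ A_stacked == sum_i w_i * (B_i @ A_i) exactly."""
    A_stacked = np.concatenate(client_As, axis=0)
    B_stacked = np.concatenate([w * B for B, w in zip(client_Bs, weights)], axis=1)
    return A_stacked, B_stacked
```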
-
- [ ] [DeepSeek-V2: A Strong, Economical, and Efficient MoE LLM of 236B total parameters](https://github.com/deepseek-ai/DeepSeek-V2)
# DeepSeek-V2: A Strong, Economical, and Efficient MoE LLM of 236B total parameters
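Since the headline here is the MoE architecture, a reminder of the general top-k routing mechanism may be useful. This is a generic PyTorch sketch, not DeepSeek-V2's actual implementation (which layers further refinements, such as shared experts, on top):

```python
import torch
import torch.nn.functional as F

def moe_forward(x, gate, experts, k=2):
    """Generic top-k MoE routing.
    x: (tokens, d_model); gate: nn.Linear(d_model, n_experts);
    experts: list of per-expert feed-forward modules."""
    logits = gate(x)                                    # (tokens, n_experts)
    probs = F.softmax(logits, dim=-1)
    topk_p, topk_i = probs.topk(k, dim=-1)              # route each token to k experts
    topk_p = topk_p / topk_p.sum(dim=-1, keepdim=True)  # renormalize over chosen experts
    out = torch.zeros_like(x)
    for e, expert in enumerate(experts):
        token_idx, slot = (topk_i == e).nonzero(as_tuple=True)
        if token_idx.numel():
            out[token_idx] += topk_p[token_idx, slot].unsqueeze(-1) * expert(x[token_idx])
    return out

# usage (toy):
# d, n = 16, 4
# gate = torch.nn.Linear(d, n)
# experts = [torch.nn.Sequential(torch.nn.Linear(d, 4 * d), torch.nn.GELU(),
#                                torch.nn.Linear(4 * d, d)) for _ in range(n)]
# y = moe_forward(torch.randn(8, d), gate, experts, k=2)
```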
-
https://lightning.ai/pages/community/finetuning-falcon-efficiently/
-
# URL
- https://arxiv.org/abs/2306.09782
# Affiliations
- Kai Lv, N/A
- Yuqing Yang, N/A
- Tengxiao Liu, N/A
- Qinghui Gao, N/A
- Qipeng Guo, N/A
- Xipeng Qiu, N/A
# Abstract
- Large Lan…
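This appears to be the LOMO paper (full-parameter fine-tuning with limited resources). Its central idea is to fuse gradient computation with the parameter update, so the gradients of all parameters never need to be held in memory at once. A rough sketch of that idea using PyTorch tensor hooks; the real implementation handles memory, gradient clipping, and mixed precision far more carefully:

```python
import torch

def attach_fused_sgd_hooks(model, lr=1e-3):
    """LOMO-style fused update sketch: apply the SGD step inside each
    parameter's gradient hook, so full-model gradients never coexist in
    memory. (Illustrative only; the paper discusses why updating
    mid-backward is acceptable in their SGD setting.)"""
    def make_hook(p):
        def hook(grad):
            with torch.no_grad():
                p.add_(grad, alpha=-lr)       # update immediately...
            return torch.zeros_like(grad)     # ...and replace the grad that
                                              # would otherwise be accumulated
        return hook
    for p in model.parameters():
        if p.requires_grad:
            p.register_hook(make_hook(p))

# usage: after attaching hooks, a plain loss.backward() performs the update
# as gradients stream in; no optimizer.step() is needed.
```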
-
**Command**: `tune run lora_finetune_single_device --config llama3_1/8B_lora_single_device`
**Output**:
```
INFO:torchtune.utils._logging:Running LoRAFinetuneRecipeSingleDevice with resolved config:…
```
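For reference, the LoRA layers this recipe trains compute a low-rank delta on top of a frozen base weight, y = W0·x + (α/r)·B·A·x. A minimal generic sketch of that formulation (standard LoRA, not torchtune's internal module):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Generic LoRA layer: y = W0 x + (alpha / r) * B(A(x)), with W0 frozen."""
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze the pretrained weight
        self.A = nn.Linear(base.in_features, r, bias=False)
        self.B = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.B.weight)        # start with a zero delta
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * self.B(self.A(x))

# usage: layer = LoRALinear(nn.Linear(4096, 4096), r=8, alpha=16)
```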
-
Hi @NielsRogge
I have fine-tuned my PaliGemma model on custom data for an image-to-JSON use case, but at inference time some key values come out wrong (e.g., 3000 is extracted as 9000), so to get the data is corr…
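One way to narrow such digit errors down before changing the training setup is a per-key error report over the eval set, to see whether mistakes concentrate on numeric fields. A hedged sketch (the helper name and record format below are illustrative):

```python
import json

def field_error_report(pred_jsons, gold_jsons):
    """Compare predicted vs. gold JSON records key by key to see which
    fields (e.g., numeric amounts) the model gets wrong most often."""
    errors = {}
    for pred, gold in zip(pred_jsons, gold_jsons):
        for key, gold_val in gold.items():
            if pred.get(key) != gold_val:
                errors.setdefault(key, []).append((pred.get(key), gold_val))
    return errors

# usage:
# report = field_error_report([json.loads(p) for p in predictions],
#                             [json.loads(g) for g in references])
# for key, pairs in report.items():
#     print(key, len(pairs), pairs[:3])
```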
-
Questions we want to answer:
- Should we use pre-trained embeddings or the whole model?
- Why or why not?
- What are some major fine-tuning strategies and what are their benefits and drawbacks? (A short sketch follows this list.)
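To make the first two questions concrete: with pre-trained transformers, the two extremes are feature extraction (freeze the pretrained encoder and train only the task head; cheap, and less prone to overfitting on small data) and full fine-tuning (everything trainable; better task fit, but more compute and a risk of catastrophic forgetting). A minimal sketch; the checkpoint name is just an example:

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Strategy 1: feature extraction. Freeze the pretrained encoder; only the
# randomly initialized classification head receives gradient updates.
for p in model.base_model.parameters():
    p.requires_grad = False

# Strategy 2: full fine-tuning. Leave everything trainable (the default);
# typically use a small learning rate to avoid destroying pretrained features.
# for p in model.parameters():
#     p.requires_grad = True
```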
Rela…
-
Can we fine-tune Mistral on a custom dataset in the field of digital marketing/marketing communication?
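For context, domain-specific supervised fine-tuning for a model like Mistral typically starts from instruction/response pairs drawn from the domain, e.g. briefs, taglines, and campaign copy. A hedged sketch of such records (the file name and fields are illustrative, not a required schema):

```python
import json

# Illustrative instruction-tuning records for a marketing-domain dataset.
records = [
    {
        "instruction": "Write a product tagline for an eco-friendly water bottle.",
        "response": "Hydration that loves the planet back.",
    },
    {
        "instruction": "Summarize this campaign brief in two sentences.",
        "response": "...",
    },
]

# Write one JSON object per line, the format most SFT loaders accept.
with open("marketing_sft.jsonl", "w") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")
```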
-
I want to run the [sft](https://github.com/huggingface/peft/tree/main/examples/sft) example, but I get some errors. Can you help me find the problem?
I run [run_peft_fsdp.sh](https://github.com/huggin…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
- `llamafactory` version: 0.9.2.dev0
- Platform: Linux-5.15.0-1070-aws-x86_64-with-glibc2.31
- Python v…