-
Hello,
I am trying to fine-tune Llama 3.1 on my custom dataset. I have access to a two-node cluster with 4 GPUs on each node. I am pretty new to fine-tuning on a multi-node cluster. With whatever …
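For reference, here is a minimal sketch of what a two-node launch could look like; the script name, rendezvous hostname, and port below are hypothetical:
```python
# Run the same command on both nodes, with --node_rank 0 on the first
# node and --node_rank 1 on the second (hostname/port are placeholders):
#
#   torchrun --nnodes 2 --nproc_per_node 4 --node_rank <0|1> \
#       --rdzv_backend c10d --rdzv_endpoint node0:29500 finetune.py
#
import os
import torch
import torch.distributed as dist

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE in the environment.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    # ... build the model here, wrap it with DistributedDataParallel or
    # FSDP, and run the usual training loop ...
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```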
-
### 🚀 The feature, motivation and pitch
## Motivation: Limitation of Existing Profiling Approach
To conduct PyTorch distributed training performance analysis, the currently recommended way is to profil…
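The excerpt is cut off, but the existing approach it refers to is per-rank profiling with torch.profiler; a minimal sketch (the model and step counts are placeholders):
```python
import torch
from torch.profiler import ProfilerActivity, profile, schedule

model = torch.nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Profile a handful of training steps; in a distributed job each rank
# writes its own trace, viewable e.g. in TensorBoard.
with profile(
    activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
    schedule=schedule(wait=1, warmup=1, active=3),
    on_trace_ready=torch.profiler.tensorboard_trace_handler("./traces"),
) as prof:
    for _ in range(5):
        x = torch.randn(64, 1024, device="cuda")
        loss = model(x).sum()
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        prof.step()
```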
-
When doing large-scale runs, we found that compiled_rmsnorm was producing aberrant loss curves compared to TP or async TP with rmsnorm.
Verified this repros at small scale, and thus opening an issue for …
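The original repro is cut off above, but a hypothetical minimal check along these lines would contrast torch.compile output against an eager RMSNorm (an illustrative sketch, not the actual compiled_rmsnorm code):
```python
import torch

class RMSNorm(torch.nn.Module):
    """Plain RMSNorm: x * rsqrt(mean(x^2) + eps) * weight."""
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.eps = eps
        self.weight = torch.nn.Parameter(torch.ones(dim))

    def forward(self, x):
        var = x.pow(2).mean(dim=-1, keepdim=True)
        return x * torch.rsqrt(var + self.eps) * self.weight

norm = RMSNorm(4096).cuda()
compiled_norm = torch.compile(norm)

x = torch.randn(8, 4096, device="cuda")
# A large max difference here would point at a numerics bug in the
# compiled kernel rather than in the surrounding parallelism.
print((norm(x) - compiled_norm(x)).abs().max())
```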
-
GPU setup: 2x V100 32G (four in total; two are currently occupied by someone else, and all four V100s can be used once they free up).
With the default accelerate configuration I get an error: CUDA out of memory. I noticed that in the default config the offload_optimizer_device and offload_param_device parameters are both none, so following the accelerate tutorial I set both parameters to cpu, which raises this error:
![image](h…
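For reference, the same two offload settings can be passed programmatically through accelerate's DeepSpeedPlugin; the ZeRO stage below is an assumption for illustration:
```python
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin

# Offloading optimizer state and parameters to CPU trades GPU memory
# for host RAM and PCIe traffic; zero_stage=3 is an illustrative choice.
ds_plugin = DeepSpeedPlugin(
    zero_stage=3,
    offload_optimizer_device="cpu",
    offload_param_device="cpu",
)
accelerator = Accelerator(deepspeed_plugin=ds_plugin)
```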
-
Hello,
I am encountering an issue when running the following code snippet:
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nnodes 1 --nproc_per_node 4 llama_finetuning.py \
--enable_fsdp \
--…
-
Hi, I was looking to contribute to this open-source repo. I saw these items in the TODO:
“think through support for Llama 3 models > 8B in size”
“make finetuning more full featured, more similar to nanoGPT (mi…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### Reproduction
I pulled the new code and ran Accelerate + FSDP + QLoRA training, but encountered an error:
![image](https:/…
-
Let’s face it. KenLM has served us well…
…but it has its limitations. It hasn’t aged well as a language model architecture.
The first order of business is to compute a bidirectional vector representa…
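The post is cut off, but one way to get such a bidirectional representation is a BERT-style encoder; a minimal sketch using Hugging Face transformers (the model choice is an assumption):
```python
import torch
from transformers import AutoModel, AutoTokenizer

# bert-base-uncased is a placeholder; any bidirectional encoder works.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("KenLM has served us well.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Each token vector is conditioned on both left and right context,
# unlike an n-gram model such as KenLM, which only looks leftward.
token_vectors = outputs.last_hidden_state  # (1, seq_len, hidden_dim)
print(token_vectors.shape)
```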
-
How to fine-tune vicuna-7b with an A40
-
Hi @philschmid,
When I try to increase the chunk length beyond 2048, training fails with an OOM error on a g5.4xlarge.
It totally makes sense why it's happening; my question i…
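For context, activation memory grows quickly with chunk length (attention is quadratic in sequence length), and a g5.4xlarge has a single 24 GB A10G. A common mitigation is gradient checkpointing; a minimal sketch with transformers (the model name is a placeholder):
```python
from transformers import AutoModelForCausalLM

# Gradient checkpointing recomputes activations during the backward
# pass, trading compute for memory so longer chunks can fit.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model.gradient_checkpointing_enable()
model.config.use_cache = False  # the KV cache is not needed in training
```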