fsdp Search Results - Githubissues

1000+ results
for fsdp

Best match

Best match Most commented Newest Recently updated Least commented Oldest Least recently updated

pytorch/xla #5089

check failed ShapeUtil::Compatible error for FSDP job

I recently implemented my own model using torch xla FSDP on GPU, but encountered an error message: "Check failed: ShapeUtil::Compatible". `2023-05-26 17:36:08.508196: F external/org_tensorflow/ten…

anw90 updated 10 months ago
7
pytorch/xla #7987

Speeding up computation while using SPMD on large TPU pod

## ❓ Questions and Help When running on vp-128 TPU pod (even when sharding only by batch dimension) we are experiencing very low performance comparing to the same pod without SPMD. Do you have any…

dudulightricks updated 3 weeks ago
3
tatsu-lab/stanford_alpaca #317

SFT Mistral；

torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

feiying12343 updated 4 days ago
2
pytorch/xla #6766

How to implement parrallel training across TPU device with X…

I found the latest opensource LLM from google: Gemma has two version of model structure. 1. https://github.com/google/gemma_pytorch/blob/main/gemma/model_xla.py 2. https://github.com/google/gemma_…

Mon-ius updated 6 months ago
13
axolotl-ai-cloud/axolotl #1534

OPRO, DPO don't work with Mixtral-8x22B. FSDP + QLORA & bigs…

### Please check that this issue hasn't been reported before. - [X] I searched previous [Bug Reports](https://github.com/OpenAccess-AI-Collective/axolotl/labels/bug) didn't find any similar reports. …

0-hero updated 5 months ago
1
pytorch/pytorch #101443

Enhance FSDP debugability

### 🚀 The feature, motivation and pitch Some items we can add under torch_distributed_debug mode to improve debugability of FSDP: - Shared parameter detection - Logging when backward hooks are fi…

rohan-varma updated 1 year ago
2
anarchy-ai/LLM-VM #218

Implement FSDP for training large datasets

Definition of done: Implement training large models using FSDP to accelerate training on large datasets. Reference: https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api/

TheRealVish updated 7 months ago
1
facebookresearch/dinov2 #265

PyTorch 2.1 support

Looks like there are some breaking changes to the FSDP API in PyTorch 2.1. For example, `dinov2.fsdp.__init__.py::free_if_fsdp` is broken when using torch==2.1: `AttributeError: 'DinoVisionTransfor…

schmidt-ai updated 11 months ago
3
meta-llama/llama-recipes #655

FP8 support for training

### 🚀 The feature, motivation and pitch Is there a plan to add FP8 support for training? ### Alternatives _No response_ ### Additional context _No response_

mathmax12 updated 2 weeks ago
2
axolotl-ai-cloud/axolotl #1191

FSDP Full-finetuned Model params and weights are NAN

### Please check that this issue hasn't been reported before. - [X] I searched previous [Bug Reports](https://github.com/OpenAccess-AI-Collective/axolotl/labels/bug) didn't find any similar reports. …

hahmad2008 updated 7 months ago
10

上一页 1...20 21 22 23 24 25 26...100 下一页

1000+ results for fsdp

1000+ results
for fsdp