-
https://github.com/pytorch/ao/blob/main/torchao/float8/fsdp_utils.py#L44-L48
Should it be `raise NotImplementedError("Only supports dynamic scaling")`?
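A minimal sketch of what the suggested change might look like, assuming the lines in question currently guard the scaling type with a bare `assert`; the helper name and string-based check below are hypothetical:
```python
# Hypothetical helper illustrating the suggested change: raise an explicit
# NotImplementedError instead of asserting, so unsupported scaling modes
# fail with a clear message even when Python runs with assertions disabled.
def _check_scaling_is_dynamic(scaling_type: str) -> None:
    if scaling_type != "dynamic":  # placeholder check; the real code uses torchao's scaling-type enum
        raise NotImplementedError("Only supports dynamic scaling")
```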
-
I'm trying to fine-tune _BAAI/bge-m3_, a model whose max sequence length is 8k, on a retrieval task. Here's my trainer setup:
```python
model = SentenceTransformer(model_id, de…
```
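For context, a minimal sketch of a bge-m3 retrieval fine-tune with the v3 `SentenceTransformerTrainer` API; the dataset, loss, and training arguments below are placeholders, not the poster's actual setup:
```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-m3")
model.max_seq_length = 8192  # bge-m3 supports sequences up to 8k tokens

# Placeholder (query, positive passage) pairs for a retrieval task.
train_dataset = Dataset.from_dict({
    "anchor": ["what does FSDP stand for?"],
    "positive": ["FSDP stands for Fully Sharded Data Parallel."],
})

loss = MultipleNegativesRankingLoss(model)
args = SentenceTransformerTrainingArguments(
    output_dir="bge-m3-retrieval",
    per_device_train_batch_size=2,  # long sequences are memory-hungry
    num_train_epochs=1,
)
trainer = SentenceTransformerTrainer(
    model=model, args=args, train_dataset=train_dataset, loss=loss
)
trainer.train()
```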
-
Following the instructions in the [HyperPod EKS workshop](https://catalog.workshops.aws/sagemaker-hyperpod-eks/en-US/02-fsdp/02-train), running the FSDP EKS example on 2 p5 nodes fails with the followi…
-
Thank you for your work!
I was using FSDP + QLoRA to fine-tune Llama 3 70B on 8× A100 80G, and I encountered this error:
```shell
Traceback (most recent call last):
  File "/mnt/209180/qis…
```
-
## 📚 Documentation
There have been some common questions about FSDP regarding how to wrap the model, how `flatten_parameters` works, etc. I think we should add an FAQ section to the https://github.com/pytorc…
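As one example of the kind of snippet such an FAQ could include, here is a hedged sketch of wrapping a model with PyTorch's FSDP using a size-based auto-wrap policy; the toy model and the 1M-parameter threshold are arbitrary:
```python
import functools
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp.wrap import size_based_auto_wrap_policy

# Toy model; in practice this would be the user's transformer.
# Assumes torch.distributed is already initialized (e.g. via torchrun).
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024))

# Wrap submodules with more than 1M parameters in their own FSDP unit;
# FSDP flattens each unit's parameters into a single sharded FlatParameter.
wrap_policy = functools.partial(size_based_auto_wrap_policy, min_num_params=1_000_000)
fsdp_model = FSDP(model, auto_wrap_policy=wrap_policy)
```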
-
May I ask why, given the relatively small size of the TinyLlama model, the strategy was set to FSDP (Fully Sharded Data Parallel) instead of DDP (Distributed Data Parallel)…
-
```
[rank0]: File "/opt/venv/lib/python3.10/site-packages/lightning_fabric/wrappers.py", line 411, in _capture
[rank0]: return compile_fn(*args, **kwargs)
[rank0]: File "/opt/venv/lib/pytho…
```
-
Hi, I tried to resume my training from an intermediate checkpoint file with `cfg.MODEL.WEIGHTS` & `no_resume=False`, but it didn't work. The checkpointer cannot locate the checkpoint file, as there are 8 fil…
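If the 8 files are per-rank FSDP shards (one per GPU), one common workaround is to gather a full state dict on rank 0 and save that single file, which a `cfg.MODEL.WEIGHTS`-style loader can then find. A hedged sketch, assuming `model` is already wrapped in FSDP and the process group is initialized:
```python
import torch
from torch.distributed.fsdp import (
    FullyShardedDataParallel as FSDP,
    FullStateDictConfig,
    StateDictType,
)

# Gather the unsharded parameters on rank 0, offloaded to CPU to avoid OOM.
save_policy = FullStateDictConfig(offload_to_cpu=True, rank0_only=True)
with FSDP.state_dict_type(model, StateDictType.FULL_STATE_DICT, save_policy):
    full_state = model.state_dict()

if torch.distributed.get_rank() == 0:
    torch.save(full_state, "consolidated_model.pth")  # single-file checkpoint
```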
-
### Bug description
PyTorch Lightning is taking more memory than plain PyTorch FSDP.
I'm able to train the gemma-2b model, but it takes 3 times more memory.
For openchat, it goes out of memory.
…
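For comparison, a minimal sketch of the knobs that usually matter for FSDP memory in Lightning (wrap policy and activation checkpointing); the `Block` class below is a placeholder for the model's actual transformer/decoder layer class:
```python
import torch.nn as nn
from lightning.pytorch import Trainer
from lightning.pytorch.strategies import FSDPStrategy

# Placeholder block type; for gemma-2b this would be its decoder layer class.
class Block(nn.Module):
    pass

strategy = FSDPStrategy(
    auto_wrap_policy={Block},                 # shard each block as its own FSDP unit
    activation_checkpointing_policy={Block},  # recompute activations to save memory
)
trainer = Trainer(accelerator="gpu", devices=2, precision="bf16-mixed", strategy=strategy)
```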
-
Hi, I am running the code and installed the necessary packages from requirements.txt. It pins pytorch-lightning==1.9.1, but when I run `pip show pytorch-lightning`
in the terminal it shows
```
…