-
### Summary
Last year, we released [pytorch-labs/torchao](https://github.com/pytorch-labs/ao) to provide acceleration of Generative AI models using native PyTorch techniques. Torchao added support …
-
To repro: `python test/prototype/test_low_bit_optim.py TestFSDP2.test_fsdp2`
Logs
```
- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short…
```
-
Hello OLMoE team,
I’m currently exploring training scripts for models using Mixture of Experts (MoE) and was wondering if there are any existing or planned scripts that handle expert parallelism du…
-
### 🐛 Describe the bug
With https://github.com/pytorch/pytorch/pull/126578, I am seeing many issues with dynamo tracing of nn.Parameter construction. My PR does not do anything special with nn.Param…
-
### 🚀 The feature, motivation and pitch
Sharing a repro for @bdhirsh, @tugsbayasgalan on the gaps in torch.compile for FSDP2 fp8 all-gather.
For FSDP2 fp8 all-gather, it's critical to pre-compute ama…
-
Thank you for sharing the fantastic work.
Since I do not have a SLURM cluster, is there DDP training code available?
Or can anyone help?
-
As someone who used this library for a while in prod, then gave up, I'd honestly recommend just dropping it to simplify the code. There are several issues:
- it isn't being very actively maintaine…
-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/axolotl-ai-cloud/axolotl/labels/bug) and didn't find any similar reports.
###…
-
#### I'm attempting to fine-tune FastChat T5 locally using the command:
torchrun --nproc_per_node=1 --master_port=9778 fastchat/train/train_flant5.py \
--model_name_or_path {my_path}/test_fa…
-
I'm running Llama-2-1.7b-hf + FSDP + XLA,
but the process list shows ` 523777 517263 0 80 0 - 0 - 10:22 ? 00:00:00 [ptxas] `.
I used gdb to debug process `517263`, which shows this backtrace…