-
@shuaijiang As the title says, saving the fine-tuned model raises an error:
Some tensors share memory, this will lead to duplicate memory on disk and potential differences when loading them again: {failing}.
A potential way to correc…
-
### 🐛 Describe the bug
`torchdata` does not work with torch 2.3.0 because `DILL_AVAILABLE` is not available where expected:
```
Python 3.12.2 (main, Feb 6 2024, 20:19:44) [C…
```
-
Refer to https://swift.readthedocs.io/zh-cn/latest/Multi-Modal/qwen2-vl%E6%9C%80%E4%BD%B3%E5%AE%9E%E8%B7%B5.html
[rank0]: File "/usr/local/lib/python3.10/site-packages/transformers/trainer.py", …
-
We currently don't have a built-in way to shard an `S3IterableDataset`, so every worker process in a `DataLoader` sees the same stream of objects. We should provide a way to do this.
In the…
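
As a rough illustration of the sharding being requested, here is a minimal pure-Python sketch of round-robin sharding, where each worker keeps every `num_workers`-th object. The helper `shard_for_worker` and the example keys are hypothetical and not part of any library. Inside a real `DataLoader` worker, the two integers would presumably come from `torch.utils.data.get_worker_info()` (its `id` and `num_workers` fields).

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def shard_for_worker(stream: Iterable[T], worker_id: int, num_workers: int) -> Iterator[T]:
    """Return every num_workers-th item of the stream, starting at offset worker_id.

    Hypothetical helper for illustration only.
    """
    if not 0 <= worker_id < num_workers:
        raise ValueError("worker_id must be in [0, num_workers)")
    return islice(stream, worker_id, None, num_workers)


# Made-up object keys standing in for an S3 listing (assumption).
keys = [f"s3://bucket/obj-{i}" for i in range(7)]

# Each of three simulated workers gets a disjoint slice of the stream.
shards = [list(shard_for_worker(keys, w, 3)) for w in range(3)]
```

With this scheme the shards are disjoint and together cover the full stream, at the cost of every worker still iterating over the whole listing.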
-
Currently, when there are two device meshes (`tp` and `dp`), torchtitan should choose FSDP as the **only** backend for DP. Ref:
https://github.com/pytorch/torchtitan/blob/d2a4904f58accc683c17c66a3600…
-
## 🐛 Bug
## To Reproduce
Steps to reproduce the behavior:
1.
1.
1.
## Expected behavior
## Environment
- DGL Version (e.g., 1.0):
- Backend Library & Version (e.g., …
-
The release of TorchText 0.18 is currently blocked because it uses TorchData 7.1, which is not compatible with PyTorch 2.3 (https://github.com/pytorch/text/actions/runs/8365635868/job/22903939847#step:13:291)…
PaliC updated 6 months ago
-
I tried to run the evaluation using the pretrained checkpoint [PerAct - 512 Latents](https://github.com/peract/peract/releases/download/v1.0.0/peract_600k_512latents.zip) and the val dataset provided.…
-
### Feature request
Add official support for `StatefulDataLoader` as in [torchdata](https://github.com/pytorch/data/tree/main/torchdata/stateful_dataloader) and [datasets](https://huggingface.co/do…
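
To make the idea behind this request concrete, here is a small pure-Python sketch of a loader that can checkpoint and restore its position mid-epoch. `ResumableBatches` is a hypothetical stand-in written for illustration. It is not the torchdata implementation, and only the `state_dict()` / `load_state_dict()` method names are borrowed from `StatefulDataLoader`.

```python
from typing import Any, Dict, Iterator, List, Sequence


class ResumableBatches:
    """Hypothetical loader: yields fixed-size batches and remembers its position."""

    def __init__(self, data: Sequence[Any], batch_size: int) -> None:
        self.data = data
        self.batch_size = batch_size
        self._next_index = 0  # index of the first item of the next batch

    def __iter__(self) -> Iterator[List[Any]]:
        while self._next_index < len(self.data):
            start = self._next_index
            # Advance before yielding so a checkpoint taken after this
            # batch resumes at the following one.
            self._next_index = start + self.batch_size
            yield list(self.data[start:start + self.batch_size])
        self._next_index = 0  # epoch finished: start over next time

    def state_dict(self) -> Dict[str, int]:
        return {"next_index": self._next_index}

    def load_state_dict(self, state: Dict[str, int]) -> None:
        self._next_index = state["next_index"]


# Consume one batch, checkpoint, then resume in a fresh loader.
loader = ResumableBatches(list(range(10)), batch_size=4)
first = next(iter(loader))
state = loader.state_dict()

resumed = ResumableBatches(list(range(10)), batch_size=4)
resumed.load_state_dict(state)
remaining = list(resumed)
```

The resumed loader continues from the batch after the checkpoint instead of replaying the epoch from the start, which is the behaviour a trainer would rely on when restoring from a mid-epoch checkpoint.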
-
### Bug description
Training freezes when using `ddp` on a Slurm cluster (`dp` runs as expected). The dataset is loaded via torchdata from an S3 bucket. Similar behaviour also arises when using webda…