-
**Describe the bug**
As shown in [this notebook](https://gist.github.com/josephrocca/9ec65e8e5804286a475b5b6da85f7a28), I run these commands:
```py
pip install deepspeed --upgrade
git clone https:…
-
**Describe the bug**
When I run train.py (stage I), I get the following error: `RuntimeError: derivative for aten::grid_sampler_2d_backward is not implemented`
This happens in [accumulate_gradient…
-
### Describe the bug
12/07/2023 07:37:24 - INFO - __main__ - ***** Running training *****
12/07/2023 07:37:24 - INFO - __main__ - Num examples = 833
12/07/2023 07:37:24 - INFO - __main__ - Num …
-
Hi, I have deployed pytorch-operator for distributed training on a k8s cluster and have struggled with this issue for a while.
Here is my yaml.
(My k8s can only schedule two nodes, named gpu-233 and…
-
### Issue Description
I have waited for a few releases, but I can't get the API docs working. I did a clean install several times, including fresh virtual environments, but after starting the `/docs` url s…
-
COMMAND:
python3 run_facial_editing.py --source_path ./images/selfie4.jpg --output_path ./output/facial_editing --directions 0 1 2 3 4 --image_resolution 1024 --dataset_type ffhq --save_images --opti…
-
**Describe the bug**
Hi,
I'm trying to use a GPU system on our local network. However, I'm running into issues.
Basic question: Does the runhouse package need to be installed on the remote gpu syst…
-
# stable diffusion
>Stable Diffusion is a very practical AI painting tool; its free, open-source nature and its efficiency and practicality offer users more possibilities
>[Stability-AI/stablediffusion: High-Resolution Image Synthesis with Latent Diffusion Models (github.com)](https://github…
-
Hi, I am trying to run the model on Lambda Labs GPU instances (A10 and H100). However, I get an OutOfMemoryError every time on both of them.
```python
torch.cuda.OutOfMemoryError: CUDA out of…
-
### Describe the bug
Hello 🤗.
I'm trying to train a LoRA SDXL on a Google Cloud TPU v3-8 machine with this script: [train_text_to_image_lora_sdxl.py](https://github.com/huggingface/diffusers/blob/c8…