-
I can fine-tune a pretrained model on two nodes outside Docker, following several steps, and get better precision. But when I do the same thing inside Docker (without any error messages), following the same steps, in t…
-
https://godbolt.org/z/vovWPxexE
## Expected output
Firstly, the expected "canonical" output, which we get both for `_BitInt(512)` additions and for `__builtin_addcll`, looks as follows:
```cpp
using…
-
## Description
Verify that we can fine-tune a model with multi-GPU sharding (FSDP) in bf16.
## Discussion
Provide detailed discussion here
## Acceptance Criteria
- [ ] Uni…
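One way this could be exercised (a sketch only — `scripts/train.py` and its arguments are placeholders, not this repo's actual CLI) is via HuggingFace Accelerate's built-in FSDP support:

```shell
# Hedged sketch: launch a fine-tuning script sharded across 2 GPUs with FSDP
# and bf16 mixed precision via `accelerate launch`.
# `scripts/train.py` is a hypothetical entry point, not this repo's script.
accelerate launch \
  --num_processes 2 \
  --use_fsdp \
  --mixed_precision bf16 \
  scripts/train.py
```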
-
Hello, thank you for your excellent work. When I was training faceid_plus, the generated images became worse as the number of training steps increased. I used about 1,500,000 images for training. What may be th…
-
Right now ``explain`` only produces a precision-recall curve for binary classification. I think we should support multi-class classification as well.
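A minimal sketch of what this could look like (not ``explain``'s actual API — just the standard one-vs-rest approach with scikit-learn, one precision-recall curve per class):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.preprocessing import label_binarize

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)
scores = clf.predict_proba(X)                    # (n_samples, n_classes)
y_bin = label_binarize(y, classes=clf.classes_)  # one binary column per class

# One-vs-rest: treat each class as the positive label in turn.
curves = {}
for i, cls in enumerate(clf.classes_):
    precision, recall, _ = precision_recall_curve(y_bin[:, i], scores[:, i])
    curves[cls] = (precision, recall)

print(len(curves))  # → 3 (one curve per class)
```

Each curve could then be drawn on the same axes the binary case already uses.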
-
I use the default train.sh:
```
accelerate launch --mixed_precision='bf16' scripts/train.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_data_dir=$DATASET_NAME \
  --train_data_meta=$DATASET_META…
```
-
Here is what I have for the 960x960 training script. Do you need to change video_sample_size and image_sample_size to 960 to train the 960x960 model?
```
export MODEL_NAME="models/Diffusion_Transf…
-
This issue will span picking up and placing samples.
-
Hi there,
SD training on 1 GPU works just fine,
but as soon as I enable multi-GPU with 2 GPUs I get this error:
![Clipboard_08-18-2024_01](https://github.com/user-attachments/assets/8dc2bd36-ddc…
-
Enable creating engines (currently MXR files; eventually, perhaps, dynamic objects) without embedding the weights in the engine.
Use cases:
(1) Support compilation for various batch sizes without …