Open AIRenaissance opened 1 year ago
I ran "pip install bitsandbytes-cuda117" so that it matches my GPU, but it still did not train my model.
I am experiencing the same thing on both this fork and the huggingface/diffusers repo.
I also have this issue. The training runs flawlessly...but nothing actually happens. I set up this repository on a new machine last week in WSL2 Ubuntu 20.04, so it may have been a recent change.
@kudou-reira Do you know the commit of the last working version?
main 47f456e [origin/main] Update to script for ckpt conversion of 2.0 models (#169)
is the commit that works on a different machine. I did a rollback on my newer machine to that commit and am currently running a training. We'll see if it works.
I also had this issue.
It works perfectly fine on Colab but not on my machine. Colab used CUDA 11.6, while I am using CUDA 11.7 as well.
It appears that your PyTorch version in combination with your CUDA version does not work with xformers.
An explanation is given here: https://github.com/facebookresearch/xformers/issues/631#issuecomment-1414421325
What worked for me with the same PyTorch and CUDA versions:
pip uninstall xformers
pip install xformers==0.0.17.dev447
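For anyone comparing setups in this thread, here is a minimal sketch for collecting the relevant versions in one place. The helper names (`installed_version`, `report_versions`) are made up for illustration, and the package names are assumptions based on what is mentioned above; adjust them to whatever is actually installed. It only reads package metadata and, if PyTorch can be imported, the CUDA version PyTorch was built against.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report_versions(
    packages=("torch", "xformers", "diffusers", "bitsandbytes", "bitsandbytes-cuda117"),
):
    """Collect version strings for the packages discussed in this thread."""
    report = {name: installed_version(name) for name in packages}
    try:
        # Optional: only used to read the CUDA build of PyTorch.
        import torch

        report["torch_cuda_build"] = torch.version.cuda
        report["cuda_available"] = torch.cuda.is_available()
    except (ImportError, OSError):
        # PyTorch is missing or its native libraries failed to load.
        report["torch_cuda_build"] = None
        report["cuda_available"] = None
    return report


if __name__ == "__main__":
    for name, value in report_versions().items():
        print(f"{name}: {value}")
```

Posting this output next to the training command makes it easier to see whether two machines really have the same PyTorch, CUDA, and xformers combination.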
In my case I am not using xformers so there's something else going on.
I installed the newer xformers, but my trained model is still giving incorrect results. The problem before was that the fine-tuning ran but had no effect on the model. The new problem is that the fine-tuned image now permeates the entire model pretty badly, even though I have a sufficient number of regularization images.
I have CUDA 11.7 installed on the host and am using PyTorch 1.13.1+cu117. Does that match what you are using?
I also installed xformers and am using it in my script, but it does not make a difference.
Describe the bug
Diffusers runs, but it does not train a model with my settings, and I don't know why.
It also does not generate any class images whatsoever, so it seems it does not even train the existing model.
I also tried different models; none of them worked.
What could it be? What should I change or try?
Reproduction
This is my train.sh file:
export MODEL_NAME="dreamlike-art/dreamlike-diffusion-1.0"
export INSTANCE_DIR="training"
export CLASS_DIR="classes"
export OUTPUT_DIR="output"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="photo of yface1 person" \
  --class_prompt="photo of a person" \
  --resolution=512 \
  --train_batch_size=1 \
  --mixed_precision="fp16" \
  --use_8bit_adam \
  --gradient_accumulation_steps=1 --gradient_checkpointing \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
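To confirm the "no class images" symptom independently of the training script, a rough sketch like the following counts the image files in the directories used above. The helper names and the list of image suffixes are assumptions for illustration only. It may also be worth checking whether the script needs `--with_prior_preservation` to generate class images: that is how the upstream diffusers DreamBooth example behaves as far as I know, and the command above does not pass that flag, but whether it applies to this copy of the script is an assumption.

```python
from pathlib import Path

# Assumed set of image extensions; extend if your data uses others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def count_images(directory) -> int:
    """Count image files directly inside `directory` (0 if it does not exist)."""
    path = Path(directory)
    if not path.is_dir():
        return 0
    return sum(
        1
        for entry in path.iterdir()
        if entry.is_file() and entry.suffix.lower() in IMAGE_SUFFIXES
    )


def missing_class_images(class_dir, num_class_images: int) -> int:
    """Return how many class images are still missing relative to the target."""
    return max(0, num_class_images - count_images(class_dir))


if __name__ == "__main__":
    # Directory names and the target of 200 are taken from train.sh above.
    print("instance images:", count_images("training"))
    print("class images:", count_images("classes"))
    print("class images still missing:", missing_class_images("classes", 200))
```

Running it before and after a training attempt shows whether the class directory was touched at all.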
Logs
System Info
diffusers version: 0.13.0.dev0