deepglugs / deep_imagen

scripts for running and training imagen-pytorch

almost noise image generated #8

Open Ishihara-Masabumi opened 1 year ago

Ishihara-Masabumi commented 1 year ago

After training for 100 epochs, I tried to run inference with the following command line.

!python imagen.py --imagen model.pth --tags "1girl, red_hair" --output red_hair.png

The generated image, red_hair.png, is shown below:

[image: red_hair]

It is almost pure noise! Could you please tell me how to generate an image of a red-haired girl?

deepglugs commented 1 year ago

You probably haven't trained unet2 yet. You can just sample unet1 with --sample_unet=1. You can train unet2 by specifying --train_unet=1 during training.
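
For example, adapting the command you posted, sampling only unet1 would look something like:

python imagen.py --imagen model.pth --sample_unet 1 --tags "1girl, red_hair" --output red_hair.png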

Ishihara-Masabumi commented 1 year ago

Thank you. Then, is the following command line correct?

python imagen.py --train --source ./datasets --imagen model1.pth --sample_unet 1 --train_unet 1

deepglugs commented 1 year ago

--sample_unet is unnecessary during training; samples will always be produced by the unet under training. It's only used for sampling outside of training.

Also, I made a typo in my first reply. To train unet2 you need --train_unet=2. So:

python imagen.py --train --source ./datasets --imagen model1.pth --train_unet 1  # train unet1
python imagen.py --train --source ./datasets --imagen model1.pth --train_unet 2  # train unet2

python imagen.py --imagen model1.pth --sample_unet=1 --tags "1girl, red_hair"  # sample from unet1
python imagen.py --imagen model1.pth --sample_unet=2 --tags "1girl, red_hair"  # sample from unet2

Ishihara-Masabumi commented 1 year ago

Thanks! BTW, this time the following image was generated.

[image: red_hair1]

It looks like it could be something meaningful, but it is very ambiguous. Is this the limit?

deepglugs commented 1 year ago

Try lowering --cond_scale to 1.0 or 1.1. This will effectively turn off prompt conditioning, but it should give you an idea of the quality your model is capable of at the current training step.

Ishihara-Masabumi commented 1 year ago

I generated the image using the following command line.

python imagen.py --imagen model1.pth --tags "1girl, red_hair" --output red_hair2.png --sample_unet 2 --cond_scale 1.0

The generated image is as follows:

[image: red_hair2]

Is it OK?

deepglugs commented 1 year ago

Looks like it needs more training. What does --sample_unet=1 look like?

Ishihara-Masabumi commented 1 year ago

The generated image after 488 epochs of training is as follows:

[image: red_hair3]

Is this OK for a generated image?

deepglugs commented 1 year ago

No. That's a very strange image. Mine look like this after a night of training unet2 (cond_scale 1.1):

[image: imagen_24_663_loss0 128552]

Early on, after just a few epochs, they should look like this:

[image: imagen_21_90_loss0 754791]

So check your data and settings (sample_steps, cond_scale).
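
For example, you could re-sample with those settings made explicit (the values here are only illustrative):

python imagen.py --imagen model1.pth --sample_unet 2 --tags "1girl, red_hair" --cond_scale 1.1 --sample_steps 128 --output check.png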

Ishihara-Masabumi commented 1 year ago

Thank you for your information. BTW, my training and generating command lines are as follows:

python imagen.py --train --epochs 1000 --source ./datasets --imagen model3.pth --train_unet 2
python imagen.py --imagen model3.pth --tags "1girl, red_hair" --output red_hair3.png --sample_unet 2 --cond_scale 1.1

Is there anything wrong with them?

deepglugs commented 1 year ago

The command looks reasonable. How many images in your dataset?

Ishihara-Masabumi commented 1 year ago

My dataset is the same holo dataset from you. There are 261 images and 263 tag files.

deepglugs commented 1 year ago

Ahh. You'll need a lot more data. The smallest dataset I've trained has 18k images.

You can try a tag combination closer to what you've trained with, but I would get more data. Maybe several thousand at least. gel_fetch has a "start_id" you can use to pull additional data. Set it to 1+.

Ishihara-Masabumi commented 1 year ago

I tried to fetch images and texts using the following command line, but the result was "added 0 images".

python3 gel_fetch.py --tags "holo2" --txt holo2/tags --img holo2/imgs --danbooru --num 18000 --start_id 1
added 0 images

deepglugs commented 1 year ago

"holo2" tag won't probably have many images. But I think there should be over a thousand for "holo"

Ishihara-Masabumi commented 1 year ago

I'm sorry, I made a mistake. I then tried to fetch images and data again with the following command line.

python3 gel_fetch.py --tags "holo" --txt holo2/tags --img holo2/imgs --danbooru --num 18000 --start_id 1

But the number of images I got was just 263. Please tell me how to get more than 1,000 images.

deepglugs commented 1 year ago

Keep going with start_id 2, then 3, and so on until you stop getting images.

Ishihara-Masabumi commented 1 year ago

Both of the following command lines fetched no images, with the messages below.

python3 gel_fetch.py --tags "holo" --txt holo2/tags --img holo2/imgs --danbooru --num 18000 --start_id 2
python3 gel_fetch.py --tags "holo" --txt holo2/tags --img holo2/imgs --danbooru --num 18000 --start_id 3

https://cdn.donmai.us/sample/e5/17/sample-e51747e7fa932bcd899f43a6fa12bc3a.jpg
skipping 4445577 because it already exists
https://cdn.donmai.us/sample/62/53/sample-6253c9e5989dae54f1f1a4753b79633d.jpg
skipping 4440381 because it already exists
https://cdn.donmai.us/sample/4f/17/sample-4f172e3dab7cad4b5a1112792546f3fd.jpg
skipping 4422398 because it already exists
https://cdn.donmai.us/sample/02/24/sample-0224f7d54d89ed1e4b5483eda35d467b.jpg
skipping 4422396 because it already exists
https://cdn.donmai.us/sample/e7/ba/sample-e7bacfe3b934a5c46be75befccf3154a.jpg
skipping 4416289 because it already exists
https://cdn.donmai.us/sample/88/76/sample-887641b06505644e11df8e7ee2d4e78d.jpg
skipping 4403609 because it already exists
https://cdn.donmai.us/sample/31/61/sample-3161f75f68d08b7638eb9b439c8166d5.jpg
skipping 4403608 because it already exists
added 0 images

deepglugs commented 1 year ago

Looks like you may have gotten all of Holo. You can try other tags; "red_hair" will probably get you a lot more.
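
For example, something along the lines of your earlier fetch command (the output directories here are just placeholders):

python3 gel_fetch.py --tags "red_hair" --txt red_hair/tags --img red_hair/imgs --danbooru --num 18000 --start_id 1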

Ishihara-Masabumi commented 1 year ago

Then, please tell me all the tag names, such as holo, red_hair, and so on.

deepglugs commented 1 year ago

There are thousands of tags. Here are some of the most popular:

https://danbooru.donmai.us/tags?commit=Search&search%5Bhide_empty%5D=yes&search%5Border%5D=count

Ishihara-Masabumi commented 1 year ago

Using 2,466 images, I tried to train the imagen model. After that, I tried to generate a "1girl, red_hair" image; the result is below.

[image: red_hair4]

Is it due to a lack of training images?

zhaobingbingbing commented 1 year ago

Hi, I hope to train the 256x256 unet separately.

For training, it's like:

python imagen.py --train --source --tags_source --imagen yourmodel.pth --train_unet 2 --no_elu

For sampling, it's like:

python imagen.py --imagen yourmodel.pth --sample_unet 2 --tags "1girl, red_hair" --output ./red_hair.png --cond_scale 1.0

I use 100k image and txt pairs as the dataset, and the loss seems right, but I cannot generate meaningful images. Do you know the reason?

[image: red_hair8]

deepglugs commented 1 year ago

Did you train unet1 as well? Usually, you need to train unet1 a lot and then train unet2. So something like:

python imagen.py --train --source dataset --imagen yourmodel.pth --train_unet 1 --no_elu --epochs=80

then

python imagen.py --train --source dataset --imagen yourmodel.pth --train_unet 2 --no_elu --start_epoch=81 --epochs=160

zhaobingbingbing commented 1 year ago

I did not train unet1, but training unet2 separately should be possible. I noticed some tips about this in lucidrains/imagen-pytorch.

[image]

deepglugs commented 1 year ago

Ahh, okay. I have a commit locally that supports nullunet. I'll push that now.

deepglugs commented 1 year ago

I pushed. There's now a --null_unet1 argument for training and a --start_image argument for sampling. Sampling during training is not supported, so be sure to use --no_sample.

For sampling I use:

python3 imagen.py --imagen danbooru_320_sr.pth --sample_unet=2 --size 256 --start_image something_64px.png --output something_sr_256.png --tags "1girl, blonde_hair, red_bikini" --cond_scale=1.1 --replace
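
For the training side, a minimal sketch using the new flag (assuming the same dataset layout as in the earlier commands) would be something like:

python imagen.py --train --source dataset --imagen yourmodel.pth --train_unet 2 --null_unet1 --no_sample --no_elu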

Ishihara-Masabumi commented 1 year ago

Do you mean to use --no_sample instead of --start_image? Is that right?

deepglugs commented 1 year ago

--no_sample is for training. For sampling (inference), use --start_image.

Ishihara-Masabumi commented 1 year ago

Then, in your option '--start_image something_64px.png', what is something_64px.png and why is it needed?

Ishihara-Masabumi commented 1 year ago

Using your new imagen.py, the following error messages occurred.

!python imagen.py --train --source datasets --imagen model5.pth --train_unet 1 --no_sample --no_elu --epochs=80
!python imagen.py --train --source datasets --imagen model5.pth --train_unet 2 --no_sample --no_elu --start_epoch=81 --epochs=160
image_sizes=[64, 64]
The base dimension of your u-net should ideally be no smaller than 128, as recommended by a professional DDPM trainer https://nonint.com/2022/05/04/friends-dont-let-friends-train-small-diffusion-models/
Fetching image indexes in datasets...
2466 images
2485 tags
Traceback (most recent call last):
  File "imagen.py", line 956, in <module>
    main()
  File "imagen.py", line 187, in main
    train(args)
  File "imagen.py", line 647, in train
    use_text_encodings=args.embeddings is not None)
TypeError: __init__() got an unexpected keyword argument 'styles'
image_sizes=[64, 64]
The base dimension of your u-net should ideally be no smaller than 128, as recommended by a professional DDPM trainer https://nonint.com/2022/05/04/friends-dont-let-friends-train-small-diffusion-models/
Fetching image indexes in datasets...
2466 images
2485 tags
Traceback (most recent call last):
  File "imagen.py", line 956, in <module>
    main()
  File "imagen.py", line 187, in main
    train(args)
  File "imagen.py", line 647, in train
    use_text_encodings=args.embeddings is not None)
TypeError: __init__() got an unexpected keyword argument 'styles'

zhaobingbingbing commented 1 year ago

You can just delete line 647 and try again.

Ishihara-Masabumi commented 1 year ago

After deleting line 647, the following error message occurred.

image_sizes=[64, 64]
The base dimension of your u-net should ideally be no smaller than 128, as recommended by a professional DDPM trainer https://nonint.com/2022/05/04/friends-dont-let-friends-train-small-diffusion-models/
Fetching image indexes in datasets...
2466 images
2485 tags
Traceback (most recent call last):
  File "imagen.py", line 956, in <module>
    main()
  File "imagen.py", line 187, in main
    train(args)
  File "imagen.py", line 646, in train
    no_preload=True)
TypeError: __init__() got an unexpected keyword argument 'styles'
image_sizes=[64, 64]
The base dimension of your u-net should ideally be no smaller than 128, as recommended by a professional DDPM trainer https://nonint.com/2022/05/04/friends-dont-let-friends-train-small-diffusion-models/
Fetching image indexes in datasets...
2466 images
2485 tags
Traceback (most recent call last):
  File "imagen.py", line 956, in <module>
    main()
  File "imagen.py", line 187, in main
    train(args)
  File "imagen.py", line 646, in train
    no_preload=True)
TypeError: __init__() got an unexpected keyword argument 'styles'

zhaobingbingbing commented 1 year ago

Try deleting all the lines that raise errors until it works.

Ishihara-Masabumi commented 1 year ago

That's OK. BTW, what is something_64px.png in '--start_image something_64px.png', and why is it needed? I don't have a something_64px.png.

deepglugs commented 1 year ago

You can pick any image you want as your start image. Just resize it to 64x64 (although, it'll probably work with something bigger if you have more memory).
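
For example, a quick way to do the resize with Pillow (assuming it is installed; the filenames are just placeholders):

python3 -c "from PIL import Image; Image.open('your_image.png').resize((64, 64)).save('something_64px.png')"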

Ishihara-Masabumi commented 1 year ago

Hi, I have two error messages. The first error is as follows:

!python imagen.py --imagen model5.pth --sample_unet=2 --size 256 --start_image girl.png --output girl_256.png --tags "1girl, blonde_hair" --cond_scale=1.1 --replace

usage: imagen.py [-h] [--source SOURCE] [--tags_source TAGS_SOURCE]
                 [--cond_images COND_IMAGES] [--style STYLE]
                 [--embeddings EMBEDDINGS] [--tags TAGS] [--vocab VOCAB]
                 [--size SIZE] [--sample_steps SAMPLE_STEPS]
                 [--num_unets NUM_UNETS] [--vocab_limit VOCAB_LIMIT]
                 [--epochs EPOCHS] [--imagen IMAGEN] [--output OUTPUT]
                 [--replace] [--unet_dims UNET_DIMS] [--unet2_dims UNET2_DIMS]
                 [--dim_mults DIM_MULTS] [--start_size START_SIZE]
                 [--sample_unet SAMPLE_UNET] [--device DEVICE]
                 [--text_encoder TEXT_ENCODER] [--cond_scale COND_SCALE]
                 [--no_elu] [--num_samples NUM_SAMPLES]
                 [--init_image INIT_IMAGE] [--skip_steps SKIP_STEPS]
                 [--sigma_max SIGMA_MAX] [--full_load] [--no_memory_efficient]
                 [--print_params] [--unet_size_mult UNET_SIZE_MULT]
                 [--self_cond] [--batch_size BATCH_SIZE]
                 [--micro_batch_size MICRO_BATCH_SIZE]
                 [--samples_out SAMPLES_OUT] [--train] [--train_encoder]
                 [--shuffle_tags] [--train_unet TRAIN_UNET]
                 [--random_drop_tags RANDOM_DROP_TAGS] [--fp16] [--bf16]
                 [--workers WORKERS] [--no_text_transform]
                 [--start_epoch START_EPOCH] [--no_patching]
                 [--create_embeddings] [--verify_images]
                 [--pretrained PRETRAINED] [--no_sample] [--lr LR]
                 [--loss LOSS] [--sample_rate SAMPLE_RATE] [--wandb] [--is_t5]
                 [--webdataset]
imagen.py: error: unrecognized arguments: --start_image girl.png

That is, there is no --start_image option.

The second error is as follows:

!python imagen.py --imagen model5.pth --sample_unet=2 --size 256 --output girl_256.png --tags "1girl, blonde_hair" --cond_scale=1.1 --replace

The base dimension of your u-net should ideally be no smaller than 128, as recommended by a professional DDPM trainer https://nonint.com/2022/05/04/friends-dont-let-friends-train-small-diffusion-models/
loading non-EMA version of unets
image sizes: [64, 256]
Traceback (most recent call last):
  File "imagen.py", line 956, in <module>
    main()
  File "imagen.py", line 189, in main
    sample(args)
  File "imagen.py", line 264, in sample
    stop_at_unet_number=args.sample_unet)
  File "/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/imagen_pytorch/imagen_pytorch.py", line 100, in inner
    out = fn(model, *args, **kwargs)
TypeError: sample() got an unexpected keyword argument 'sigma_max'

deepglugs commented 1 year ago

You may need to update imagen-pytorch and pull deep-imagen again.
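
Something like the following (assuming imagen-pytorch was installed from PyPI and deep_imagen was cloned with git):

pip install --upgrade imagen-pytorch
git pull  # run this inside your deep_imagen checkout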

deepglugs commented 1 year ago

No. It should get better with more training. Also try lowering cond_scale: --cond_scale=1.0 for best quality (but almost no prompt conditioning), 10 for best conditioning (but maybe worse quality).
