nicolai256 / Stable-textual-inversion_win


no module named taming #2

Open 1blackbar opened 1 year ago

1blackbar commented 1 year ago

Why is that? I ran everything exactly as in the instructions.

nicolai256 commented 1 year ago

Try this: pip install taming-transformers

nicolai256 commented 1 year ago

It should run fine with just the instructions, though; multiple people have tested it and it worked. Did you install Stable Diffusion before installing this? If not, do that.

nicolai256 commented 1 year ago

Added some extra things to avoid confusion.

1blackbar commented 1 year ago

I got it to run, but I used someone else's env (with a GUI) and moved your Python scripts into it. I got kicked out with an out-of-memory error, so does this thing need at least 10 GB even for 2 images? Can't someone optimize the code, like with the model? I might try Colab; I'm on a GTX 1080 Ti with 11 GB and it still flips out.

Oh, I can see that it creates 16 (!!!!) worker processes. What the heck, is this for a 3090 only? Where do I change that?

OK, it's in the v1-finetune.yaml file, under num_workers.

OK, simply lowering the worker count won't work. Any clue what to lower next for an 11 GB card to get the biggest improvement? I'll tinker with it in the meantime.

-- Alright, got it working!!! I had to lower the workers, set batch_size to 1, and max_images to 5.
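In other words, the relevant keys in v1-finetune.yaml look roughly like this, shown in isolation (their placement matches the full config pasted further down in this thread; batch_size 1 and max_images 5 are the values reported here, while the num_workers value of 10 comes from that pasted config, since the exact worker count isn't stated in this comment):

data:
  params:
    batch_size: 1       # one training example per batch
    num_workers: 10     # lowered from the 16 worker processes seen above

lightning:
  callbacks:
    image_logger:
      params:
        max_images: 5   # log fewer sample images at a time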

nicolai256 commented 1 year ago

@1blackbar can you send me the config file? I'll put it in my repo. Send it to me on Discord: @MATRIXMANE#1219

1blackbar commented 1 year ago

I added you. In the meantime, here it is:


model:
  base_learning_rate: 5.0e-03
  target: ldm.models.diffusion.ddpm.LatentDiffusion
  params:
    linear_start: 0.00085
    linear_end: 0.0120
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: image
    cond_stage_key: caption
    image_size: 64
    channels: 4
    cond_stage_trainable: true # Note: different from the one we trained before
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: False
    embedding_reg_weight: 0.0

    personalization_config:
      target: ldm.modules.embedding_manager.EmbeddingManager
      params:
        placeholder_strings: ["*"]
        initializer_words: ["sculpture"]
        per_image_tokens: false
        num_vectors_per_token: 1
        progressive_words: False

    unet_config:
      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        image_size: 32 # unused
        in_channels: 4
        out_channels: 4
        model_channels: 320
        attention_resolutions: [ 4, 2, 1 ]
        num_res_blocks: 2
        channel_mult: [ 1, 2, 4, 4 ]
        num_heads: 8
        use_spatial_transformer: True
        transformer_depth: 1
        context_dim: 768
        use_checkpoint: True
        legacy: False

    first_stage_config:
      target: ldm.models.autoencoder.AutoencoderKL
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          double_z: true
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity

    cond_stage_config:
      target: ldm.modules.encoders.modules.FrozenCLIPEmbedder

data:
  target: main.DataModuleFromConfig
  params:
    batch_size: 1
    num_workers: 10
    wrap: false
    train:
      target: ldm.data.personalized.PersonalizedBase
      params:
        size: 512
        set: train
        per_image_tokens: false
        repeats: 100
    validation:
      target: ldm.data.personalized.PersonalizedBase
      params:
        size: 512
        set: val
        per_image_tokens: false
        repeats: 10

lightning:
  callbacks:
    image_logger:
      target: main.ImageLogger
      params:
        batch_frequency: 500
        max_images: 15
        increase_log_steps: False

  trainer:
    benchmark: True
    max_steps: 6100


(I can't format it as a code block, for whatever reason.)

This is my yaml. I'm sure it can be modified further, but I don't think the batch size you put in matters much, because your GPU is working at max anyway; whether you throw big or small images at it, it will get to the end in about the same time.