aria1th / Hypernetwork-MonkeyPatch-Extension

Extension that patches Hypernetwork structures and training

Batch size: "stack expects each tensor to be equal size" #7

Closed Arilziem closed 1 year ago

Arilziem commented 1 year ago

Trying to train with the gamma trainer and the new gradient accumulation, I get an error using gradient accumulation steps of 8 and a batch size of 4 (I think; something along those values). It works fine with a batch size of 1.

File "[path]\patches\external_pr\dataset.py" , line 171, in __init__ self.latent_sample = torch.stack([entry.latent_sample for entry in data]).squeeze(1) RuntimeError: stack expects each tensor to be equal size, but got [4, 48, 64] at entry 0 and [4, 64, 48] at entry 1

aria1th commented 1 year ago

Currently the train gamma option does not support larger batch sizes, since it doesn't force a resize. Maybe someday it can have strict batches of same-size images...

Arilziem commented 1 year ago

Ah, that makes sense. Is it entirely impossible with varying aspect ratios? Or would there be a way when one side has the same length? Like batching together all images 512 px wide, and then all images 512 px tall?

aria1th commented 1 year ago

There could be a hacky workaround, like 'concat', but I don't think it would help in a good way. The best solution would maybe be having a cropped version and a non-cropped version... or batches of same-size images (with some tweaks), but well...

Arilziem commented 1 year ago

With same images, you mean putting images with the same aspect ratio together? I think having something like 3 fixed aspect ratios, e.g. 1:1, 3:2, 2:3, would still be pretty good. Cropping to the nearest aspect ratio in preprocessing would be pretty straightforward as well.
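For illustration, cropping to the nearest of a few fixed aspect ratios during preprocessing could look roughly like the sketch below. This is not part of the extension; the bucket list and helper names are hypothetical:

```python
# Hypothetical preprocessing sketch: center-crop each image to the
# nearest of a few fixed aspect ratios, so same-ratio images can later
# be batched together without shape mismatches.
from PIL import Image

BUCKETS = [(1, 1), (3, 2), (2, 3)]  # the fixed ratios suggested above

def nearest_bucket(width, height):
    ratio = width / height
    return min(BUCKETS, key=lambda b: abs(ratio - b[0] / b[1]))

def center_crop_to_bucket(img: Image.Image) -> Image.Image:
    w, h = img.size
    bw, bh = nearest_bucket(w, h)
    target_ratio = bw / bh
    if w / h > target_ratio:          # too wide: trim the sides
        new_w = int(h * target_ratio)
        left = (w - new_w) // 2
        return img.crop((left, 0, left + new_w, h))
    else:                             # too tall: trim top and bottom
        new_h = int(w / target_ratio)
        top = (h - new_h) // 2
        return img.crop((0, top, w, top + new_h))
```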

Panchovix commented 1 year ago

Someone added batch size >1 for mixed-size images to the default webui training; could this be applied to MonkeyPatch as well?

https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/6620
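Conceptually, that kind of mixed-size batching boils down to grouping dataset entries by latent shape and only drawing batches within one group, so torch.stack always sees equal sizes. A sketch of the idea (not the linked PR's actual code; it assumes entries expose the latent_sample tensor seen in the traceback above):

```python
# Conceptual sketch: build batches so that every batch contains only
# latents of one shape. The resulting index lists can be handed to a
# torch.utils.data.DataLoader as batch_sampler.
import random
from collections import defaultdict

def batches_by_shape(entries, batch_size):
    """entries: dataset items, each with a .latent_sample tensor."""
    groups = defaultdict(list)
    for i, entry in enumerate(entries):
        groups[tuple(entry.latent_sample.shape)].append(i)
    batches = []
    for indices in groups.values():
        random.shuffle(indices)
        for j in range(0, len(indices), batch_size):
            batches.append(indices[j:j + batch_size])
    random.shuffle(batches)  # mix shape groups across the epoch
    return batches
```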

aria1th commented 1 year ago

@Panchvzluck can you try https://github.com/aria1th/Hypernetwork-MonkeyPatch-Extension/tree/beta-apply-bigger-batch-sizes ? If the implementation is correct, this should work, although I didn't have enough time to check that everything works correctly.

Panchovix commented 1 year ago

@aria1th Tried it, and sadly it says "list index out of range". On the default webui embeddings training, it looks like this with batch size 5 and GA 5: [image]

On MonkeyPatch, it shows the "list index out of range" error: [image]

aria1th commented 1 year ago

@Panchvzluck I fixed a seemingly related part; not sure why it's happening... If it's not fixed, I'll be able to test and check it further within 30 minutes.

Panchovix commented 1 year ago

@aria1th It works! I can use batch size 20 and GA 5 on my 4090 without issues now.

[image]

Really appreciated, and the results are pretty accurate! [image]

aria1th commented 1 year ago

Good! I'll merge it into the main branch. @Arilziem please check the feature, and reopen the issue if any problem comes up!

Arilziem commented 1 year ago

Well, it barely fits into my 10GB of VRAM with a batch size of 2 (9.8GB dedicated and 1.1GB shared, cough), but it does give me a nice little speed bump, from around 1.3 it/s to ~1.8-2 it/s, compared to using gradient accumulation of 2. Thanks!

A bit less VRAM usage would be great of course (in the dreambooth extension I can go up to a batch size of 4; no idea how well that compares, though). I assume you already did what is possible in terms of unloading stuff and so on.
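For context on the comparison above: gradient accumulation is the generic pattern of summing gradients over several small batches before one optimizer step, which approximates a larger batch while holding only one small batch in VRAM. A sketch of the standard PyTorch loop (not this extension's trainer):

```python
# Generic gradient-accumulation pattern: accum_steps small batches
# approximate one batch accum_steps times larger, at lower VRAM cost
# but with more forward/backward passes (hence the it/s difference).
import torch

def train_epoch(model, optimizer, loader, accum_steps=2):
    optimizer.zero_grad()
    for i, (x, y) in enumerate(loader):
        loss = torch.nn.functional.mse_loss(model(x), y) / accum_steps
        loss.backward()                  # gradients accumulate in .grad
        if (i + 1) % accum_steps == 0:
            optimizer.step()             # one real update per accum_steps
            optimizer.zero_grad()
```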