Nota-NetsPresso / BK-SDM

A Compressed Stable Diffusion for Efficient Text-to-Image Generation [ECCV'24]

Discussion on experimental settings #34

Closed bokyeong1015 closed 12 months ago

bokyeong1015 commented 12 months ago

[Inquiry]

Hi, I tried this method but found that the performance was very poor. My experimental configuration was to train on the LAION_11k data for 10K steps, with the bk_tiny UNet. I also switched the pipeline to inpainting and changed the input data accordingly. Could you give me any suggestions? Thanks.

bokyeong1015 commented 12 months ago

@yajieC, Hi, thank you for trying out our method.

Firstly, we would like to clarify that we provided the LAION_11k dataset primarily to test runnability and to determine the optimal settings for your machines. This dataset is not intended to yield the best performance.

Our suggestions are two-fold:

1. We're uncertain whether your latter point (replacing the pipeline with inpainting and changing the input data) is independent of the former point (training on the LAION_11k data for 10K steps), or whether you have actually attempted the fine-tuning process for inpainting.

2. We believe several factors are critical for inpainting fine-tuning: code modifications (e.g., changes in the number of input channels; see the sketch below), learning hyperparameters (e.g., learning rate), and the quality of the input data. We haven't scheduled inpainting experiments in our current plan, so we're unable to provide further insights on this topic. Please kindly understand.
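As a rough illustration of the input-channel change mentioned in point 2, below is a minimal sketch of widening a text-to-image UNet's `conv_in` from 4 channels to the 9 channels that Stable-Diffusion-style inpainting expects (4 noisy latents + 1 mask + 4 masked-image latents). This is not code from this repository; the `nota-ai/bk-sdm-tiny` model ID is an assumption, and the zero-initialization of the new channels follows common practice in diffusers' training examples.

```python
# Sketch only: adapt a 4-channel text-to-image UNet for 9-channel inpainting input.
import torch
import torch.nn as nn
from diffusers import UNet2DConditionModel

# Assumed checkpoint ID; replace with the UNet you are actually fine-tuning.
unet = UNet2DConditionModel.from_pretrained("nota-ai/bk-sdm-tiny", subfolder="unet")

old_conv = unet.conv_in                       # Conv2d with 4 input channels
new_conv = nn.Conv2d(
    9, old_conv.out_channels,
    kernel_size=old_conv.kernel_size,
    stride=old_conv.stride,
    padding=old_conv.padding,
)
with torch.no_grad():
    new_conv.weight.zero_()                   # mask / masked-image channels start at zero
    new_conv.weight[:, :4] = old_conv.weight  # keep pretrained weights for the latent channels
    new_conv.bias.copy_(old_conv.bias)

unet.conv_in = new_conv
unet.register_to_config(in_channels=9)        # keep the model config consistent
```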

yajieC commented 12 months ago

Thank you for your suggestion

Bikesuffer commented 12 months ago

I can share my experience. I used the same hyperparameters (learning rate, etc.) to train for inpainting, on LAION-212K. SD_Tiny did give poor results compared to SD_Base and SD_Small. Attached is the feat_loss for your reference; 64 or 256 indicates the training batch size. Let me know if you have more questions.

[Figure: feat train loss]
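For readers not familiar with the term: feat_loss here refers to the feature-level knowledge-distillation term, i.e., an MSE between teacher and student activations at matched blocks. Below is a minimal sketch of how such a loss can be computed with forward hooks; the function and variable names are illustrative, not the repository's actual code.

```python
# Illustrative feature-level KD loss using forward hooks (not the repo's exact code).
import torch
import torch.nn.functional as F

def register_feature_hooks(model, module_names, storage):
    """Capture the output of each listed submodule on every forward pass."""
    handles = []
    for name, module in model.named_modules():
        if name in module_names:
            def hook(_module, _inputs, output, key=name):
                storage[key] = output
            handles.append(module.register_forward_hook(hook))
    return handles

def feat_kd_loss(teacher_feats, student_feats, pairs):
    """Sum of MSE terms over (teacher_module_name, student_module_name) pairs."""
    loss = 0.0
    for t_name, s_name in pairs:
        loss = loss + F.mse_loss(student_feats[s_name], teacher_feats[t_name].detach())
    return loss
```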

yajieC commented 12 months ago

Hi, I increased the dataset size (LAION_212k) and the number of iteration steps (50K), but the results were still unacceptable. Would it be convenient for you to share your pipeline? Thanks.

bokyeong1015 commented 12 months ago

@yajieC Hi, we're unsure which pipeline you're referring to (either text-to-image or inpainting). For our text-to-image models, please refer to the following links:
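As a quick usage illustration, here is a minimal sketch of loading one of the text-to-image checkpoints with diffusers, assuming the models are hosted on the Hugging Face Hub under IDs such as `nota-ai/bk-sdm-tiny` (please check the repository README for the exact names):

```python
# Sketch: text-to-image inference with a BK-SDM checkpoint (model ID assumed).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "nota-ai/bk-sdm-tiny", torch_dtype=torch.float16
).to("cuda")

image = pipe("a photo of an astronaut riding a horse on mars").images[0]
image.save("sample.png")
```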

For models fine-tuned for inpainting, please understand that we have neither conducted nor planned inpainting experiments.

yajieC commented 12 months ago

I'm sorry for the confusion; I was referring to the inpainting pipeline.
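For context on what differs in the inpainting pipeline: besides the wider `conv_in` shown earlier in this thread, the UNet input at each denoising step is a 9-channel concatenation of the noisy latents, the downsampled mask, and the masked-image latents. A minimal sketch under these assumptions (names and shapes are illustrative, not code from this thread):

```python
# Sketch: assembling the 9-channel input for an SD-style inpainting UNet.
import torch
import torch.nn.functional as F

def build_inpaint_unet_input(noisy_latents, mask, masked_image_latents):
    """noisy_latents, masked_image_latents: (B, 4, H/8, W/8);
    mask: (B, 1, H, W) with 1 marking the region to repaint."""
    mask_latent = F.interpolate(mask, size=noisy_latents.shape[-2:])
    # Channel order follows diffusers' inpainting pipeline: latents, mask, masked image.
    return torch.cat([noisy_latents, mask_latent, masked_image_latents], dim=1)
```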

yajieC commented 11 months ago

Thank you for sharing your idea. I solved the problem and got acceptable results too. Thank you!