Open kaiw7 opened 8 months ago
Hi, do you directly use the pre-trained VAE in LDM? Or is the VAE first pre-trained on audio spectrograms? Thank you very much.
Hi, we directly use the pretrained VAE of Stable Diffusion.
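For reference, loading the pretrained Stable Diffusion VAE with diffusers looks roughly like the sketch below. This is not our actual code, and the checkpoint id is illustrative, not necessarily the exact weights we use:

```python
# Minimal sketch of "directly use the pretrained VAE of Stable Diffusion"
# via diffusers. The checkpoint id is an illustrative assumption.
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="vae"
)
vae.eval()

# Encode a spectrogram treated as an image (batch, 3, H, W, values in [-1, 1]).
with torch.no_grad():
    spec = torch.randn(1, 3, 256, 1024)  # placeholder spectrogram tensor
    latents = vae.encode(spec).latent_dist.sample() * vae.config.scaling_factor
```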
Hi, thank you very much for your quick reply. Could I know when you will release the data pre-processing and training scripts? Alternatively, it would be appreciated if you could point me to where these two steps are handled.
Hi, could I know when you will release the complete scripts, including model training? Many thanks.
Hi, thanks for your attention! We have decided to release our complete scripts after acceptance. In the meantime: our training scripts are adapted from the diffusers training scripts, and for data pre-processing you can refer to the data conversion scripts in our project. You may try adapting these scripts to train your own models.
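For a rough idea, the core training step in the diffusers text-to-image example (which our scripts follow) looks like the sketch below. The variable names are illustrative, not our actual code:

```python
# Condensed sketch of the per-batch training step in diffusers'
# train_text_to_image.py (epsilon-prediction case), adapted here
# to spectrogram "images"; names are illustrative assumptions.
import torch
import torch.nn.functional as F

def training_step(vae, text_encoder, unet, noise_scheduler, batch):
    # encode spectrogram images to latents with the frozen SD VAE
    latents = vae.encode(batch["pixel_values"]).latent_dist.sample()
    latents = latents * vae.config.scaling_factor

    # sample noise and a random timestep, then noise the latents
    noise = torch.randn_like(latents)
    timesteps = torch.randint(
        0, noise_scheduler.config.num_train_timesteps,
        (latents.shape[0],), device=latents.device
    ).long()
    noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)

    # condition on the text prompt and predict the added noise
    encoder_hidden_states = text_encoder(batch["input_ids"])[0]
    model_pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample

    return F.mse_loss(model_pred.float(), noise.float())
```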
Hi, could you share which functions/classes in the convert scripts are used for training? I don't think all of the functions in the convert scripts are used.
Hi, could you please release the script showing how to obtain the audio mel-spectrogram and normalize it for model training? There are many functions, so we don't know which ones are used to prepare the audio data. Many thanks.
- Obtain the audio mel-spectrogram: https://github.com/happylittlecat2333/Auffusion/blob/f44233d1d0f6444653606b6189e090e999d79656/converter.py#L177
- Normalize it for model training: https://github.com/happylittlecat2333/Auffusion/blob/f44233d1d0f6444653606b6189e090e999d79656/converter.py#L109
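Roughly, those two steps do something like the simplified sketch below (written with librosa for clarity). The actual parameter values, such as sample rate, hop size, and the normalization range, are assumptions here; check converter.py for the values we actually use:

```python
# Simplified sketch of mel-spectrogram extraction and normalization;
# all parameter values below are illustrative assumptions, not the
# exact settings in converter.py.
import numpy as np
import librosa

def get_log_mel(wav_path, sr=16000, n_fft=1024, hop_length=160, n_mels=256):
    # load audio at a fixed sample rate and compute a mel spectrogram
    y, _ = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels
    )
    # log-compress with a small floor to avoid log(0)
    return np.log(np.clip(mel, 1e-5, None))

def normalize(log_mel, min_val=-12.0, max_val=3.0):
    # map log-mel values into [0, 1] so the spectrogram "image" lies in a
    # fixed range before being fed to the VAE (the repo may rescale
    # further, e.g. to [-1, 1])
    return np.clip((log_mel - min_val) / (max_val - min_val), 0.0, 1.0)
```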