Request for pretrained models/demo

neoncloud / mdctGAN

Code for INTERSPEECH 2023 paper "mdctGAN: Taming transformer-based GAN for speech super-resolution with Modified DCT spectra"

Request for pretrained models/demo #1

Closed splinter21 closed 1 year ago

splinter21 commented 1 year ago

Also, is there a demo page?

neoncloud commented 1 year ago

Thank you for your interest in this project. The pre-trained model will be uploaded soon. If more people want a demo page, we will try to make one.

francqz31 commented 1 year ago

@neoncloud

1- A demo page would be awesome. I was just about to open an issue asking for a "project page"!
2- There is also a super-resolution method called aero (I think that's the only SR method you didn't compare with). It is slow to train, but it has almost perfect quality for vocals, music, and speech. I wonder how mdctGAN compares with aero's quality?
3- Can mdctGAN be trained for singing vocals or music too?

Fun fact: they might submit NU-Wave 3 next month, in June, as they do that every year. Let's see how mdctGAN compares with it.

neoncloud commented 1 year ago

@francqz31

Thank you for your advice! I'm a newbie, so I was wondering what kind of demo people would like. For example, a Jupyter notebook hosted in Colab? A Streamlit app running in a Hugging Face Space? I'm still taking courses and have limited time, so I would like the demo to be as simple yet intuitive as possible.

As for other types of audio, we encourage more researchers to adopt or learn from our approach and look forward to more amazing work in the future.

francqz31 commented 1 year ago

@neoncloud No problem, Mr. Chenhao. I think a project page would be enough to listen to and review the quality of this method, a simple github.io project page like these:

1- https://mindslab-ai.github.io/nuwave2/
2- https://zkx06111.github.io/wsrglow/
3- https://pages.cs.huji.ac.il/adiyoss-lab/aero/

If you are talking about an interactive demo, Colab would be great. I was making a training Colab myself, but when I run !bash ./generate_audio.sh I get this error: "-------------- End ---------------- create web directory ./checkpoints/output_folder_name/web... CustomDatasetDataLoader load audio failed"

neoncloud commented 1 year ago

@francqz31 Examples are great👍. I will check them.

generate_audio.sh is for inference, not for training. You need to run train.sh to train the network. Please do not discuss topics that are not related to this issue; if you have more questions, please open a new issue. Thank you.