DanBigioi / DiffusionVideoEditing

Official project repo for paper "Speech Driven Video Editing via an Audio-Conditioned Diffusion Model"
MIT License
223 stars 15 forks

preprocessing scripts #2

Open azuredsky opened 1 year ago

azuredsky commented 1 year ago

Good job! Could you upload your data preprocessing scripts?

sdulyq commented 1 year ago

Really looking forward to your further code.

baiyuting commented 1 year ago

+1. I am trying to train the Single Speaker Model and have downloaded the identity S1 data from the GRID corpus. However, I find that it cannot run without the right input. Could you upload the data preprocessing scripts?

DanBigioi commented 1 year ago

Hi Guys,

Yes, I am submitting a revised version of the manuscript next week, and will also make everything public that day including all datasets and scripts.

@baiyuting Don't bother training the single speaker model yet. The updated version I have will give you much better results, and the way I do the audio conditioning in the current implementation is out of date. I will also be providing a multi-speaker model that works pretty well on unseen subjects.

Thank you so much for your patience; it's taken me a bit longer than expected to get the new paper ready.

baiyuting commented 1 year ago

OK @DanBigioi, looking forward to the new version of the code next week.

DanBigioi commented 1 year ago

Just a heads up, guys: I've uploaded everything you need to start training, finetuning, and running inference. I'll make some tutorial videos tomorrow too, to help you better understand the code.

DanBigioi commented 1 year ago

@azuredsky @baiyuting @sdulyq

sdulyq commented 1 year ago

@DanBigioi Thank you very much, this is the true spirit of open source on the Internet, and my friend, you are a true hero.