JuanFMontesinos / VoViT

VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
https://ipcv.github.io/VoViT/

All code and pre-trained models #5

Open attutude opened 1 year ago

attutude commented 1 year ago

Thank you for your excellent work. Could you release all the code and the pre-trained models? I want to run a comparison using the same batch size.

JuanFMontesinos commented 1 year ago

Hi, as I mentioned in the other post, I'm inclined not to release the code: the dataloaders are not straightforward to use without an exact, identical copy of the training data, since they depend on IDs that are generated automatically.

I think you can use pretty much any framework that works with complex masks to get the results. The weights are available in the Release section of the repository.
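For readers unfamiliar with the term, here is a minimal sketch of what applying a complex mask means in general: the mask is multiplied element-wise with the complex spectrogram of the mixture. This is not the VoViT training code; the function name `apply_complex_mask`, the toy values, and the 2x2 shape are assumptions made purely for illustration.

```python
def apply_complex_mask(mixture_spec, mask):
    """Element-wise complex product of a mixture spectrogram and a mask.

    Both arguments are lists of rows (frequency bins) of complex numbers.
    Hypothetical helper for illustration, not part of the VoViT repository.
    """
    return [
        [x * m for x, m in zip(spec_row, mask_row)]
        for spec_row, mask_row in zip(mixture_spec, mask)
    ]


# Toy 2x2 "spectrogram" (2 frequency bins, 2 time frames); values are made up.
mixture = [[1 + 1j, 2 - 1j], [0.5 + 2j, -1 + 0.5j]]
target = [[0.5 + 0.5j, 1 + 1j], [0 + 2j, -0.5 + 0j]]

# The ideal complex mask is the ratio target / mixture in each bin.
# A real model would predict this mask instead of computing it from the target.
ideal_mask = [
    [t / x for t, x in zip(target_row, mixture_row)]
    for target_row, mixture_row in zip(target, mixture)
]

# Applying the ideal mask recovers the target up to floating-point error.
estimate = apply_complex_mask(mixture, ideal_mask)
```

In practice the spectrograms would come from an STFT and the mask from the network, but the masking step itself stays this simple.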

If it's really important, I can try to release the code, but it's not as clean as this repo, nor is it straightforward to run.