Closed (youssefabdelm closed this 1 year ago)
Hi, I am not sure if I understand correctly. The model outputs the predicted overall MOS and the four quality dimensions for each input audio sample. What do you mean exactly by continuous?
Oh sorry, I was not clear. By continuous I simply meant whether it can export the four quality dimensions for every individual audio sample, as opposed to the overall metrics for the entire (say 30 sec) file. So for a 16 kHz file, it would export the metrics for each of the 16,000 samples in every second. (Similar to: https://github.com/vvvm23/stoi-vqcpc)
I was curious which variable I should look at to extract that array of per-sample metrics (instead of the overall / average score).
Oh I see - that is not possible, at least not with the model architecture that the pretrained models are using. You could probably train a new model where you restrict the output size to 1 before pooling and use the pre-pooling output as a 'continuous' score.
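A toy illustration of the "output size 1 before pooling" idea, in plain Python. This is not the NISQA architecture or its API: the feature vectors, `weights`, `bias`, and the mean pooling are all made up for the sketch. It only shows that once every frame is reduced to a single scalar before pooling, the pre-pooling list can be read as a score over time.

```python
# Toy sketch only: hypothetical names, not the NISQA model or its API.

def frame_scores(frames, weights, bias):
    """Project each frame's feature vector to a single scalar (output size 1)."""
    return [sum(w * x for w, x in zip(weights, f)) + bias for f in frames]


def mean_pool(scores):
    """Collapse the per-frame scalars into one file-level score."""
    return sum(scores) / len(scores)


# 3 frames with 2 made-up features each.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

per_frame = frame_scores(frames, weights=[2.0, 1.0], bias=0.5)  # "continuous" score
overall = mean_pool(per_frame)                                   # file-level score
```

A model trained this way would still produce one value per frame rather than per audio sample, so the frame rate sets the time resolution.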
Ah I see, thank you!
Hi Gabriel, thanks for making such a useful model!
I have two files: one denoised and one noisy. When the quality of one drops below a certain threshold, I'd like to switch to whichever file has the higher quality (excluding noisiness).
However, I'd like to do this in a smooth way. So my question is, is it possible to export continuous metrics?
Would I be right in thinking that y_hat_list is a list of metrics? If so, how could I map this back onto the number of samples? Or would it even be accurate/advisable to do so in my case?
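One possible way to approach the mapping and the smooth switch, sketched in plain Python. It assumes you already have one quality score per fixed-length, non-overlapping window for each file (for example by scoring short chunks separately, which may be less reliable than scoring a full file). The function names, the window layout, and the `width` parameter are hypothetical and do not come from the model's code.

```python
# Sketch under stated assumptions: hypothetical helpers, not part of the model's API.

def upsample_scores(scores, win, n_samples):
    """Linearly interpolate per-window scores to one value per audio sample.

    Window i is assumed to cover samples [i*win, (i+1)*win) and its score is
    placed at the window centre. Values before the first centre and after the
    last centre are held constant. `scores` must not be empty.
    """
    centres = [i * win + win / 2 for i in range(len(scores))]
    out = []
    j = 0
    for n in range(n_samples):
        if n <= centres[0]:
            out.append(scores[0])
        elif n >= centres[-1]:
            out.append(scores[-1])
        else:
            # Advance to the pair of centres that brackets sample n.
            while centres[j + 1] < n:
                j += 1
            t = (n - centres[j]) / (centres[j + 1] - centres[j])
            out.append((1 - t) * scores[j] + t * scores[j + 1])
    return out


def crossfade(a, b, score_a, score_b, width=0.5):
    """Blend signals a and b sample by sample, leaning toward the higher score.

    A score gap of `width` or more selects one signal entirely; smaller gaps
    give a proportional mix, so the switch is gradual rather than abrupt.
    """
    out = []
    for xa, xb, sa, sb in zip(a, b, score_a, score_b):
        w = min(1.0, max(0.0, 0.5 + (sa - sb) / (2 * width)))  # weight of a
        out.append(w * xa + (1 - w) * xb)
    return out


# Two windows of 4 samples each, scores 1.0 and 3.0.
per_sample = upsample_scores([1.0, 3.0], win=4, n_samples=8)
mixed = crossfade([1.0, 1.0], [0.0, 0.0], [3.0, 1.0], [1.0, 3.0])
```

Interpolating between window centres keeps the blend weight from jumping at window boundaries; shorter windows give finer time resolution but noisier scores.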