Open v-nhandt21 opened 2 years ago
There's no implementation of these metrics in this repository. A code contribution here would be really helpful!
I found some automatic evaluation metrics mentioned in the paper. Where can I find these scripts, so that I can reproduce the results and compare with other methods?
Have you solved this problem?
Could you provide the evaluation code? Thank you!
For FID, I found an implementation in torchmetrics: https://torchmetrics.readthedocs.io/en/stable/image/frechet_inception_distance.html
But it requires the input to be of dtype=torch.uint8, while the spectrograms I generate are float. If I try to cast them to int, I get an error: ZeroDivisionError: float division by zero
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

def evaluation(mel_infer, mel_gt):
    # Normalize each spectrogram to [0, 255] before casting to uint8;
    # casting a float mel directly truncates it toward zero, which can
    # leave the features constant and trigger the ZeroDivisionError.
    def to_uint8_image(mel):
        mel = torch.from_numpy(mel).to("cuda")
        mel = (mel - mel.min()) / (mel.max() - mel.min() + 1e-8)
        mel = (mel * 255).to(torch.uint8)
        # Repeat to 3 channels so the Inception network accepts it.
        return mel.unsqueeze(0).unsqueeze(0).repeat(1, 3, 1, 1)
    fid = FrechetInceptionDistance(feature=2048).to("cuda")
    fid.update(to_uint8_image(mel_infer), real=False)
    fid.update(to_uint8_image(mel_gt), real=True)
    return fid.compute()
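The normalization step can be sanity-checked on its own, without a GPU or torchmetrics. This is just a sketch of the min-max scaling idea; the 80×400 shape is a hypothetical mel-spectrogram size, not anything from the paper:

```python
import numpy as np

def mel_to_uint8(mel):
    # Min-max scale the float spectrogram to [0, 1], then to [0, 255];
    # the small epsilon guards against a constant (zero-range) input.
    norm = (mel - mel.min()) / (mel.max() - mel.min() + 1e-8)
    return (norm * 255).astype(np.uint8)

mel = np.random.default_rng(0).normal(size=(80, 400)).astype(np.float32)
img = mel_to_uint8(mel)
print(img.dtype, img.min(), img.max())  # uint8, values spread over the 0..255 range
```

If the printed min and max collapse to the same value, the FID update will see a constant image and can fail in exactly the way described above.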
You can try this: https://github.com/gzhu06/Unconditional-Audio-Generation-Benchmark
Hello, I want to ask what might sound like a stupid question: how can I compute the FID between the test dataset and the train dataset, when both of them are real data distributions? For example, if I want to compute the FID of the train dataset, do I call compute_fid_function(train_dataset, train_dataset)?