Closed: Ivvvvvvvvvvy closed this issue 1 year ago
Hi,

(1) From what you mention, you seem to get nice and relevant samples in samples_2023-05-22T21-27-45 (check whether they make sense to you when you play them); hence the low FID and low KL (this is the MKL). Have you tried holding out a set of videos that were not used during training? How does the model perform on them compared to the training set?

(2) Yes, these are MKLs; no need to do anything extra. Sorry for the confusing notation.
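Since the reported KL is the MKL, here is a minimal sketch of the underlying computation: a per-pair KL divergence between the class-probability outputs of a pretrained audio classifier (Melception in the SpecVQGAN evaluation) on a generated clip and its ground-truth counterpart, averaged over all pairs. The function name and the KL direction are assumptions for illustration; check the actual direction against the evaluation code.

```python
import numpy as np

def mean_kl(p_fake: np.ndarray, p_real: np.ndarray, eps: float = 1e-8) -> float:
    """Mean KL divergence over paired samples (hypothetical helper).

    p_fake, p_real: (N, C) arrays of class probabilities from the
    evaluation classifier, one row per clip; row i of p_fake is the
    generated sample paired with row i of p_real (ground truth).
    Computes KL(p_real || p_fake) per pair and averages; the codebase
    may use the opposite direction.
    """
    p_fake = np.clip(p_fake, eps, 1.0)  # avoid log(0)
    p_real = np.clip(p_real, eps, 1.0)
    kl = np.sum(p_real * (np.log(p_real) - np.log(p_fake)), axis=1)
    return float(kl.mean())
```

Identical distributions give an MKL of 0; the further the generated clips' class posteriors drift from the ground truth's, the larger the value.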
Thank you for your guidance! I haven't passed the sampling results through MelGAN yet, so I can't hear the generated audio. But the samples in the val folder produced during transformer training are not bad. The data used in the evaluation comes from valid.txt. Does the data in valid.txt participate in training the transformer? Do I need to test with data that never appeared in either train or validation?
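One quick way to answer the "did valid.txt leak into training?" worry is to check the two filelists for overlapping clip IDs. A minimal sketch, assuming each filelist holds one clip ID or path per line (the filenames here are illustrative):

```python
def split_overlap(train_path: str, valid_path: str) -> set:
    """Return the clip IDs that appear in both filelists.

    Assumes plain-text filelists with one ID per line, e.g. the
    train.txt / valid.txt splits used to drive training and evaluation.
    """
    def read_ids(path: str) -> set:
        with open(path) as f:
            return {line.strip() for line in f if line.strip()}
    return read_ids(train_path) & read_ids(valid_path)
```

An empty result means the validation split is at least disjoint from training at the filelist level; a held-out test split never seen during training or model selection is still the cleaner check.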
Hello, I used your codebook pre-trained on the VGGSound dataset, then trained the transformer on my own dataset (seven categories), and now I am doing the evaluation, but I am a little confused about the results.

(1) My results are as follows:

samples_2023-05-22T21-27-45: KL: 2.64812; ISc: 3.48715 (0.028569); FID: 3.94182; KID: 0.00128 (0.00008)

The FID is much smaller than the one reported in your paper. Is this reasonable, or does it mean something is wrong with my training process, such as overfitting?

(2) Is the KL value the MKL mentioned in the paper? If not, how can I calculate the MKL?
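On interpreting the reported ISc alongside FID: the Inception Score is exp of the average KL between each clip's class posterior and the marginal class distribution, so with only seven categories it is bounded by 7, and a value near 3.5 is plausible rather than suspicious. A minimal sketch of that formula, assuming (N, C) class probabilities from the evaluation classifier (this is the standard IS definition, not the repository's exact implementation):

```python
import numpy as np

def inception_score(probs: np.ndarray, eps: float = 1e-8) -> float:
    """IS = exp( E_i[ KL(p(y|x_i) || p(y)) ] ) from classifier outputs.

    probs: (N, C) rows of class probabilities. The score is at most C,
    so a small, homogeneous label set naturally caps it.
    """
    probs = np.clip(probs, eps, 1.0)          # avoid log(0)
    marginal = probs.mean(axis=0, keepdims=True)  # p(y)
    kl = np.sum(probs * (np.log(probs) - np.log(marginal)), axis=1)
    return float(np.exp(kl.mean()))
```

Uniform, indistinguishable predictions give a score of 1; confident, evenly spread predictions across C classes approach C.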