facebookresearch / audio2photoreal

Code and dataset for photorealistic Codec Avatars driven from audio

evaluation code #39

Closed hoyeYang closed 6 months ago

hoyeYang commented 7 months ago

Amazing work! Could you provide the code for evaluating the model?

evonneng commented 7 months ago

Hi! Thank you for your interest! And yes, this is something I can add to the repo as a todo. Thanks for flagging this!

But to help unblock you for the time being: to implement the FD score from the paper, we adapt the function linked below, but instead of calculating the statistics (mean and covariance) from network activations, we calculate them directly from the raw poses.

https://github.com/hukkelas/pytorch-frechet-inception-distance/blob/master/fid.py#L108
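
A minimal sketch of that adaptation, under stated assumptions: poses are stacked into a 2D array of shape (num_frames, pose_dim), and the names `pose_statistics` and `frechet_distance` are hypothetical, not from the repo. The linked function uses a matrix square root; this sketch swaps in an equivalent eigenvalue form of the trace term so that it needs only NumPy.

```python
import numpy as np


def pose_statistics(poses):
    """Mean and covariance of raw poses with shape (num_frames, pose_dim)."""
    poses = np.asarray(poses, dtype=np.float64)
    assert poses.ndim == 2
    mu = poses.mean(axis=0)
    # rowvar=False: each column is one pose dimension, each row one frame
    sigma = np.atleast_2d(np.cov(poses, rowvar=False))
    return mu, sigma


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between the Gaussians N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # tr(sqrt(sigma1 @ sigma2)) equals the sum of the square roots of the
    # eigenvalues of sigma1 @ sigma2, which are real and non-negative for
    # covariance matrices; clip guards against tiny negative round-off.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_covmean = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_covmean)
```

To compare ground-truth and generated motion, one would compute `pose_statistics` on each set and pass both results to `frechet_distance`. How sequences are flattened into rows is an assumption here and may differ from the paper's setup.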

Diversity for the static poses was computed in two ways: 1) taking the variance across a temporal sequence, and 2) using this function:

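import numpy as np
from numpy import linalg  # assumption: the snippet did not show which linalg module was imported

def calculate_diversity(activation, diversity_times=10_000):
    # activation: 2D array of shape (num_samples, feature_dim)
    assert len(activation.shape) == 2
    # sampling without replacement needs more rows than draws
    assert activation.shape[0] > diversity_times
    num_samples = activation.shape[0]
    first_indices = np.random.choice(num_samples, diversity_times, replace=False)
    second_indices = np.random.choice(num_samples, diversity_times, replace=False)
    # Euclidean distance between each randomly paired row
    dist = linalg.norm(activation[first_indices] - activation[second_indices], axis=1)
    return dist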
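
For option 1), a rough sketch of the temporal variance could look like the following. The helper name `temporal_variance` is hypothetical, and averaging the per-dimension variances into a single number is an assumption; the thread does not say how the variance was aggregated.

```python
import numpy as np


def temporal_variance(poses):
    """Variance over time for one sequence of shape (num_frames, pose_dim).

    Returns the per-dimension variance averaged over pose dimensions
    (the averaging step is an assumption, not confirmed by the authors).
    """
    poses = np.asarray(poses, dtype=np.float64)
    assert poses.ndim == 2
    per_dim_var = poses.var(axis=0)  # variance along the time axis
    return float(per_dim_var.mean())
```

A sequence that never moves scores 0, and larger values indicate more motion within the sequence.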

Hope this helps in the meantime! Please let me know if there are any questions.

liujf69 commented 7 months ago

Thank you for your outstanding work. I am new to this field. Could you please provide all the code for calculating the evaluation metrics?

evonneng commented 7 months ago

Hi! Thank you for your interest in this work! Providing the full evaluation code is on our list of todos. I just have to clean it up a bit, but will hopefully push a PR soon!