zoink closed 4 years ago
It seems that "Everybody Dance Now" uses a separate network for the face. I would also be interested in further differences. Hopefully they will release the code as well.
What's more, the Everybody Dance Now paper uses only two consecutive frames in its temporal consistency loss, while vid2vid uses more frames. vid2vid also has an optical flow loss.
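To make the window-size difference concrete, here is a toy sketch (my own, not code from either paper) of how a clip could be split into runs of consecutive frames for a temporal loss or discriminator. The helper name `consecutive_windows`, the dummy video shape, and the window length of 3 are all assumptions chosen for illustration; the actual window length in vid2vid may differ.

```python
import numpy as np

def consecutive_windows(frames, k):
    """Return every run of k consecutive frames.

    frames: array of shape (T, H, W, C)
    result: array of shape (T - k + 1, k, H, W, C)
    """
    frames = np.asarray(frames)
    t = frames.shape[0]
    if k < 1 or k > t:
        raise ValueError("k must be between 1 and the number of frames")
    return np.stack([frames[i:i + k] for i in range(t - k + 1)])

# Hypothetical clip: 5 dummy frames of 4x4 RGB.
video = np.zeros((5, 4, 4, 3), dtype=np.float32)

# Two-frame windows, matching the pairwise setup described above.
pairs = consecutive_windows(video, 2)

# A longer window; 3 is an arbitrary illustrative value.
longer = consecutive_windows(video, 3)

print(pairs.shape, longer.shape)  # (4, 2, 4, 4, 3) (3, 3, 4, 4, 3)
```

A temporal loss that only ever sees pairs can only penalize frame-to-frame flicker directly, whereas longer windows give it a chance to see slower drift. This sketch covers only the windowing and says nothing about the optical flow term.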
I know the code for the Berkeley paper hasn't been released, but any observations or comments on differences in (a) the temporal smoothing process and (b) the results would be appreciated. Thanks so much!