david-svitov / HAHA

HAHA: Highly Articulated Gaussian Human Avatars with Textured Mesh Prior

About Evaluation #11

Closed: haoz19 closed this issue 1 month ago

haoz19 commented 1 month ago

Hi,

Thanks for the great work!

I ran the training code and the rendering code for male-3-casual. Training shows psnr: 24.94, but testing doesn't show a PSNR at all. The training value is lower than the 31.46 reported in the paper.

I would like to ask if the issue is caused by the parameters I used for training or some other reasons.

Many Thanks, Hao

david-svitov commented 1 month ago

Hi! To calculate metrics on the test set, please use calculate_metrics.py.

Thanks. I noticed a problem with the male-3-casual training config and have committed a corrected version.

haoz19 commented 1 month ago

Thanks for the instruction!

Should we use this:

python calculate_metrics.py --ground_truth <gt_folder> --predict <pred_folder>

gt_folder: rgb_image in the test folder
pred_folder: rasterization in the test folder?

Thanks!
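The folder layout asked about here can be checked before running the script. The sketch below is not part of the HAHA repository: `build_metrics_command` is a hypothetical helper, and it assumes that a run directory contains `test/rgb_image` and `test/rasterization` (the names used in this thread) and that both folders should hold the same number of frames. It only builds the command line shown above; it does not run it.

```python
from pathlib import Path


def build_metrics_command(run_dir):
    """Build the calculate_metrics.py command for one test run.

    Hypothetical helper. Assumes the run directory contains
    test/rgb_image (ground truth) and test/rasterization (renders).
    """
    test_dir = Path(run_dir) / "test"
    gt_dir = test_dir / "rgb_image"
    pred_dir = test_dir / "rasterization"

    # Fail early if either folder is missing.
    for folder in (gt_dir, pred_dir):
        if not folder.is_dir():
            raise FileNotFoundError(f"missing folder: {folder}")

    # Assumption: one rendered frame per ground-truth frame.
    n_gt = sum(1 for p in gt_dir.iterdir() if p.is_file())
    n_pred = sum(1 for p in pred_dir.iterdir() if p.is_file())
    if n_gt != n_pred:
        raise ValueError(f"frame count mismatch: {n_gt} GT vs {n_pred} predicted")

    return [
        "python", "calculate_metrics.py",
        "--ground_truth", str(gt_dir),
        "--predict", str(pred_dir),
    ]
```

The returned list can be passed to `subprocess.run` from the repository root, or joined with spaces and pasted into a shell.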

duy-maimanh commented 1 month ago

@haoz19 Did you get the same evaluation results as the paper? When I ran calculate_metrics.py, the results were really low.

python calculate_metrics.py --predict ./logs/gaussians_docker_female3/female-3-casual-2024_06-05_16-10-test/test/rasterization --ground_truth ./logs/gaussians_docker_female3/female-3-casual-2024_06-05_16-10-test/test/rgb_image

male3: psnr: tensor(14.6720, device='cuda:0') ssim: tensor(0.8615, device='cuda:0') lpips: tensor(0.1739, device='cuda:0')

male4: psnr: tensor(15.7759, device='cuda:0') ssim: tensor(0.8782, device='cuda:0') lpips: tensor(0.1730, device='cuda:0')

female3: psnr: tensor(13.6854, device='cuda:0') ssim: tensor(0.8767, device='cuda:0') lpips: tensor(0.1674, device='cuda:0')

david-svitov commented 1 month ago

Okay, let's localize the problem. First, check that the background in the GT and generated images is the same color (black). Then check that the metrics from the downloaded pre-trained checkpoint match the ones in the article. That will tell us whether the problem is in measuring the metrics or in training.
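To see why the background check matters, here is a minimal sketch on synthetic data. It does not reproduce calculate_metrics.py: `psnr` is the standard definition for images scaled to [0, 1], and `corner_color` is a hypothetical helper that averages the four image corners as a rough proxy for the background color. Loading real frames from disk is left out. The example shows that a render with a nearly correct subject but a white background scores far lower against a black-background GT than the same render on black.

```python
import numpy as np


def psnr(pred, gt, max_val=1.0):
    """Standard PSNR in dB for arrays with values in [0, max_val]."""
    diff = pred.astype(np.float64) - gt.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))


def corner_color(img, patch=4):
    """Mean color of the four corner patches (hypothetical background probe)."""
    corners = [
        img[:patch, :patch], img[:patch, -patch:],
        img[-patch:, :patch], img[-patch:, -patch:],
    ]
    channels = img.shape[-1]
    return np.mean([c.reshape(-1, channels).mean(axis=0) for c in corners], axis=0)


# Synthetic ground truth: grey square (the "subject") on a black background.
gt = np.zeros((64, 64, 3))
gt[16:48, 16:48] = 0.5

# Render with a small error on the subject and a matching black background.
pred_black = np.zeros((64, 64, 3))
pred_black[16:48, 16:48] = 0.52

# Same subject, but rendered on a white background.
pred_white = np.ones((64, 64, 3))
pred_white[16:48, 16:48] = 0.52

print("GT corners:   ", corner_color(gt))
print("white corners:", corner_color(pred_white))
print("PSNR, black background:", psnr(pred_black, gt))
print("PSNR, white background:", psnr(pred_white, gt))
```

On real data, comparing `corner_color` for one GT frame and the matching rendered frame is a quick way to spot a background mismatch before looking at the training setup.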

haoz19 commented 1 month ago

Hi @duy-maimanh,

I got results like this:

00016: psnr: tensor(27.3044, device='cuda:0') ssim: tensor(0.9398, device='cuda:0') lpips: tensor(0.0399, device='cuda:0')

male-3: psnr: tensor(26.5395, device='cuda:0') ssim: tensor(0.9617, device='cuda:0') lpips: tensor(0.0376, device='cuda:0')

male-4: psnr: tensor(24.6781, device='cuda:0') ssim: tensor(0.9438, device='cuda:0') lpips: tensor(0.0567, device='cuda:0')

duy-maimanh commented 1 month ago

@david-svitov @haoz19 Thank you very much for your answer.