Question about the benchmark #6

Mowenyii commented 7 months ago

Thank you for your significant contributions to the community! While using your benchmark to test DDIM+P2P, I followed the instructions in your p2p_requirements.txt to install diffusers version 0.10.0. However, after editing the ten tasks of the benchmark, the results I obtained are as follows:

| Method | PSNR ↑ | MSE ↓ | CLIP (Whole) ↑ | CLIP (Edited) ↑ |
| --- | --- | --- | --- | --- |
| ddim+p2p | 17.75815095 | 223.9520093 | 23.84047887 | 21.30606469 |

There is a discrepancy when compared to the results in Table 1 of your paper. Could you advise on the possible reasons for this? The commands used were:

python run_editing_p2p.py

python evaluation/evaluate.py --metrics "structure_distance" "psnr_unedit_part" "lpips_unedit_part" "mse_unedit_part" "ssim_unedit_part" "clip_similarity_source_image" "clip_similarity_target_image" "clip_similarity_target_image_edit_part" --edit_category_list 0 1 2 3 4 5 6 7 8 9 --tgt_methods 1_ddim+p2p
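
For readers comparing numbers like these, here is a minimal sketch of what metrics such as "mse_unedit_part" and "psnr_unedit_part" typically measure: the error between source and edited image, restricted to pixels outside the edit mask. It is an illustration only, not the repository's implementation. The function names `masked_mse` and `psnr_from_mse`, the flat pixel lists, and the [0, 1] value range are all assumptions, and evaluation/evaluate.py may scale, normalize, or average the values differently.

```python
import math


def masked_mse(source, edited, keep_mask):
    """Mean squared error over pixels where keep_mask is truthy.

    Hypothetical helper: images are flat lists of floats in [0, 1],
    and keep_mask marks the region that should stay unedited.
    """
    diffs = [(s - e) ** 2 for s, e, m in zip(source, edited, keep_mask) if m]
    if not diffs:
        raise ValueError("mask selects no pixels")
    return sum(diffs) / len(diffs)


def psnr_from_mse(mse, max_value=1.0):
    """Peak signal-to-noise ratio in dB for a given MSE."""
    if mse == 0:
        return math.inf  # identical regions
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy data: the third pixel lies inside the edited region and is ignored.
source = [0.0, 0.5, 1.0, 0.25]
edited = [0.1, 0.5, 0.0, 0.25]
keep_mask = [1, 1, 0, 1]

example_mse = masked_mse(source, edited, keep_mask)
example_psnr = psnr_from_mse(example_mse)
print(example_mse, example_psnr)
```

Because PSNR is a logarithm of the MSE, averaging PSNR over images is not the same as taking the PSNR of the averaged MSE, so the two reported columns are not expected to convert into each other exactly.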

juxuan27 commented 6 months ago

Hi @Mowenyii. During our experiments, we found small differences in the metrics for some methods when running on different machines. Judging from the results you posted, I think the discrepancy is within that fluctuation range.

Mowenyii commented 6 months ago

Thank you for your reply!

cure-lab / PnPInversion
