Failed to reproduce the experiments

DJNing commented 8 months ago

Hi, thanks for your excellent work on articulation understanding. However, I could not reproduce the performance reported in the paper.

I am using the code and data released in this repo. I tested on both the laptop and the stapler and got the following results.

Validation metrics for stapler:

{
  "nvs": {
    "psnr": 25.620492935180664,
    "ssim": 0.9675073623657227
  },
  "motion": {
    "ang_err": 1.7084606885910034,  # 0.069 / 10 reported in Table 2?
    "pos_err": 0.6383935809135437,  # 0.006 / 10 reported in Table 2?
    "geo_dist": 138.68028259277344
  },
  "surface": {
    "CD-w": 0.024857331067323685,
    "CD-s": 0.0019455505535006523,
    "CD-d": 0.22979260981082916
  }
}

Validation metrics for laptop:

{
  "nvs": {
    "psnr": 29.823352813720703,
    "ssim": 0.9646984338760376
  },
  "motion": {
    "ang_err": 0.048456642776727676,  # 0.034 / 10 reported in Table 2?
    "pos_err": 0.00017119155381806195,  # 0.001 / 10 reported in Table 2?
    "geo_dist": 0.03956468030810356
  },
  "surface": {
    "CD-w": 0.01599108800292015,
    "CD-s": 0.0002230351383332163,
    "CD-d": 0.03337900713086128
  }
}

The metrics for the laptop look OK, but the stapler results are far worse than reported. I also tried increasing the number of iterations, but that did not seem to help. Is there anything important I missed when running the script? I would highly appreciate any advice on this.

I have also uploaded the output images for the laptop and stapler here:

[image: it32500_3]

[image: it47000_4]

SevenLJY commented 8 months ago

Hi @DJNing! Thanks for your interest in our work.

Our training process is essentially an optimization process, which can be easily affected by randomness. I also encountered the unstable training issue with specific cases (e.g. stapler, washer) as you reported. From my experience, there are varying degrees of training difficulty across the ten cases reported in the paper. Objects with relatively thin parts or fewer visual cues present a particular challenge as there are fewer opportunities for these regions to establish the correct correspondence. This can be considered a limitation inherent to our method.

In our experiments, we trained on these challenging cases multiple times, and they succeed only in some of the runs. Increasing the number of views can, in many cases, make a good result more likely.
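For anyone repeating a hard case several times, a small helper like the one below can summarize the runs. It is only a sketch and not part of this repo: it assumes each run's validation metrics have already been loaded as a dict in the same shape as the JSON posted above, keyed by a seed or run index of your choosing. The names `pick_best_run` and `success_rate` and the tolerance values are hypothetical, not thresholds from the paper.

```python
from typing import Dict, Tuple

# One run's metrics, in the shape posted earlier in this thread,
# e.g. {"motion": {"ang_err": ..., "pos_err": ...}, ...}
Metrics = Dict[str, Dict[str, float]]


def pick_best_run(runs: Dict[int, Metrics]) -> Tuple[int, Metrics]:
    """Return (seed, metrics) for the run with the lowest angular error.

    Ties on ang_err are broken by the lower pos_err.
    """
    if not runs:
        raise ValueError("no runs given")
    return min(
        runs.items(),
        key=lambda kv: (kv[1]["motion"]["ang_err"], kv[1]["motion"]["pos_err"]),
    )


def success_rate(runs: Dict[int, Metrics], ang_tol: float, pos_tol: float) -> float:
    """Fraction of runs whose motion errors are both within the given tolerances.

    ang_tol and pos_tol are user-chosen assumptions, not values from the paper.
    """
    if not runs:
        raise ValueError("no runs given")
    ok = sum(
        1
        for m in runs.values()
        if m["motion"]["ang_err"] <= ang_tol and m["motion"]["pos_err"] <= pos_tol
    )
    return ok / len(runs)
```

Calling `pick_best_run(runs)` on the collected metrics gives the seed to keep, and `success_rate(runs, ang_tol, pos_tol)` gives a rough idea of how often a given object converges under your setup.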

Hope it helps.

DJNing commented 8 months ago

Thanks for your prompt reply!

3dlg-hcvc / paris

Failed to reproduce the experiments #10