Closed · sinewy333 · closed 4 years ago
Hi, I am not sure what you changed in order to run the code in TensorFlow, since it is originally written for Theano. But as far as I can see from the information you mentioned, there are at least two problems:
Is the data used for training "depth_1" in the NYU dataset, or "synthdepth"? When I used "depth_1", I found that the code did not detect the position of the hand very well. When I used "synthdepth", the code detected the position of the hand well. Thank you very much.
Yes, that is correct. The data used for training is the "depth". The "synthdepth" is simply a rendering of a 3D hand model, thus it does not work well for real camera data.
The hand detector code is based on gtorig[13], which comes from the label data. But there is no such data in practice. How do we crop out images that contain only the hand? Thank you very much.
If I understand your question correctly, we use gtorig[13] for training the localizer. During testing, we first use the center of mass for hand detection and the trained localizer for refining this location.
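For reference, a minimal sketch of the center-of-mass step; the depth thresholds and names here are illustrative, not the repository's actual implementation:

```python
import numpy as np

def center_of_mass(dpt, min_depth=100., max_depth=1500.):
    """Initial hand localization: center of mass of the foreground
    depth pixels. dpt is a depth map in mm; min_depth/max_depth are
    hypothetical thresholds that mask out the background."""
    mask = (dpt > min_depth) & (dpt < max_depth)
    if not mask.any():
        return None                    # no foreground pixels found
    v, u = np.nonzero(mask)            # pixel coordinates of the hand
    z = dpt[mask]                      # their depth values
    return np.array([u.mean(), v.mean(), z.mean()])  # (u, v, d) center

# The crop around this center is then refined by the trained localizer
# (main_nyu_com_refine) before pose estimation.
```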
Is main_nyu_com_refine used for refining the location of the hand? Then is main_nyu_posereg_embedding used for training the locations of the joints, and finally ORRef for refining the joint locations?
Yes, main_nyu_com_refine is for refining the location, and main_nyu_posereg_embedding for pose prediction. ORRef is not published in this repository, but you can optionally add it yourself.
I understand. Thanks a lot!
The test results of the model I trained came out. The results on test2 were much worse than those on test1. Is it because the subjects in the two sets are different? Does the palm size of different people affect the accuracy of the test? Are the results in your report from test1 or test2? I look forward to your reply. Thank you!
| distance (mm) | train | test1 | test2 |
|---|---|---|---|
| 10 | 25.84% | 15.82% | 0.12% |
| 20 | 74.57% | 52.01% | 11.44% |
| 30 | 88.89% | 73.44% | 32.71% |
| 40 | 94.54% | 82.38% | 48.61% |
| 50 | 97.43% | 89.18% | 64.52% |
| 60 | 98.85% | 93.81% | 76.74% |
| 70 | 99.43% | 97.01% | 85.31% |
| 80 | 99.76% | 98.44% | 90.31% |
This is the result of my test (the fraction of frames whose maximum joint error is within the given distance).
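For anyone reproducing this table, the metric can be computed as below; a minimal sketch assuming `gt` and `pred` are arrays of shape (frames, joints, 3) in millimeters (the names are illustrative):

```python
import numpy as np

def frames_within_distance(gt, pred, thresholds=(10, 20, 30, 40, 50, 60, 70, 80)):
    """Fraction of frames whose *maximum* per-joint error is below
    each threshold. gt, pred: (num_frames, num_joints, 3) in mm."""
    per_joint_err = np.linalg.norm(gt - pred, axis=2)  # (frames, joints)
    max_err = per_joint_err.max(axis=1)                # worst joint per frame
    return {t: float((max_err <= t).mean()) for t in thresholds}
```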
You are correct that the results for test2 are worse. This is because test2 is a different user with a different hand size than the training user. Therefore, you are encouraged to adjust the crop size of the hand cube accordingly. The evaluation in the report is from the combined set of test1+test2.
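A sketch of what such an adjustment might look like; the 300 mm default edge length and the scale factor are illustrative values, not the repository's actual configuration:

```python
# Hypothetical per-user scaling of the hand crop cube (values illustrative).
DEFAULT_CUBE = (300., 300., 300.)  # mm, edge lengths used for the training user

def scaled_cube(hand_scale):
    """hand_scale ~ ratio of the test user's hand size to the training
    user's, e.g. 0.87 for a smaller hand; this keeps the hand at a similar
    relative size inside the normalized crop."""
    return tuple(hand_scale * c for c in DEFAULT_CUBE)

cube_test2 = scaled_cube(0.87)  # e.g. crop a smaller cube for a smaller hand
```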
I was using the main_nyu_posereg_embedding code in TensorFlow. The output is:

```
Training epoch 100, batch_num 1135, Minibatch Loss = 1.9201
Testing ...
Mean error: 483.2774520085452 mm, max error: 682.6990653506009 mm
Testing baseline
Mean error: 33.98014831542969 mm
Mean error: 615.936767578125 mm
```

I have tried changing all of the parameters, but I cannot reduce the error value. This problem has been bothering me for a long time, so I hope you can tell me: is this result correct? Thank you very much!
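One common cause of mean errors in the hundreds of millimeters is evaluating the raw network output, which in DeepPrior-style pipelines is typically normalized to the crop cube, without transforming it back to camera coordinates in mm. A minimal sketch of that denormalization, assuming predictions in [-1, 1] relative to the crop center; the cube size and all names here are illustrative:

```python
import numpy as np

def denormalize_joints(pred_norm, com, cube=(300., 300., 300.)):
    """Hypothetical denormalization: map network output in [-1, 1],
    relative to the crop cube centered at the CoM, back to mm.
    pred_norm: (joints, 3); com: (3,) in camera coordinates (mm)."""
    return pred_norm * (np.asarray(cube) / 2.) + com

# Errors of several hundred mm often mean this step (or the inverse of
# whatever normalization was used during training) was skipped.
```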