xingyizhou / DeepModel

Code repository for Model-based Deep Hand Pose Estimation
GNU General Public License v3.0

performance in real world #18

Open wishinger-li opened 7 years ago

wishinger-li commented 7 years ago

Hi, thank you for your work! I ran the code on some pictures from a real camera and got bad results. I followed these steps: a. get a depth image from the camera; b. crop the region containing the hand; c. run the code. I looked into the images and found that the only difference between mine and NYU is that my images are noisier (Gaussian noise) than NYU. Could this be the cause of the bad results? Or did you run any experiments in the real world, and if so, how did it behave? Thanks!

strawberryfg commented 7 years ago

Hi Wishinger,

Thank you for your interest! The preprocessing of the depth image in a real-world scenario is vital, as images with a cluttered background (e.g. a human face or body) can lead to bad results. The normalization of the 3D joint locations also plays an important role in achieving accurate results. Would you mind uploading some of the bad results, including the raw depth images, for reference?
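For reference, a minimal sketch of the kind of preprocessing I mean (placeholder names, not the actual code in this repository): crop a fixed metric cube around a hand reference depth, map the depth values to [-1, 1], and normalize the 3D joints with the same center and cube size.

 % Minimal sketch of NYU-style depth normalization (placeholder names,
 % not this repository's code). depth_mm: cropped depth patch in mm,
 % center_d: hand reference depth in mm, cube_mm: crop cube size in mm.
 function patch = normalize_depth(depth_mm, center_d, cube_mm)
 lo = center_d - cube_mm/2;
 hi = center_d + cube_mm/2;
 d = double(depth_mm);
 d(d == 0 | d < lo | d > hi) = hi;      % push invalid/background pixels to the far plane
 patch = (d - center_d) / (cube_mm/2);  % map [lo, hi] to [-1, 1]
 % the 3D joint coordinates should be normalized with the same center_d and cube_mm
 end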

One limitation of the NYU dataset is that its pose variety, viewpoint variety (ego-centric or not), and shape variety (different hand scales) are limited. The training set covers a single person, and only the first 2452 images of the test set contain the same person.

Here is a dataset with millions of hand depth images: http://icvl.ee.ic.ac.uk/hands17/. It is much more diverse than NYU or ICVL (I do not know much about the MSRA dataset). Maybe you can try DeepModel on it, and please let me know if you make further progress.

Thank you in advance!

Warmest, Qingfu

wishinger-li commented 7 years ago

rawHands.tar.gz

Hi, thanks very much for your reply! I will try the "hands17" dataset (it is downloading at the moment). Here are my experiments (images attached):

test0.png: the image I cut from 772.png, provided in the folder test_images
test0_result.png: the predicted result for test0.png
test1.png: an image captured with my own camera
test1_result.png: the predicted result for test1.png

The raw data is as follows: size 224*171, element type float. I read it with a MATLAB script:

path = ['z/axies_z_', num2str(ii), '.dat'];       % raw depth file for frame ii
 img_path = ['z_img/axies_z_', num2str(ii), '.png'];
 fid = fopen(path, 'r');
 [A, COUNT] = fread(fid, [224, 171], 'float');    % depth in metres, 224x171 floats
 fclose(fid);
 A(A < 0.1) = 0;                                  % drop points nearer than 0.1 m
 A(A > 0.42) = 0;                                 % drop points farther than 0.42 m
 A = uint8(A*1000 - 397);                         % *1000: m -> mm; 397 mm is the manually selected palm-center depth
 imwrite(A', img_path);                           % transpose because fread fills column-wise

Then I manually cropped the palm, resized it to [128,128], and ran the experiments above. Any suggestions are welcome!
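For what it's worth, the cropping and resizing step I describe above looks roughly like this (a sketch; the function name, the manually picked palm-center pixel (cu, cv), and the window size win are placeholders, not code from the repository):

 % Rough sketch of cropping a window around a manually picked palm center
 % and resizing to the 128x128 network input (placeholder names).
 function crop = crop_and_resize(depth_mm, cu, cv, win, out_size)
 [h, w] = size(depth_mm);
 r1 = max(cv - win, 1);  r2 = min(cv + win, h);
 c1 = max(cu - win, 1);  c2 = min(cu + win, w);
 crop = depth_mm(r1:r2, c1:c2);                          % keep depth in mm, not uint8
 crop = imresize(crop, [out_size out_size], 'nearest');  % nearest avoids mixing depths at edges
 end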

strawberryfg commented 7 years ago

Well, I think the wrong results are caused by the data preprocessing. The proportion of the hand region after cropping should actually be smaller than in your images. You can also add a scaling augmentation strategy, like the one in the recent paper "DeepPrior++", if you like.
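A rough sketch of the scaling-augmentation idea (illustrative names only, not code from DeepPrior++ or this repository) is to randomly grow or shrink the crop cube and renormalize both the depth patch and the joints with it:

 % Sketch of cube-scaling augmentation (illustrative names only).
 % depth_mm: cropped depth in mm; center_d: hand reference depth in mm;
 % joints_mm: 3D joints relative to the hand center, in mm; cube_mm: base cube size.
 function [patch, joints_n] = scale_augment(depth_mm, center_d, joints_mm, cube_mm)
 s = 0.8 + 0.4*rand();                 % random scale factor in [0.8, 1.2]
 cube = cube_mm * s;
 lo = center_d - cube/2;  hi = center_d + cube/2;
 d = double(depth_mm);
 d(d == 0 | d < lo | d > hi) = hi;     % clamp background/invalid pixels to the far plane
 patch = (d - center_d) / (cube/2);    % renormalize depth to [-1, 1]
 joints_n = joints_mm / (cube/2);      % normalize joints with the same cube
 end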

popper0912 commented 5 years ago

Hi! Did you manage to fix it? I am running into the same problem.