facebookresearch / InterHand2.6M

Official PyTorch implementation of "InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image", ECCV 2020

ImportError: Cannot load backend 'TkAgg' which requires the 'tk' interactive framework, as 'headless' is currently running #32

Open · opened by kdafnis 3 years ago

kdafnis commented 3 years ago

Hi,

I am trying to run demo.py on the sample image in the demo folder. Could you please help me address this problem?

File "demo.py", line 25, in from utils.vis import vis_keypoints, vis_3d_keypoints File "/InterHand2.6M-master/main/../common/utils/vis.py", line 14, in import matplotlib.pyplot as plt File "/anaconda3/envs/interhand/lib/python3.6/site-packages/matplotlib/pyplot.py", line 2336, in switch_backend(rcParams["backend"]) File "/anaconda3/envs/interhand/lib/python3.6/site-packages/matplotlib/pyplot.py", line 287, in switch_backend newbackend, required_framework, current_framework)) ImportError: Cannot load backend 'TkAgg' which requires the 'tk' interactive framework, as 'headless' is currently running

I have already tried to change 'tkagg' to 'TKAgg' in vis.py as below:

```python
import matplotlib
matplotlib.use('TKAgg')
```

Then:

```python
import matplotlib.pyplot as plt
```

Thanks in advance!

kdafnis commented 3 years ago

I found a solution that worked for me: I simply deleted the matplotlib.use('TKAgg') line entirely. However, the performance is not as expected.
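(On the backend error itself: a headless-safe alternative, sketched here and not taken from this repo, is to select matplotlib's non-interactive Agg backend before pyplot is imported.)

```python
import matplotlib
# 'Agg' renders to an off-screen raster buffer, so it needs no display
# server; figures are saved with savefig() instead of shown in a window.
matplotlib.use('Agg')
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1])
fig.savefig('check.png')
```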

(image: result_2d_cropped)

I tried to run the demo on a test image, but the result was really poor. I used a custom-made bounding box that contains both hands. What do you think? Is it because of the bounding box or the resolution of the image?

mks0601 commented 3 years ago

That result indicates your bbox is wrong. The model takes a cropped hand image as input, so it cannot output coordinates outside of the bbox. One common mistake is setting the bbox as [xmin, ymin, xmax, ymax]. As written in the code, you should set the bbox as [xmin, ymin, width, height]. Please save the cropped image for debugging.
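(In case it helps, converting between the two conventions is trivial; `xyxy_to_xywh` below is an illustrative helper, not a function from this repo.)

```python
def xyxy_to_xywh(bbox):
    """Convert [xmin, ymin, xmax, ymax] to [xmin, ymin, width, height]."""
    xmin, ymin, xmax, ymax = bbox
    return [xmin, ymin, xmax - xmin, ymax - ymin]

# a corner-format box becomes the format demo.py expects
print(xyxy_to_xywh([265, 185, 293, 255]))  # -> [265, 185, 28, 70]
```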

kdafnis commented 3 years ago

Hi,

Thank you very much for your response. Please find attached a sample image and the cropped bounding box.

(images: bbox, result_2d)

mks0601 commented 3 years ago

How did you save the cropped image?

kdafnis commented 3 years ago

I used:

```python
img_path = '36.png'
original_img = cv2.imread(img_path)
original_img_height, original_img_width = original_img.shape[:2]

# prepare bbox
bbox = [265, 185, 28, 70]  # xmin, ymin, width, height
bbox = process_bbox(bbox, (original_img_height, original_img_width, original_img_height))
img, trans, inv_trans = generate_patch_image(original_img, bbox, False, 1.0, 0.0, cfg.input_img_shape)
cv2.imwrite('./bbox.png', img)
img = transform(img.astype(np.float32)) / 255
img = img.cuda()[None, :, :, :]

# forward
inputs = {'img': img}
targets = {}
meta_info = {}
```

etc.
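(A quick sanity check before cropping, shown as an illustrative debug step that is not part of demo.py, is to draw the box directly on the original frame so a format mix-up is obvious at a glance.)

```python
import cv2

x, y, w, h = 265, 185, 28, 70  # xmin, ymin, width, height
vis = cv2.imread('36.png')
# rectangle() takes two opposite corner points, so width/height are added here
cv2.rectangle(vis, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('./bbox_check.png', vis)
```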

mks0601 commented 3 years ago

Oh, I think there was a bug. I changed it here. Could you try again?

kdafnis commented 3 years ago

I tried it again. Here is my result:

(images: bbox, result_2d)

mks0601 commented 3 years ago

Could you give me the original image file? I tried it on your result image, and the result is as below.

(image: result_2d)

As I said, the model cannot predict coordinates outside of the bbox area. Unfortunately, I think the result will not be very good, especially when the images are in-the-wild ones. The model is trained only on the InterHand2.6M dataset, whose image appearance is far from that of in-the-wild images. You may want to check my recent works that address this problem: paper1, paper2.

kdafnis commented 3 years ago

Sure. The original image is the following:

(images: 36_1, bbox)

The bounding box information is: `bbox = [265, 185, 28, 70]  # xmin, ymin, width, height`

mks0601 commented 3 years ago

(image: result_2d)

As I mentioned, the result is not that good. You may want to check my recent works introduced above.

kdafnis commented 3 years ago

Thanks, I will check them as well. Could you please provide the code you used to produce the result above, since it is slightly better than my result?

mks0601 commented 3 years ago

I just cloned this repo. Nothing changed.

kdafnis commented 3 years ago

This seems weird, since I used the same code with the same bounding box (`bbox = [265, 185, 28, 70]  # xmin, ymin, width, height`) and the result doesn't match yours.

I will also check the papers you introduced above. Thanks.

ravitejageeda commented 3 years ago

@mks0601 I am trying to give the bbox of one hand, either right or left, but I can see that the model is predicting both hands... and the prediction is not very good.

(image)

```
-x 291 -y 87 -wd 187 -ht 241   # right hand
-x 97  -y 87 -wd 203 -ht 260   # left hand
-x 87  -y 60 -wd 408 -ht 277   # both hands
```
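(For reference, those flag values map to the [xmin, ymin, width, height] list format used earlier in this thread; this is only a restatement of the same numbers.)

```python
bbox_right = [291, 87, 187, 241]
bbox_left  = [97, 87, 203, 260]
bbox_both  = [87, 60, 408, 277]
```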

mks0601 commented 3 years ago

I think the result is not very good, but reasonable? Although you made a tight bbox, the process_bbox function extends the bbox, so the other hand is included in the input image. You can check this by saving the cropped image. Nevertheless, the model provided in this repo would not generalize well to unseen images because it is trained only on our dataset, whose image appearance is far from that of in-the-wild images. In-the-wild datasets, such as MSCOCO, should be used together to train the model. The important thing about this dataset is that it is the only one that provides GT 3D poses of interacting hands. My recent work used MSCOCO together.

(image: result_2d)
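(The extension described above works roughly like this; a simplified sketch of a process_bbox-style function, where the aspect-ratio and scale constants are assumptions rather than the repo's exact values.)

```python
def expand_bbox(bbox, aspect_ratio=1.0, scale=1.25):
    """Grow a [xmin, ymin, width, height] box around its center to match a
    target aspect ratio, then enlarge it by `scale` to add context."""
    xmin, ymin, w, h = bbox
    cx, cy = xmin + w / 2.0, ymin + h / 2.0
    # pad the shorter side so that w / h == aspect_ratio
    if w > aspect_ratio * h:
        h = w / aspect_ratio
    else:
        w = h * aspect_ratio
    w, h = w * scale, h * scale
    return [cx - w / 2.0, cy - h / 2.0, w, h]

# The tight 28x70 right-hand box grows into a much wider square crop,
# which is how the other hand can end up inside the network input.
print(expand_bbox([265, 185, 28, 70]))  # -> [235.25, 176.25, 87.5, 87.5]
```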

ravitejageeda commented 3 years ago

Thank you for the reply... When I select an individual hand, I get good results, but when I give the bbox for a single hand and select both hands, I get both-hand predictions that are not very good, just as you said the function would produce. Could you please share the GitHub link to the new work if it is uploaded? I don't see it now.

mks0601 commented 3 years ago

The code for the new work will be released once it is accepted. Thanks!