Gengzigang / PCT

This is an official implementation of our CVPR 2023 paper "Human Pose as Compositional Tokens" (https://arxiv.org/pdf/2303.11638.pdf)
MIT License

Does this model only work well with the Swin backbone? #4

Open gmk11 opened 1 year ago

gmk11 commented 1 year ago

I tried changing the backbone to see how the model performs. I retrained my backbone (ResNet) with heatmap supervision on COCO, as described in the repository's documentation, then trained the tokenizer and the classifier, but I get poor precision for the final model (AP = 0.150 and AP@0.5 = 0.4). Is there something I missed, or does the model only work with Swin?

Gengzigang commented 1 year ago

We have not tried using ResNet as a backbone. Could you provide more details on the performance of each of your three steps (heatmap, tokenizer, classifier)?

gmk11 commented 1 year ago

I checked my code and saw that the heatmap weights weren't initialized for my backbone; I am now retraining the tokenizer and the classifier. I got this performance for my heatmap training:

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.412
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.760

and this for the tokenizer:

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.843
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.975

gmk11 commented 1 year ago

@Gengzigang By the way, if the tokenizer doesn't depend on the backbone, what would be the consequences of using the well-trained tokenizer weights pretrained with the Swin backbone in my model with ResNet as the backbone?

If I understood your paper correctly, when we train the tokenizer, it is the encoder, codebook, and decoder weights that are updated. And when we train the classifier, we use this previously pretrained decoder to decode the poses; with a strong decoder the poses will be recovered better, right? So the performance should be better.
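
As a sanity check of that reading, here is a minimal sketch of the two-stage parameter updates — encoder/codebook/decoder all trained in stage 1, decoder frozen while the classifier trains in stage 2. The class and layer names are illustrative only, not PCT's actual code:

```python
import torch.nn as nn

# Hypothetical minimal tokenizer; layer names and sizes are illustrative only.
class Tokenizer(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(34, 16)        # stage 1: updated
        self.codebook = nn.Embedding(1024, 16)  # stage 1: updated
        self.decoder = nn.Linear(16, 34)        # stage 1: updated, stage 2: frozen

def freeze_decoder(tokenizer: nn.Module) -> None:
    """Stage 2 (classifier training): keep the pretrained decoder fixed."""
    for p in tokenizer.decoder.parameters():
        p.requires_grad = False

tokenizer = Tokenizer()
freeze_decoder(tokenizer)
# Only encoder/codebook parameters remain trainable after freezing.
trainable = {name for name, p in tokenizer.named_parameters() if p.requires_grad}
```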

gmk11 commented 1 year ago

@Gengzigang ???

Gengzigang commented 1 year ago

If you want to use ResNet as the backbone and train the classifier using a pre-trained tokenizer, there should be no problems. You can choose to directly load the backbone-agnostic tokenizer we provided (https://github.com/Gengzigang/PCT#remove-image-guidance), which has been trained without image guidance, to train your classifier.
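
A rough sketch of how loading such a pretrained tokenizer into a new model can work: filter the checkpoint's state dict down to the tokenizer's parameters before loading. The key prefix and helper name below are hypothetical; inspect the actual checkpoint's keys first.

```python
def select_tokenizer_weights(state_dict, prefix="tokenizer."):
    """Keep only the tokenizer's parameters from a full checkpoint and strip
    the prefix, so they can be loaded with load_state_dict(..., strict=False).
    The prefix is illustrative; check your checkpoint's actual key names."""
    return {k[len(prefix):]: v for k, v in state_dict.items()
            if k.startswith(prefix)}

# Illustrative usage with a fake checkpoint dict in place of torch.load(...):
ckpt = {"backbone.conv1.weight": 0, "tokenizer.decoder.weight": 1}
tok_weights = select_tokenizer_weights(ckpt)
```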

Gengzigang commented 1 year ago

Regarding what you mentioned about a better decoder: our experiments have shown that the decoder we are currently using is already sufficient. If the classification is completely accurate, the average precision (AP) can reach 99.

gmk11 commented 1 year ago

Thanks. What do you think about my heatmap results? Are they sufficient?

Gengzigang commented 1 year ago

You can directly use the ResNet backbone of Simple Baselines (https://arxiv.org/abs/1804.06208). MMPose supports it.
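
For reference, a SimpleBaselines-style ResNet backbone in an MMPose config is typically just a dict entry. The field names follow MMPose conventions, but the depth and checkpoint string below are placeholders to adapt:

```python
# Sketch of an MMPose-style backbone config for SimpleBaselines' ResNet.
# The checkpoint string is a placeholder; point it at your pretrained weights.
backbone = dict(
    type="ResNet",
    depth=50,
    init_cfg=dict(type="Pretrained", checkpoint="torchvision://resnet50"),
)
```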

gmk11 commented 1 year ago

This is what I used... Did you get better precision when you trained the heatmaps of your models, I mean AP over 70?

Gengzigang commented 1 year ago

Yes, our backbone is well-trained.

gmk11 commented 1 year ago

thanks for your time and responses

ll8657 commented 12 months ago

> I checked my code and saw that the heatmap weights weren't initialized for my backbone; I am now retraining the tokenizer and the classifier. I got this performance for my heatmap training: Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.412 Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.760
>
> and this for the tokenizer: Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.843 Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.975

Excuse me, how do you enable custom backbone network weights?

gmk11 commented 12 months ago

I created a backbone following the implementation of the Swin backbone, and I loaded my weights into it the way you would with a ResNet backbone from PyTorch. I deactivated the initial weight-loading function when I trained the heatmap and tokenizer, and reactivated it when I trained the classifier.

For example, you can create a ResNet class in the same file where the Swin class is implemented, then replace the backbone in the file where backbone='swin' with backbone='resnet'.

Hope it helps.
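
The swap described above boils down to a name-to-class lookup. A toy sketch of the idea (the registry and class names are illustrative, not the repo's actual code):

```python
# Toy backbone registry; PCT/MMPose resolve the config's backbone name similarly.
BACKBONES = {}

def register_backbone(name):
    """Decorator that maps a config string to a backbone class."""
    def wrap(cls):
        BACKBONES[name] = cls
        return cls
    return wrap

@register_backbone("swin")
class SwinBackbone:
    pass

@register_backbone("resnet")
class ResNetBackbone:
    pass

def build_backbone(cfg):
    """Instantiate whichever backbone the config names."""
    return BACKBONES[cfg["backbone"]]()

# Changing backbone='swin' to backbone='resnet' in the config picks the new class.
model = build_backbone({"backbone": "resnet"})
```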

ll8657 commented 12 months ago

> I created a backbone following the implementation of the Swin backbone, and I loaded my weights into it the way you would with a ResNet backbone from PyTorch. I deactivated the initial weight-loading function when I trained the heatmap and tokenizer, and reactivated it when I trained the classifier.
>
> For example, you can create a ResNet class in the same file where the Swin class is implemented, then replace the backbone in the file where backbone='swin' with backbone='resnet'.
>
> Hope it helps.

Thanks for your warm reply. I have now swapped in a custom backbone network and trained it with heatmap supervision to a final accuracy of AP = 0.74. Since I don't see code that imports the weights, if I want to train the tokenizer and classifier with these weights, do I just need to modify backbone='hmp_backbone' in the config file? Will PyTorch automatically import my heatmap-supervision weights?
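
On the "automatic import" question: PyTorch will not load weights just because a config names a backbone; something has to call load_state_dict. A minimal sketch of carrying heatmap-stage backbone weights into the next stage, using strict=False to load only the backbone (the model layout is illustrative, not PCT's actual code):

```python
import torch
import torch.nn as nn

# Illustrative two-part model; real pose models are larger but the idea holds.
class PoseModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(4, 8)
        self.head = nn.Linear(8, 2)

heatmap_model = PoseModel()
# Pretend these are the weights saved after heatmap-supervised training (step 1).
backbone_ckpt = {k: v for k, v in heatmap_model.state_dict().items()
                 if k.startswith("backbone")}

# strict=False lets us load only the backbone into the stage-2 model;
# the head's keys are simply reported as missing, not treated as errors.
stage2_model = PoseModel()
result = stage2_model.load_state_dict(backbone_ckpt, strict=False)
```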

JasOleander commented 3 months ago

@Gengzigang Hello, could you tell me the training time with and without image guidance for the tokenizer? I didn't see that in your paper. Also, what is the output of the decoder (integers or logits) for the classifier you mentioned in Figure 1?

JasOleander commented 3 months ago

> I checked my code and saw that the heatmap weights weren't initialized for my backbone; I am now retraining the tokenizer and the classifier. I got this performance for my heatmap training: Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.412 Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.760
>
> and this for the tokenizer: Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.843 Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.975

Hello. Did you retrain the hmp_base.py file with pretrained ResNet model weights, or the tokenizer.py file with pretrained ResNet heatmap weights?