Hello,
I have a couple of questions regarding the tagging method that I hope you will be able to address.
For the file group.py class Params:
For self.partOrder = [i-1 for i in [1,2,3,4,5,6,7,12,13,8,9,10,11,14,15,16,17]], would you be able to explain where this order comes from exactly?
For self.max_num_people = 30, is there a specific reason why this is set to 30, or was it chosen empirically to cover the maximum number of people that might appear in COCO images? If so, would reducing or increasing this number for my own dataset have any effect on the output?
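To make the question concrete, here is a small sketch of what I get when I evaluate that expression and map the indices to joint names. The name list is an assumption on my part: I am guessing the joints follow the standard COCO keypoint order, which may not match what the repository actually uses, and the variable names are my own.

```python
# Assumption: joints follow the standard COCO keypoint order (not confirmed by the repo).
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# Same expression as in Params: turns 1-based joint numbers into 0-based indices.
part_order = [i - 1 for i in [1, 2, 3, 4, 5, 6, 7, 12, 13, 8, 9, 10, 11, 14, 15, 16, 17]]

# Joint names in the order given by part_order.
ordered_names = [COCO_KEYPOINTS[i] for i in part_order]

for rank, (idx, name) in enumerate(zip(part_order, ordered_names)):
    print(rank, idx, name)
```

If the COCO assumption holds, the order is head and shoulders first, then hips, then elbows and wrists, then knees and ankles, which is what prompted the question.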
Why is pooling used as the choice for NMS? And is there any specific reason for kernel=3, padding=1? I believe their purpose is to preserve the shape of the input map; if so, would kernel=5, padding=2 also be a valid choice?
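For context, this is how I currently understand pooling-based NMS, written as a minimal pure-Python sketch rather than the repository's code. The helper names (max_pool2d, nms, peaks) and the toy heatmap are mine, and I am assuming stride 1 with the border padded by negative infinity.

```python
def max_pool2d(heatmap, kernel, padding):
    """Stride-1 max pooling over a 2D list; out-of-bounds cells count as -inf."""
    h, w = len(heatmap), len(heatmap[0])
    out_h = h + 2 * padding - kernel + 1
    out_w = w + 2 * padding - kernel + 1
    pooled = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            best = float("-inf")
            for dr in range(kernel):
                for dc in range(kernel):
                    rr, cc = r + dr - padding, c + dc - padding
                    if 0 <= rr < h and 0 <= cc < w:
                        best = max(best, heatmap[rr][cc])
            row.append(best)
        pooled.append(row)
    return pooled


def nms(heatmap, kernel=3, padding=1):
    """Keep a value only where it equals the maximum of its neighbourhood."""
    pooled = max_pool2d(heatmap, kernel, padding)
    if len(pooled) != len(heatmap) or len(pooled[0]) != len(heatmap[0]):
        raise ValueError("kernel/padding do not preserve the heatmap shape")
    return [
        [v if v == m else 0.0 for v, m in zip(row, pooled_row)]
        for row, pooled_row in zip(heatmap, pooled)
    ]


def peaks(heatmap, kernel, padding):
    """List (row, col, value) for every non-zero cell that survives NMS."""
    kept = nms(heatmap, kernel, padding)
    return [(r, c, v) for r, row in enumerate(kept) for c, v in enumerate(row) if v > 0]


# Toy heatmap with a strong peak at (1, 1) and a weaker one at (2, 3).
heatmap = [
    [0.1, 0.2, 0.1, 0.0, 0.0],
    [0.2, 0.9, 0.3, 0.0, 0.0],
    [0.1, 0.3, 0.2, 0.6, 0.1],
    [0.0, 0.0, 0.1, 0.2, 0.1],
    [0.0, 0.0, 0.0, 0.1, 0.0],
]

print(peaks(heatmap, kernel=3, padding=1))
print(peaks(heatmap, kernel=5, padding=2))
```

In this sketch both settings preserve the map shape, since padding = (kernel - 1) / 2, but the 5x5 window suppresses the weaker peak because it lies within two pixels of the stronger one. So my question is really whether the larger window would merge nearby joints in practice.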
What is the reasoning behind the choice of threshold values for self.detection_threshold = 0.2 and self.tag_threshold = 1.?
Would you be able to briefly explain how match_by_tag(*) works? And, what is the purpose
For the file group.py class HeatmapParser:
Would you be able to briefly explain the purpose of adjust(*)?
Sorry for the long post. Please bear in mind that I have limited experience in the area (I am just getting into keypoint prediction), so some of the questions might be absolutely trivial to answer (I hope). Anyway, thank you in advance for your help. It's a great piece of work with some really awesome insights!