qqlu / Entity

EntitySeg Toolbox: Towards Open-World and High-Quality Image Segmentation

"High Quality Segmentation for Ultra High-resolution Images": no visible difference in position information between images #24

Open trinh-hoang-hiep opened 2 years ago

trinh-hoang-hiep commented 2 years ago

Hello, and thanks for your great work on High Quality Segmentation. I have the following two questions:

  1. I don't see any difference in the position information P between images. Even within a single image, the variable rel_cell has the same value at every location. So how can CRM generate detailed segmentation masks?
  2. What is the purpose of computing the 3 features in P? They appear to be fixed values that are simply concatenated onto every feature of the image. Below are the maps that I printed out for these variables.

[two attached images: printed maps of the variables]

tcShen commented 2 years ago

(1) rel_cell is not always the same: when the input resolution changes while the target resolution stays constant, its value changes. Also, the detailed segmentation masks that CRM generates do not come from rel_cell alone; other parts of the design contribute to the final results as well. (2) Position information being fixed does not make it unsuitable for representation. The implicit representation uses both fixed and learnable position information. Thanks.
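
To make point (1) concrete, here is a minimal sketch of why rel_cell is identical at every location of one image yet differs between input resolutions. It assumes a LIIF-style definition (the size of one target pixel in normalized [-1, 1] coordinates, rescaled by the feature-map size) and assumes the feature-map size follows the input resolution. The function names `make_cell`, `rel_cell`, and `rel_cell_map` are hypothetical, and this is not taken from the CRM code.

```python
# Sketch only: LIIF-style rel_cell, assumed here rather than copied from CRM.

def make_cell(target_h, target_w):
    """Size of one target pixel in normalized [-1, 1] coordinates."""
    return (2.0 / target_h, 2.0 / target_w)


def rel_cell(feat_h, feat_w, target_h, target_w):
    """Target-pixel size measured in units of one feature-map cell."""
    cell_h, cell_w = make_cell(target_h, target_w)
    return (cell_h * feat_h, cell_w * feat_w)


def rel_cell_map(feat_h, feat_w, target_h, target_w):
    """rel_cell for every query pixel; the value does not depend on (y, x)."""
    value = rel_cell(feat_h, feat_w, target_h, target_w)
    return [[value for _ in range(target_w)] for _ in range(target_h)]


if __name__ == "__main__":
    # Same target resolution, two different input (feature) resolutions.
    coarse = rel_cell_map(feat_h=16, feat_w=16, target_h=64, target_w=64)
    fine = rel_cell_map(feat_h=32, feat_w=32, target_h=64, target_w=64)

    # Within one image, every location holds the same value ...
    print(len({v for row in coarse for v in row}))
    # ... but the value changes when the input resolution changes.
    print(coarse[0][0], fine[0][0])
```

Under these assumptions, a printed map of rel_cell for a single image looks flat, which matches the observation in the question. The value only carries information when the same network sees several input resolutions against a fixed target resolution.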