facebookresearch / sapiens

High-resolution models for human tasks.
https://about.meta.com/realitylabs/codecavatars/sapiens/
Other
4.31k stars 230 forks

Det replace seg for depth? #131

Closed: GloryyrolG closed this issue 2 weeks ago

GloryyrolG commented 2 weeks ago

hi @rawalkhirodkar et al.,

Thanks for your great contribution to the community!

May I ask why you don't use a person detector such as detectron2 or YOLOX in place of the sensitive segmentation step, as is done for top-down pose estimation?

Ref: https://github.com/facebookresearch/sapiens/issues/64,

may I ask to clarify if it means segmentation is more important than depth estimation, etc.? According to the demo, if the person is not segmented, the depth estimate is meaningless. Thanks and best,

https://github.com/facebookresearch/sapiens/issues/41

rawalkhirodkar commented 2 weeks ago

@GloryyrolG you can indeed replace the segmentation with any method of your choice. The depth visualization only needs a human vs. non-human pixel classification so that the predicted depth can be normalized to the 0-1 range.
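For readers following along, here is a minimal sketch of that normalization step, assuming the depth map and the human/non-human mask are NumPy arrays of the same shape. It is not the repository's code: `normalize_depth_in_mask` and `box_to_mask` are hypothetical names, and the min-max rescaling over human pixels is one plausible reading of "normalize predicted depth to 0-1". The second helper shows how a detector box could stand in for the mask, at the cost of including background pixels inside the box.

```python
import numpy as np


def normalize_depth_in_mask(depth, mask, eps=1e-8):
    """Min-max rescale depth to [0, 1] using only pixels where mask is True.

    Pixels outside the mask are set to NaN so they can be skipped when plotting.
    """
    depth = np.asarray(depth, dtype=np.float32)
    mask = np.asarray(mask, dtype=bool)
    out = np.full(depth.shape, np.nan, dtype=np.float32)
    if not mask.any():
        # No human pixels: there is nothing to normalize against.
        return out
    fg = depth[mask]
    d_min = float(fg.min())
    d_max = float(fg.max())
    # eps guards against division by zero when all human pixels share one depth.
    out[mask] = (fg - d_min) / max(d_max - d_min, eps)
    return out


def box_to_mask(shape, box):
    """Hypothetical helper: turn an (x0, y0, x1, y1) detector box into a binary mask."""
    x0, y0, x1, y1 = box
    mask = np.zeros(shape, dtype=bool)
    mask[y0:y1, x0:x1] = True
    return mask
```

With a box-derived mask, the minimum and maximum are taken over everything inside the box, so background pixels in the box can stretch the range and flatten the contrast on the person. That is one reason a per-pixel mask may still give a cleaner visualization.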

We now support a more robust binary (fg/bg) segmentor, which has already been added to the Hugging Face demos: https://huggingface.co/facebook/sapiens-seg-foreground-1b-torchscript. The repository will be updated soon with the details.

"may I ask to clarify if it means segmentation is more important than depth estimation, etc.?.." This is not true; the two tasks use independent networks and serve different purposes.