Nested object detection

I am working on a problem where I have to detect an object and then detect sub-parts of that object, for example finding the face in an image and then finding the eyes in the face. When I train two separate models, one that detects only faces and another that detects eyes within the face, I get a detection confidence of 0.9+ in all cases. But when I try to detect both face and eyes with a single model, I cannot get a confidence score above 0.8 on various images. I believe the problem is due to the fact that all the major object detection algorithms treat each object as an independent entity, so the model effectively treats the eyes as occluding the face, which lowers the confidence. Is there a way to train a model that learns the inherent relations between the presence of different objects and their sub-objects?
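For concreteness, the two-model setup described above could be sketched roughly as follows. This is only an illustrative sketch, not Detectron code: `face_detector` and `eye_detector` are hypothetical callables standing in for the two trained models, each assumed to return `(x1, y1, x2, y2, score)` boxes in pixel coordinates, and the image is assumed to be a plain sequence of pixel rows. The `min_score` threshold is likewise an assumption.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

# (x1, y1, x2, y2, score) in pixel coordinates; an assumed box format.
Box = Tuple[int, int, int, int, float]
# Hypothetical detector interface: takes an image, returns a list of boxes.
Detector = Callable[[Sequence[Sequence[Any]]], List[Box]]


def detect_nested(
    image: Sequence[Sequence[Any]],
    face_detector: Detector,
    eye_detector: Detector,
    min_score: float = 0.5,
) -> List[Dict[str, Any]]:
    """Run the face model on the full image, then the eye model on each face crop."""
    height = len(image)
    width = len(image[0]) if height else 0
    results = []
    for fx1, fy1, fx2, fy2, face_score in face_detector(image):
        if face_score < min_score:
            continue
        # Clamp the face box to the image bounds before cropping.
        fx1, fy1 = max(0, fx1), max(0, fy1)
        fx2, fy2 = min(width, fx2), min(height, fy2)
        if fx2 <= fx1 or fy2 <= fy1:
            continue
        # Crop rows, then columns (with a NumPy array: image[fy1:fy2, fx1:fx2]).
        crop = [row[fx1:fx2] for row in image[fy1:fy2]]
        eyes = []
        for ex1, ey1, ex2, ey2, eye_score in eye_detector(crop):
            if eye_score < min_score:
                continue
            # Eye boxes are relative to the crop; shift them back to image coordinates.
            eyes.append((ex1 + fx1, ey1 + fy1, ex2 + fx1, ey2 + fy1, eye_score))
        results.append({"face": (fx1, fy1, fx2, fy2, face_score), "eyes": eyes})
    return results
```

Here each eye is tied to its parent face by construction, which is the relation a single model that treats face and eye as independent classes does not encode.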

System information
Operating system: Ubuntu 16
Compiler version: g++-5
CUDA version: 9.2
cuDNN version: 7.5
GPU models (for all devices if they are not all the same): 1 Nvidia GPU
python --version output: 3.6

facebookresearch/Detectron, issue #871