facebookresearch / mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
https://mmf.sh/

M4C remove features approach #1227

Closed emanuelevivoli closed 2 years ago

emanuelevivoli commented 2 years ago

❓ Questions and Help

Hello, I'm new to MMF and its implementation of M4C, so I have a question. The approach for not using some feature (for example, the PHOC features for OCR) is to set the remove_ocr_phoc property of config.ocr:

    # in mmf/models/m4c.py
    def _build_ocr_encoding(self):
        self.remove_ocr_fasttext = self.config.ocr.get("remove_ocr_fasttext", False)
        # I don't want to use the PHOC
        self.remove_ocr_phoc = self.config.ocr.get("remove_ocr_phoc", True)
        self.remove_ocr_frcn = self.config.ocr.get("remove_ocr_frcn", False)
        self.remove_ocr_semantics = self.config.ocr.get("remove_ocr_semantics", False)
        self.remove_ocr_bbox = self.config.ocr.get("remove_ocr_bbox", False)
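For context, MMF parses these YAML options into an OmegaConf config, so any flag that is absent from the config simply falls back to the default passed to .get(). A minimal standalone sketch of that lookup (not actual MMF code):

    from omegaconf import OmegaConf

    # a toy config with only one of the remove_* flags set
    config = OmegaConf.create({"ocr": {"remove_ocr_phoc": True}})

    print(config.ocr.get("remove_ocr_phoc", False))      # True  -> PHOC gets zeroed
    print(config.ocr.get("remove_ocr_fasttext", False))  # False -> FastText is kept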

but all the features are actually still computed in:

    # in mmf/models/m4c.py
    def _forward_ocr_encoding(self, sample_list, fwd_results):
        # ...

        # OCR PHOC feature (604-dim)
        ocr_phoc = sample_list.context_feature_1
        ocr_phoc = F.normalize(ocr_phoc, dim=-1)
        assert ocr_phoc.size(-1) == 604
        # ...

and eventually, instead of removing the features, you set them to zero in:

        # ...
        if self.remove_ocr_phoc:
            ocr_phoc = torch.zeros_like(ocr_phoc)
        # ...
        # here they are concatenated with all the other features
        ocr_feat = torch.cat(
            [ocr_fasttext, ocr_phoc, ocr_fc7, ocr_order_vectors], dim=-1
        )
        ocr_bbox = sample_list.ocr_bbox_coordinates
        if self.remove_ocr_semantics:
            ocr_feat = torch.zeros_like(ocr_feat)
        if self.remove_ocr_bbox:
            ocr_bbox = torch.zeros_like(ocr_bbox)
        # here they go through a Linear layer to get the desired output dimension
        ocr_mmt_in = self.ocr_feat_layer_norm(
            self.linear_ocr_feat_to_mmt_in(ocr_feat)
        ) + self.ocr_bbox_layer_norm(self.linear_ocr_bbox_to_mmt_in(ocr_bbox))
        # ...
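For what it's worth, torch.zeros_like returns a tensor with the same shape and dtype as its input, so everything downstream (the concatenation and the Linear layers) is unaffected. A small self-contained illustration, with made-up batch and token counts:

    import torch
    import torch.nn.functional as F

    # a fake normalized PHOC tensor: (batch=2, num_ocr_tokens=50, 604)
    ocr_phoc = F.normalize(torch.randn(2, 50, 604), dim=-1)

    # "removing" the feature keeps the shape; only the values become zero
    ocr_phoc = torch.zeros_like(ocr_phoc)
    print(ocr_phoc.shape)        # torch.Size([2, 50, 604])
    print(ocr_phoc.abs().sum())  # tensor(0.)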

So my question is: have I understood this correctly? If so, there are some "issues":

Is there a way you could suggest to change this behaviour? If there is, I could work on it.

Thanks a lot, Emanuele

emanuelevivoli commented 2 years ago

Hi @apsdehal, I'm tagging you since I saw that the m4c commits are mainly from you. Sorry for bothering you, but I'd like some insight into the m4c feature behaviour.

Thanks, Emanuele

apsdehal commented 2 years ago

Tagging @ronghanghu since he actually implemented the code.

ronghanghu commented 2 years ago

Hi @emanuelevivoli, yes, you understood correctly -- when we removed some features, we still kept their dimensions but set their values to zero. The main benefit of this compared to removing their dimensions is that the model size stays the same, which makes it easier for us to e.g. convert model checkpoints. The downside is that the nn.Linear layers can be larger than they actually need to be.
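To make the size point concrete, here is a rough sketch; the 300/604/2048/50 feature widths are taken from the default TextVQA M4C config (ocr.mmt_in_dim = 3002), so treat the exact numbers as an assumption:

    import torch.nn as nn

    # fasttext (300) + phoc (604) + fc7 (2048) + order vectors (50) = 3002
    ocr_in_dim = 300 + 604 + 2048 + 50
    linear_ocr_feat_to_mmt_in = nn.Linear(ocr_in_dim, 768)  # MMT hidden size 768

    # zeroing PHOC leaves ocr_in_dim (and every checkpoint tensor) unchanged;
    # truly dropping it would shrink the layer to 3002 - 604 = 2398 inputs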

But it should also work well if you change it to directly remove the feature dimensions.
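A minimal sketch of what that change could look like (a hypothetical helper, not existing MMF code; dimensions assumed as above):

    import torch
    import torch.nn as nn

    def build_ocr_feat(ocr_fasttext, ocr_phoc, ocr_fc7, ocr_order_vectors,
                       remove_ocr_phoc=False):
        # hypothetical: drop PHOC from the concatenation instead of zeroing it
        feats = [ocr_fasttext, ocr_fc7, ocr_order_vectors]
        if not remove_ocr_phoc:
            feats.insert(1, ocr_phoc)  # keep the original feature order
        return torch.cat(feats, dim=-1)

    feat = build_ocr_feat(
        torch.randn(2, 50, 300), torch.randn(2, 50, 604),
        torch.randn(2, 50, 2048), torch.randn(2, 50, 50),
        remove_ocr_phoc=True,
    )
    # the projection layer must be built from the actual concatenated width,
    # so its weights (and the checkpoint) change when a feature is dropped
    proj = nn.Linear(feat.size(-1), 768)
    print(proj(feat).shape)  # torch.Size([2, 50, 768])

One caveat, per the comment above: checkpoints trained with the zero-padding scheme would then no longer load into the smaller layers without conversion.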