mlfoundations / open_clip

An open source implementation of CLIP.

Is there any particular reason why the bias term is set to False in the projection layers? #807

Closed Akshay1-6180 closed 1 month ago

Akshay1-6180 commented 5 months ago

In the code in this file, https://github.com/mlfoundations/open_clip/blob/main/src/open_clip/hf_model.py, the bias for the projection head is set to False. I feel it shouldn't matter, and setting it to True would make it better. Is there any particular reason this was kept as False?


        d_model = getattr(self.config, arch_dict[self.config.model_type]["config_names"]["width"])
        if (d_model == output_dim) and (proj_type is None):  # do we always need a proj?
            self.proj = nn.Identity()
        elif proj_type == 'linear':
            self.proj = nn.Linear(d_model, output_dim, bias=False)
        elif proj_type == 'mlp':
            hidden_size = (d_model + output_dim) // 2
            self.proj = nn.Sequential(
                nn.Linear(d_model, hidden_size, bias=False),
                nn.GELU(),
                nn.Linear(hidden_size, output_dim, bias=False),
            )

rwightman commented 5 months ago

@Akshay1-6180 the original OpenAI CLIP model has no bias on the final vision and text tower projections, so this was to stick closer to that... but, no reason it wouldn't work, or possibly be better in some cases...
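
To make the difference concrete, here is a small sketch comparing a bias-free projection with a biased one. The 768 and 512 widths are illustrative assumptions, not values taken from this thread or from any specific CLIP config.

```python
import torch
from torch import nn

# Illustrative widths only (assumed, not from a specific CLIP config).
d_model, output_dim = 768, 512

no_bias = nn.Linear(d_model, output_dim, bias=False)
with_bias = nn.Linear(d_model, output_dim, bias=True)


def n_params(module):
    """Count all learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())


# The bias adds exactly one learnable offset per output dimension.
extra = n_params(with_bias) - n_params(no_bias)
print(extra)

# Without a bias the projection is purely linear: a zero input maps to zero.
zero_out = no_bias(torch.zeros(1, d_model))
print(bool(torch.all(zero_out == 0)))
```

The bias costs only `output_dim` extra parameters, so the choice is about matching the original model rather than about model size.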

In the timm vision adapter there's a config value for the bias https://github.com/mlfoundations/open_clip/blob/3ff1faf10b60be27252be7f6c84ce7c8c5e14ec8/src/open_clip/timm_model.py#L102-L108

A hf_proj_bias could be added... https://github.com/mlfoundations/open_clip/blob/3ff1faf10b60be27252be7f6c84ce7c8c5e14ec8/src/open_clip/model.py#L52-L83
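
A minimal sketch of what such an option might look like. The `build_proj` helper and its `proj_bias` flag are hypothetical names used only for illustration; they are not part of open_clip and simply mirror the branching in the snippet quoted above. The default stays `False` so the behavior matches the existing code.

```python
import torch
from torch import nn


def build_proj(d_model, output_dim, proj_type=None, proj_bias=False):
    """Hypothetical helper (not an open_clip API): same branches as the
    quoted snippet, with the bias exposed as a flag instead of hard-coded."""
    if d_model == output_dim and proj_type is None:
        return nn.Identity()
    if proj_type == 'linear':
        return nn.Linear(d_model, output_dim, bias=proj_bias)
    if proj_type == 'mlp':
        hidden_size = (d_model + output_dim) // 2
        return nn.Sequential(
            nn.Linear(d_model, hidden_size, bias=proj_bias),
            nn.GELU(),
            nn.Linear(hidden_size, output_dim, bias=proj_bias),
        )
    # Assumption: reject anything else explicitly; the quoted snippet
    # does not show what happens in this case.
    raise ValueError(f"unsupported proj_type: {proj_type!r}")


# Usage: default keeps the bias-free behavior, the flag opts in.
linear_proj = build_proj(768, 512, proj_type='linear')
mlp_proj = build_proj(768, 512, proj_type='mlp', proj_bias=True)
features = mlp_proj(torch.randn(2, 768))  # shape (2, 512)
```

Keeping the flag off by default would leave existing checkpoints loadable, since turning it on adds `bias` tensors that older state dicts do not contain.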

Akshay1-6180 commented 5 months ago

@rwightman Thanks for the detailed reply. I guess the OpenAI team may have found through empirical analysis that adding a bias, or making the MLP layer denser, makes no difference, and so went with the simplest option for interpretability (Occam's razor).