Closed Akshay1-6180 closed 1 month ago
@Akshay1-6180 the original OpenAI CLIP model has no bias on the final vision and text tower projections, so this was done to stick closer to that... but there's no reason a bias wouldn't work, or possibly be better in some cases...
In the timm vision adapter there's a config value for the bias https://github.com/mlfoundations/open_clip/blob/3ff1faf10b60be27252be7f6c84ce7c8c5e14ec8/src/open_clip/timm_model.py#L102-L108
An `hf_proj_bias` config value could be added similarly... https://github.com/mlfoundations/open_clip/blob/3ff1faf10b60be27252be7f6c84ce7c8c5e14ec8/src/open_clip/model.py#L52-L83
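As a rough sketch of what such a flag might look like (names like `hf_proj_bias` and `TextProjectionHead` are illustrative here, not the actual open_clip API):

```python
import torch
import torch.nn as nn

class TextProjectionHead(nn.Module):
    """Minimal sketch of a projection head with an optional bias flag,
    mirroring how the timm vision adapter exposes `proj_bias`."""

    def __init__(self, width: int, embed_dim: int, proj_bias: bool = False):
        super().__init__()
        # OpenAI CLIP uses a bias-free linear projection on the final towers;
        # defaulting to False keeps that behavior while making it configurable.
        self.proj = nn.Linear(width, embed_dim, bias=proj_bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)

head = TextProjectionHead(width=512, embed_dim=256, proj_bias=True)
out = head(torch.randn(4, 512))
print(out.shape)  # torch.Size([4, 256])
```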
@rwightman Thanks for the detailed reply. I guess the OpenAI team, through empirical analysis, might have seen that there is no difference in adding a bias or making the MLP layer denser, and might have gone for the simplest option for interpretability (Occam's razor).
In the code given here https://github.com/mlfoundations/open_clip/blob/main/src/open_clip/hf_model.py, the bias for the projection head is set to False. I feel it shouldn't matter, and keeping it as True might even work better; is there any particular reason this was kept as False?