Some confusion after reading the code: in clip_encoder.py, lines 38-39:
if self.logit_scale > 4.605:
    self.logit_scale.data = torch.tensor(4.605).to(self.logit_scale.device)
What does 4.605 mean? How is this value computed?
It is the clamp threshold that CLIP uses during training: it prevents the learned scaling factor from exceeding 100, since exp(4.605) ≈ 100 (i.e. 4.605 ≈ ln(100)).
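For context, here is a minimal, self-contained sketch (not the repo's actual clip_encoder.py) of how a CLIP-style learnable logit scale is typically initialized and clamped; the class and parameter names are illustrative:

```python
import torch
import torch.nn as nn

class LogitScale(nn.Module):
    """Illustrative sketch of a CLIP-style learnable temperature (logit scale)."""

    def __init__(self):
        super().__init__()
        # CLIP initializes the parameter to ln(1/0.07) ≈ 2.659,
        # so the initial multiplier exp(logit_scale) ≈ 14.3.
        self.logit_scale = nn.Parameter(torch.tensor(2.659))

    def forward(self, image_features, text_features):
        # Clamp the parameter at ln(100) ≈ 4.605 so that the multiplier
        # exp(logit_scale) never exceeds 100, mirroring the snippet above.
        if self.logit_scale > 4.605:
            self.logit_scale.data = torch.tensor(4.605).to(self.logit_scale.device)
        scale = self.logit_scale.exp()  # at most exp(4.605) ≈ 100
        # Scaled cosine-similarity logits (features assumed L2-normalized).
        return scale * image_features @ text_features.t()
```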