Hi @tyagi-iiitv, I just want to ask how many epochs you trained for to get the same performance as the paper, and what the final loss value of the model was. I ask because in my case the loss drops quickly to 0 within the first few epochs, and the model's prediction confidence at inference is quite low, around 20%. Also, do you have any suggestions for improving performance on a VLP-16, or maybe a labelled dataset with 16 layers?