Nice work! However, when I reproduce the results using the pretrained checkpoints, the numbers I get are lower than those reported in the paper. Do you know what might be causing this gap? Also, while looking at the image feature encoding, I noticed that the code uses a random sampling process. Could that affect the results? Thanks!