Closed by captainst 2 years ago
I will wait for a response :smile:
In my experience, I did not see any performance degradation at 224 resolution compared with 384.
Closing as this is a question about modeling generally, suitable for research discussion in the original research repository, and not a problem with the code in this repository.
Hi there,
I saw that the implementation uses a convolution to generate fixed-size hidden vectors from a variable-size input image. That's brilliant! However, I am wondering whether the fine-tuning result would be degraded when using a different input image size, say 224, rather than the official input size of 384 shown in your example.
Many thanks!
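
To make the question concrete, here is a rough sketch of how such a convolution turns images of different sizes into hidden vectors of the same width. It assumes the convolution uses a kernel size equal to its stride (one step per patch) with no padding. The patch size of 16 and hidden width of 768 are illustrative assumptions, not values taken from this repository, and `patch_grid` and `embedded_shape` are hypothetical helper names.

```python
def patch_grid(height, width, patch_size=16):
    """Output grid of a convolution with kernel_size == stride == patch_size
    and no padding (assumed setup; patch_size=16 is an illustrative value)."""
    rows = (height - patch_size) // patch_size + 1
    cols = (width - patch_size) // patch_size + 1
    return rows, cols


def embedded_shape(height, width, patch_size=16, hidden_dim=768):
    """(number of patch tokens, width of each hidden vector).

    hidden_dim is the number of convolution output channels; 768 is an
    assumed value. It does not depend on the image size.
    """
    rows, cols = patch_grid(height, width, patch_size)
    return rows * cols, hidden_dim


if __name__ == "__main__":
    for size in (224, 384):
        tokens, dim = embedded_shape(size, size)
        print(f"{size}x{size} image: {tokens} tokens, each of width {dim}")
```

Under these assumptions, each hidden vector keeps the same width at both resolutions, while the number of patch tokens changes with the input size. That difference in sequence length is what separates fine-tuning at 224 from fine-tuning at 384.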