Open Totorol opened 2 years ago
Thanks for your interest in our work.
Assuming a batch size of 32, we concatenate the pre-STN and post-STN images (so the effective batch size is 64) and train on both. This makes training a bit harder, since the recognition model has to read the clock both with and without the STN.
I don't think the results would be different without it; if they are, the difference should be very minor.
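To make the batch arithmetic concrete, here is a minimal sketch of the concatenation described above. It is not the repository's actual training loop: the helper name `concat_pre_post_stn`, the 64x64 image size, and the choice to reuse the same labels for the post-STN view are all assumptions made for illustration.

```python
import torch


def concat_pre_post_stn(img, hour, minute, img2, hour2, minute2):
    """Stack a pre-STN batch and a post-STN batch along the batch dimension.

    Hypothetical helper: it only mirrors the three torch.cat calls from the
    question, so the recognition model sees both views in one forward pass.
    """
    img_all = torch.cat([img, img2], dim=0)
    hour_all = torch.cat([hour, hour2], dim=0)
    minute_all = torch.cat([minute, minute2], dim=0)
    return img_all, hour_all, minute_all


# Dummy data standing in for one training batch (image size is an assumption).
batch = 32
img = torch.rand(batch, 3, 64, 64)        # pre-STN images
img2 = torch.rand(batch, 3, 64, 64)       # stand-in for the STN-warped images
hour = torch.randint(0, 12, (batch,))
minute = torch.randint(0, 60, (batch,))
# Assumption: the warped view shows the same clocks, so the labels are reused.
hour2, minute2 = hour.clone(), minute.clone()

img_all, hour_all, minute_all = concat_pre_post_stn(
    img, hour, minute, img2, hour2, minute2
)
print(img_all.shape)   # torch.Size([64, 3, 64, 64])
print(hour_all.shape)  # torch.Size([64])
```

The first 32 rows of `img_all` are the original images and the last 32 are the warped ones, so a single loss over the 64 samples trains the recognizer on both views.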
Thanks for such a great job! I am implementing train_fine and have a question about the following code:

    img = torch.cat([img, img2], 0)
    hour = torch.cat([hour, hour2], 0)
    minute = torch.cat([minute, minute2], 0)

Why is the data concatenated here?