xxxnell / how-do-vits-work

(ICLR 2022 Spotlight) Official PyTorch implementation of "How Do Vision Transformers Work?"
https://arxiv.org/abs/2202.06709
Apache License 2.0
798 stars, 77 forks

Running time for visualizing the loss landscape #16

Closed waitingcheung closed 2 years ago

waitingcheung commented 2 years ago

Thank you for the amazing work! The summary organizes the main points of the paper really well and helps facilitate future research.

I would like to visualize the loss landscape for my model, and I am trying the Python notebook from this repository on Colab. It appears to be a long process, so I would like a rough estimate of how long it will take for my model.

Could you kindly advise how long it took for you to run the notebook on Colab or on your machine?

xxxnell commented 2 years ago

Hi @waitingcheung, thank you for your support. By default, the loss landscape visualization evaluates the model on a grid of 441 (= 21×21) points. So the running time of the loss evaluation is roughly equal to the time needed to train the model for 441 epochs. This can take anywhere from tens of hours to days.
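For a rough estimate on your own setup, one option is to time a single evaluation pass over the dataset and multiply by the number of grid points. The snippet below is only a minimal sketch of that arithmetic: `grid_points`, `estimate_total_seconds`, and `time_one_pass` are hypothetical helper names, not functions from this repository, and the 5-minute pass is a made-up figure.

```python
import time


def grid_points(steps_per_axis: int = 21) -> int:
    """Number of points on a square grid (21 x 21 = 441 by default)."""
    return steps_per_axis * steps_per_axis


def estimate_total_seconds(seconds_per_pass: float, steps_per_axis: int = 21) -> float:
    """Total time if every grid point costs one pass over the dataset."""
    return grid_points(steps_per_axis) * seconds_per_pass


def time_one_pass(evaluate) -> float:
    """Time a single call to `evaluate`, a user-supplied zero-argument callable
    (hypothetical here) that runs the model once over the dataset."""
    start = time.perf_counter()
    evaluate()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Assumed figure for illustration only: one pass takes 5 minutes.
    seconds_per_pass = 300.0
    total = estimate_total_seconds(seconds_per_pass)
    print(f"{grid_points()} grid points, about {total / 3600:.1f} hours in total")
```

In practice you would pass your own evaluation loop to `time_one_pass` and feed the result into `estimate_total_seconds`. A coarser grid would cut the cost proportionally (11×11 gives 121 points), at the price of a lower-resolution plot. Whether and how the notebook lets you change the grid size is not stated in this thread.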