Hi,

Thank you so much for your awesome work! I had a couple of clarification questions:
Could you share some insight into how much computation/memory the windowed attention in Stratified Transformer / Swin3D actually saves (in terms of GPU memory in GB, feasible batch size, convergence time, etc.)?
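For context, here is my own back-of-envelope sketch of the asymptotic savings I would expect (this is an assumption on my part, not taken from either paper; it treats windows as non-overlapping and ignores shifted-window overhead and the sparse/point-cloud specifics):

```python
# Rough cost model: global self-attention over N points costs O(N^2 * d),
# while windowed attention with w points per window costs O(N * w * d).
# All names here are my own illustrative choices.

def attention_cost(n_points, dim, window=None):
    """Approximate FLOPs and number of stored attention-matrix entries."""
    w = window if window is not None else n_points
    n_windows = n_points // w if window is not None else 1
    flops = 2 * n_windows * (w ** 2) * dim  # QK^T plus attn @ V per window
    attn_entries = n_windows * (w ** 2)     # size of the attention maps kept in memory
    return flops, attn_entries

# e.g. 100k points, feature dim 48, window of 512 points
g_flops, g_mem = attention_cost(100_000, 48)
w_flops, w_mem = attention_cost(100_000, 48, window=512)
print(f"global/windowed FLOP ratio:   {g_flops / w_flops:.0f}x")
print(f"global/windowed memory ratio: {g_mem / w_mem:.0f}x")
```

So in theory the savings scale roughly with N / w, but I would love to hear the concrete numbers you observed in practice.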
Thank you!