ziplab / LITv2

[NeurIPS 2022 Spotlight] This is the official PyTorch implementation of "Fast Vision Transformers with HiLo Attention"
Apache License 2.0

How much is the training memory cost for segmentation on the three backbones? #2

Closed: seabearlmx closed this issue 2 years ago

seabearlmx commented 2 years ago

Hi, thanks for releasing your code.

I have some questions: could you tell me the training memory cost for segmentation on each of the three backbones (LITv2-S, LITv2-M, and LITv2-B)? And what batch size do you use?

HubHop commented 2 years ago

Hi @seabearlmx, thanks for your interest.

We train each downstream model with a total batch size of 16 on 8 GPUs (2 samples per GPU). I'm not sure whether you are referring to semantic segmentation on ADE20K or instance segmentation on COCO, so both are listed below. The numbers were measured by checking memory usage with nvidia-smi during training, without gradient checkpointing.
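
For reference, here is a minimal sketch of how such a measurement could be reproduced. The `gpu_memory_used_mb` helper is illustrative and not part of this repo; it shells out to nvidia-smi, and the PyTorch allocator peak is also shown for comparison (it reads lower than nvidia-smi because it excludes CUDA context overhead):

```python
import subprocess

import torch

def gpu_memory_used_mb(gpu_index=0):
    """Query current memory usage (MB) of one GPU via nvidia-smi,
    mirroring how the numbers below were obtained."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used",
         "--format=csv,noheader,nounits", "-i", str(gpu_index)],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout.strip())

# PyTorch's allocator statistics give the peak tensor memory only,
# so this is a lower bound on what nvidia-smi reports.
peak_mb = torch.cuda.max_memory_allocated() / (1024 ** 2)
print(f"nvidia-smi: {gpu_memory_used_mb()} MB, allocator peak: {peak_mb:.0f} MB")
```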

Per-GPU memory cost for each setting:

| Backbone | ADE20K + Semantic FPN | COCO 2017 + Mask R-CNN |
| -------- | --------------------- | ---------------------- |
| LITv2-S  | 5,539MB               | 16,305MB               |
| LITv2-M  | 6,931MB               | 29,007MB               |
| LITv2-B  | 8,405MB               | 41,579MB               |

To train LITv2-B with Mask R-CNN on 8× 32GB V100 GPUs, we use gradient checkpointing, which brings the cost down to 9,987MB per GPU.
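
For completeness, gradient checkpointing in PyTorch generally looks like the sketch below. This is a generic `torch.utils.checkpoint` wrapper, not the exact code used in this repo, and `CheckpointedBlocks` is a hypothetical name:

```python
import torch
from torch.utils.checkpoint import checkpoint

class CheckpointedBlocks(torch.nn.Module):
    """Run a stack of blocks with gradient checkpointing: activations
    inside each block are recomputed during the backward pass instead
    of stored, trading extra compute for a smaller memory footprint."""

    def __init__(self, blocks):
        super().__init__()
        self.blocks = torch.nn.ModuleList(blocks)

    def forward(self, x):
        for block in self.blocks:
            if self.training:
                # Newer PyTorch versions prefer
                # checkpoint(block, x, use_reentrant=False).
                x = checkpoint(block, x)
            else:
                # No checkpointing needed at inference time.
                x = block(x)
        return x
```

The memory saving scales with how much activation memory the wrapped blocks would otherwise hold, which is why the effect is large for a heavy backbone like LITv2-B under Mask R-CNN.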

seabearlmx commented 2 years ago

Hi, thank you for your reply. This is useful to me.