Pointcept / PointTransformerV3

[CVPR'24 Oral] Official repository of Point Transformer V3 (PTv3)
MIT License

memory usage #37

Open lileai opened 2 months ago

lileai commented 2 months ago

Thanks for your work! It is indeed faster than many point cloud feature extraction algorithms, but when training on the SemanticKITTI dataset I found that even a batch_size of 2 used about 22 GB of GPU memory (without FlashAttention). I would like to know roughly how much memory your training runs use. Any suggestions on how to reduce it?

Gofinge commented 2 months ago

Hi, without FlashAttention, the memory cost increases with the patch size. We recommend a patch size of 128 or 256 when FlashAttention is disabled, which achieves a good balance between memory, speed, and performance.
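In practice this is just an edit to the backbone section of the training config. The sketch below is only illustrative and assumes the key names used in the released SemanticKITTI configs (`enc_patch_size`, `dec_patch_size`, `enable_flash`); other settings are omitted:

```python
# Hypothetical excerpt of a PTv3 config, assuming the key names
# enc_patch_size / dec_patch_size / enable_flash; values are illustrative.
backbone = dict(
    type="PT-v3m1",
    enable_flash=False,  # FlashAttention disabled (e.g. unsupported platform)
    # Without FlashAttention, attention memory grows with patch size,
    # so shrink the per-stage patch sizes from the default 1024 to 128 (or 256).
    enc_patch_size=(128, 128, 128, 128, 128),
    dec_patch_size=(128, 128, 128, 128),
    # ... remaining backbone settings left unchanged ...
)
```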

lileai commented 2 months ago

Thank you very much for your answer, it is very useful to me. Since I am not running this algorithm on Linux, I cannot easily set up FlashAttention. Could you share a benchmark for GPU memory usage? For example, how much memory is used with batch_size=2 and patch_size=256? That would help me a lot!