CUDA out of memory - Githubissues

Pointcept / PointTransformerV3

[CVPR'24 Oral] Official repository of Point Transformer V3 (PTv3)

MIT License

CUDA out of memory #15

Closed Sylva-Lin closed 3 months ago

Sylva-Lin commented 3 months ago

Hi, thank you for your work. When running S3DIS, I run out of GPU memory. My training setup is 4× RTX 3090 (24 GB each), with batch_size set to 4 and the maximum number of points set to 25600. GPU memory is sufficient during training but not during validation, because each validation input is a full room. Which GPU would be sufficient?

Gofinge commented 3 months ago

We use A100s for all experiments except the inference time benchmark. From your description, the error occurred during evaluation, because we set grid_size to 0.01 and apply no crop during evaluation. You could add a SphereCrop to the evaluation as well; the test process after training will still provide a more precise result. (If the default test process also runs out of memory, you may need to enable a special crop strategy for our precise test. If that happens, I will tell you how to handle it.)

Sylva-Lin commented 3 months ago

Thanks, I added SphereCrop, and training and evaluation now run normally.
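For readers following along, here is a rough sketch of what adding a crop to the validation pipeline might look like. It is not the exact change made in this thread. It assumes the validation transforms are a list of dicts keyed by "type", as in Pointcept-style Python configs; the helper `add_val_crop`, the placeholder transform list, the `point_max` value, and the choice of mode="center" are all illustrative assumptions to adapt to your own config.

```python
# Hypothetical helper: insert a SphereCrop entry into a list of transform dicts.
# The dict keys (type, point_max, mode) follow Pointcept's config style, but
# the values here are placeholders, not settings confirmed in this thread.

def add_val_crop(transforms, point_max, mode="center"):
    """Return a copy of `transforms` with a SphereCrop dict inserted
    before the first ToTensor entry (or appended if there is none)."""
    crop = dict(type="SphereCrop", point_max=point_max, mode=mode)
    out = list(transforms)
    idx = next(
        (i for i, t in enumerate(out) if t.get("type") == "ToTensor"),
        len(out),
    )
    out.insert(idx, crop)
    return out


# Placeholder validation pipeline; a real S3DIS config has more entries
# and more arguments per entry.
val_transform = [
    dict(type="GridSample", grid_size=0.01),
    dict(type="ToTensor"),
]

# point_max is an assumed budget; tune it to fit a 24 GB GPU.
val_transform = add_val_crop(val_transform, point_max=100000)
print([t["type"] for t in val_transform])
```

Cropping during validation makes the validation metric approximate, which matches the maintainer's note that the test process after training gives the more precise result.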

Sylva-Lin commented 3 months ago

Hi, as you predicted, I am now hitting OOM again in the testing phase. How can I introduce the crop in the testing phase?

Gofinge commented 3 months ago

Set a SphereCrop in "data.test.test_cfg.crop" and set the mode of the "SphereCrop" to "all" (https://github.com/Pointcept/Pointcept/blob/main/pointcept/datasets/transform.py#L927)
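As a minimal sketch of the suggestion above: the nesting data -> test -> test_cfg -> crop and mode="all" come from the reply, while the `POINT_MAX` constant and its value are assumptions for illustration. In a real config, this entry would be merged into the existing `test_cfg` alongside its other keys rather than replacing them.

```python
# Sketch of a Pointcept-style config fragment (plain Python dicts).
# POINT_MAX is a hypothetical name and value: pick a per-crop point budget
# that fits your GPU memory.
POINT_MAX = 100000

data = dict(
    test=dict(
        test_cfg=dict(
            # mode="all" is the setting named in the reply above.
            crop=dict(type="SphereCrop", point_max=POINT_MAX, mode="all"),
        ),
    ),
)

print(data["test"]["test_cfg"]["crop"])
```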