lstswb opened this issue 11 months ago

GPU: RTX 4090 (24 GB)
System: Ubuntu on WSL2
Model: sam_vit_h
Image size: [1024, 1024]
Parameter settings: model=sam, points_per_side=128, points_per_batch=64, pred_iou_thresh=0.86, stability_score_thresh=0.92, crop_n_layers=3, crop_n_points_downscale_factor=2, min_mask_region_area=100, process_batch_size=4

Issue: When I use SamAutomaticMaskGenerator, GPU memory usage climbs to 55 GB and the run fails with:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 16.63 GiB. GPU 0 has a total capacity of 23.99 GiB of which 0 bytes is free. Including non-PyTorch memory, this process has 17179869184.00 GiB memory in use. Of the allocated memory 34.86 GiB is allocated by PyTorch, and 5.63 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

However, when using the original SAM code this problem does not occur, and GPU memory does not exceed 24 GB.
@lstswb have u solved this?
Not yet
Does the code snippet from the example help? In particular, note that you can adjust process_batch_size for a smaller memory footprint, and note the use of sam_model_fast_registry.
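For reference, a minimal sketch along the lines of the repo example; the checkpoint path and the dummy image are placeholders, and process_batch_size=4 is just an illustrative value:

```python
import numpy as np
from segment_anything_fast import sam_model_fast_registry, SamAutomaticMaskGenerator

# Placeholder checkpoint path; point this at your local vit_h weights.
sam = sam_model_fast_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
sam = sam.to(device="cuda")

# Lower process_batch_size => smaller peak GPU memory, at some speed cost.
mask_generator = SamAutomaticMaskGenerator(sam, process_batch_size=4)

# Placeholder input: any HxWx3 uint8 RGB image array works here.
image = np.random.randint(0, 255, (1024, 1024, 3), dtype=np.uint8)
masks = mask_generator.generate(image)
```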
I tried to adjust batch_size, and the GPU memory footprint was reduced, but it still far exceeded that of the original code.
Yes, the batch size is larger, but it should be faster. The original code uses batch size 1; you can try setting it to 1.
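Concretely, reusing the setup from the sketch above:

```python
# Matches the original repo's batch size: lowest peak memory, slower overall.
mask_generator = SamAutomaticMaskGenerator(sam, process_batch_size=1)
```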
I tried setting batch_size=1, but I still get a GPU out-of-memory error.
Hm, I assume you're also using the GPU for the display manager? That will take up additional memory as well. Maybe the solution in https://github.com/pytorch-labs/segment-anything-fast/issues/97 will help.
Can you use your onboard GPU (if you have one) for the display manager and the GPU for the model only? Does it work with vit_b?
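Separately, the OOM trace above suggests trying PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True against fragmentation. A minimal way to try that, and to compare device-wide usage against PyTorch's own bookkeeping (so you can see how much the display manager and other processes cost), using standard torch.cuda calls:

```python
import os
# Allocator config must be set before the first CUDA allocation in the process.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch

# After a run (or right before the OOM point), compare device-wide usage
# with PyTorch's own accounting; the gap is non-PyTorch memory.
free, total = torch.cuda.mem_get_info()
print(f"device free/total: {free / 2**30:.2f} / {total / 2**30:.2f} GiB")
print(f"allocated by PyTorch: {torch.cuda.memory_allocated() / 2**30:.2f} GiB")
print(f"reserved by PyTorch: {torch.cuda.memory_reserved() / 2**30:.2f} GiB")
```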
vit_b can be used normally, and the display takes up only a small portion of GPU memory. vit_h likewise works fine with the original code.
Hm, can you try setting the environment variable SEGMENT_ANYTHING_FAST_USE_FLASH_4 to 0?
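In Python this would have to happen before the library is imported, assuming the flag is read when the attention kernels are selected:

```python
import os
# Set the flag before importing segment_anything_fast so it is
# visible when the library checks it.
os.environ["SEGMENT_ANYTHING_FAST_USE_FLASH_4"] = "0"

from segment_anything_fast import sam_model_fast_registry, SamAutomaticMaskGenerator
```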
SEGMENT_ANYTHING_FAST_USE_FLASH_4 has been set to 0, but the problem remains.
Hm, I'm not sure to be honest. It seems to work on other 4090s, but I think they're on Linux and not Windows.
Well, I'll try doing this on a native Linux system instead of Ubuntu on WSL2.