SysCV / sam-pt

SAM-PT: Extending SAM to zero-shot video segmentation with point-based tracking.
https://arxiv.org/abs/2307.01197
Apache License 2.0

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 2.00 GiB total capacity; 1.69 GiB already allocated; 0 bytes free; 1.74 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. #14

Closed wang-shuaikang closed 3 months ago

wang-shuaikang commented 1 year ago

Error executing job with overrides: ["frames_path='D:\computer_vision\sam-pt\data\demo_data\demo'", 'query_points_path=null', 'longest_side_length=1024', 'frame_stride=1', 'max_frames=-1']
Traceback (most recent call last):
  File "E:\Anaconda3\envs\sam\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "E:\Anaconda3\envs\sam\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\computer_vision\sam-pt\demo\demo.py", line 372, in <module>
    main()
  File "E:\Anaconda3\envs\sam\lib\site-packages\hydra\main.py", line 94, in decorated_main
    _run_hydra(
  File "E:\Anaconda3\envs\sam\lib\site-packages\hydra\_internal\utils.py", line 394, in _run_hydra
    _run_app(
  File "E:\Anaconda3\envs\sam\lib\site-packages\hydra\_internal\utils.py", line 457, in _run_app
    run_and_report(
  File "E:\Anaconda3\envs\sam\lib\site-packages\hydra\_internal\utils.py", line 223, in run_and_report
    raise ex
  File "E:\Anaconda3\envs\sam\lib\site-packages\hydra\_internal\utils.py", line 220, in run_and_report
    return func()
  File "E:\Anaconda3\envs\sam\lib\site-packages\hydra\_internal\utils.py", line 458, in <lambda>
    lambda: hydra.run(
  File "E:\Anaconda3\envs\sam\lib\site-packages\hydra\_internal\hydra.py", line 132, in run
    _ = ret.return_value
  File "E:\Anaconda3\envs\sam\lib\site-packages\hydra\core\utils.py", line 260, in return_value
    raise self._return_value
  File "E:\Anaconda3\envs\sam\lib\site-packages\hydra\core\utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "D:\computer_vision\sam-pt\demo\demo.py", line 50, in main
    model = load_model(cfg, positive_points_per_mask, negative_points_per_mask)
  File "D:\computer_vision\sam-pt\demo\demo.py", line 111, in load_model
    return model.to("cuda" if torch.cuda.is_available() else "cpu").eval()
  File "E:\Anaconda3\envs\sam\lib\site-packages\torch\nn\modules\module.py", line 927, in to
    return self._apply(convert)
  File "E:\Anaconda3\envs\sam\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
    module._apply(fn)
  File "E:\Anaconda3\envs\sam\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
    module._apply(fn)
  File "E:\Anaconda3\envs\sam\lib\site-packages\torch\nn\modules\module.py", line 579, in _apply
    module._apply(fn)
  [Previous line repeated 3 more times]
  File "E:\Anaconda3\envs\sam\lib\site-packages\torch\nn\modules\module.py", line 602, in _apply
    param_applied = fn(param)
  File "E:\Anaconda3\envs\sam\lib\site-packages\torch\nn\modules\module.py", line 925, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 2.00 GiB total capacity; 1.69 GiB already allocated; 0 bytes free; 1.74 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
wandb: Waiting for W&B process to finish... (failed 1). Press Ctrl-C to abort syncing.
wandb: / 0.021 MB of 0.021 MB uploaded (0.000 MB deduped)
wandb: Run summary:
wandb: work_dir D:\computer_vision\s...
wandb:
wandb: View run debug_72_2023.08.12_15.26.10 at: https://wandb.ai/fangyuanguyue38/demo/runs/xodo3pvq
wandb: Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: .\wandb\run-20230812_152614-xodo3pvq\logs
Exception in thread ChkStopThr:
Traceback (most recent call last):
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "E:\Anaconda3\envs\sam\lib\site-packages\wandb\sdk\wandb_run.py", line 278, in check_stop_status
    self._loop_check_status(
  File "E:\Anaconda3\envs\sam\lib\site-packages\wandb\sdk\wandb_run.py", line 216, in _loop_check_status
    local_handle = request()
  File "E:\Anaconda3\envs\sam\lib\site-packages\wandb\sdk\interface\interface.py", line 787, in deliver_stop_status
Fatal Python error: could not acquire lock for <_io.BufferedWriter name=''> at interpreter shutdown, possibly due to daemon threads
Python runtime state: finalizing (tstate=000001B9DBB4C980)

Thread 0x0000524c (most recent call first):
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 302 in wait
  File "E:\Anaconda3\envs\sam\lib\site-packages\sentry_sdk\_queue.py", line 240 in get
  File "E:\Anaconda3\envs\sam\lib\site-packages\sentry_sdk\worker.py", line 127 in _target
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 870 in run
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 932 in _bootstrap_inner
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 890 in _bootstrap

Thread 0x000053b0 (most recent call first):
  File "E:\Anaconda3\envs\sam\lib\site-packages\sentry_sdk\sessions.py", line 117 in _thread
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 870 in run
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 932 in _bootstrap_inner
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 890 in _bootstrap

Thread 0x0000157c (most recent call first):
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 306 in wait
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 558 in wait
  File "E:\Anaconda3\envs\sam\lib\site-packages\wandb\sdk\lib\mailbox.py", line 126 in _wait
  File "E:\Anaconda3\envs\sam\lib\site-packages\wandb\sdk\lib\mailbox.py", line 130 in _get_and_clear
  File "E:\Anaconda3\envs\sam\lib\site-packages\wandb\sdk\lib\mailbox.py", line 283 in wait
  File "E:\Anaconda3\envs\sam\lib\site-packages\wandb\sdk\wandb_run.py", line 224 in _loop_check_status
  File "E:\Anaconda3\envs\sam\lib\site-packages\wandb\sdk\wandb_run.py", line 260 in check_network_status
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 870 in run
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 932 in _bootstrap_inner
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 890 in _bootstrap

Thread 0x00003f8c (most recent call first):
  File "E:\Anaconda3\envs\sam\lib\site-packages\wandb\sdk\lib\redirect.py", line 640 in write
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 1202 in invoke_excepthook
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 934 in _bootstrap_inner
  File "E:\Anaconda3\envs\sam\lib\threading.py", line 890 in _bootstrap

Current thread 0x00005b04 (most recent call first):

m43 commented 1 year ago

Hi, it looks like your GPU is running out of memory (OOM). This can happen when the video is too long or when the model and data need more memory than the GPU provides. Consider these options:

  1. Reduce the Number of Frames: You can modify the command line argument for the maximum number of frames, for example, by setting max_frames=10.
  2. Use a Lightweight Variant of SAM: Consider replacing the default ViT-Huge backbone in SAM with a ViT-Tiny by using MobileSAM or Light HQ-SAM, as instructed here.
  3. Consider GPU Memory Limitations: Your GPU has 2 GB of memory, which may be on the lower end for some of the tasks in our repo. I typically run tests with either 12 GB or 48 GB of GPU memory. If possible, you might benefit from using a GPU with more memory.
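
To judge which option applies, it can help to compare the numbers in the error message itself. The sketch below is only an illustration, not part of the repo: `parse_oom`, `looks_fragmented`, and the 2x threshold are hypothetical names and values chosen here, and the regular expressions assume the message wording shown in the traceback above. It extracts the reported sizes and checks whether reserved memory is far above allocated memory, which is the case the `max_split_size_mb` hint is about.

```python
import re

# One size as printed in the OOM message, e.g. "20.00 MiB" or "0 bytes".
_SIZE = r"([\d.]+) (bytes|KiB|MiB|GiB)"
_UNITS = {"bytes": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30}

# Assumed wording, taken from the message in this issue.
_FIELDS = {
    "requested": rf"Tried to allocate {_SIZE}",
    "total": rf"{_SIZE} total capacity",
    "allocated": rf"{_SIZE} already allocated",
    "free": rf"{_SIZE} free",
    "reserved": rf"{_SIZE} reserved in total",
}


def parse_oom(message: str) -> dict:
    """Return the sizes found in the message, converted to bytes."""
    sizes = {}
    for name, pattern in _FIELDS.items():
        match = re.search(pattern, message)
        if match:
            sizes[name] = float(match.group(1)) * _UNITS[match.group(2)]
    return sizes


def looks_fragmented(sizes: dict, ratio: float = 2.0) -> bool:
    """Heuristic (hypothetical threshold): reserved memory well above allocated."""
    return sizes["reserved"] > ratio * sizes["allocated"]


MESSAGE = (
    "CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 2.00 GiB total "
    "capacity; 1.69 GiB already allocated; 0 bytes free; 1.74 GiB reserved in "
    "total by PyTorch)"
)

sizes = parse_oom(MESSAGE)
print("fragmentation suspected:", looks_fragmented(sizes))
print(f"allocated / total: {sizes['allocated'] / sizes['total']:.2f}")
```

In this issue, reserved (1.74 GiB) and allocated (1.69 GiB) are nearly equal, so fragmentation is unlikely to be the cause. The traceback also shows the failure inside `model.to(...)` in `load_model`, meaning the model weights alone nearly fill the 2 GiB card, which points toward options 2 and 3.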

Hope this helps! Please let me know if you have further questions.