Closed by iishiishii 6 months ago
The process was OOM-killed while processing a large 3D bounding box in the validation set.
oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=docker-24ad385f8a96bf9d5e617e51c92441a093ce94f97652613df1397c7b5d7965ce.scope,mems_allowed=0,oom_memcg=/system.slice/docker-24ad385f8a96bf9d5e617e51c92441a093ce94f97652613df1397c7b5d7965ce.scope,task_memcg=/system.slice/docker-24ad385f8a96bf9d5e617e51c92441a093ce94f97652613df1397c7b5d7965ce.scope,task=python3,pid=87005,uid=999
[Wed Apr 3 15:07:55 2024] Memory cgroup out of memory: Killed process 87005 (python3) total-vm:13921856kB, anon-rss:8075896kB, file-rss:258672kB, shmem-rss:4kB, UID:999 pgtables:21360kB oom_score_adj:0
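As a rough cross-check, the resident-memory figures in the kill message can be added up and compared with the container limit. The sketch below assumes the kernel's `kB` values are KiB and that Docker's `-m 8G` means 8 GiB. The cgroup charge also covers memory that these RSS fields do not show, so treat the result as an approximation.

```python
import re

# Kill line copied from the kernel log above.
KILL_LINE = (
    "Memory cgroup out of memory: Killed process 87005 (python3) "
    "total-vm:13921856kB, anon-rss:8075896kB, file-rss:258672kB, "
    "shmem-rss:4kB, UID:999 pgtables:21360kB oom_score_adj:0"
)


def parse_kb_fields(line):
    """Return every `name:<number>kB` field in the line as {name: kB}."""
    return {name: int(kb) for name, kb in re.findall(r"([\w-]+):(\d+)kB", line)}


fields = parse_kb_fields(KILL_LINE)

# Resident memory = anonymous + file-backed + shared-memory pages.
rss_kb = fields["anon-rss"] + fields["file-rss"] + fields["shmem-rss"]
rss_gib = rss_kb / 1024**2

limit_gib = 8.0  # assumption: `-m 8G` is interpreted as 8 GiB
print(f"resident: {rss_gib:.2f} GiB of {limit_gib:.0f} GiB limit")
```

The resident total comes out just under 8 GiB, which is consistent with the process being killed for hitting the container limit rather than host memory running out.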
Steps to reproduce:
python CVPR24_LiteMedSamOnnx_infer.py -i ~/rdm/data/datasets/validation/imgs/ -o validation/segs
or using Docker:
docker container run -m 8G --name litemedsam --rm -v ~/rdm/data/datasets/validation/imgs/:/workspace/inputs/ -v $PWD/validation/litemedsam-seg/:/workspace/outputs/ litemedsam:latest /bin/bash -c "sh predict.sh"
@nanthan987 Let me know if it doesn't reproduce on your side.
PyTorch profiler
The memory peak is due to the show_mask function. After excluding it and other unused packages, memory usage is ~1 GB.
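One lightweight way to confirm which step causes a peak like this is to measure each step in isolation. The sketch below uses Python's standard `tracemalloc` module rather than the PyTorch profiler mentioned above. `build_overlay` and `skip_overlay` are hypothetical stand-ins, not the real `show_mask`. `tracemalloc` only sees allocations made through Python's allocators, so memory allocated natively by an inference runtime would not show up.

```python
import tracemalloc


def peak_memory_mib(func, *args, **kwargs):
    """Run func and return (result, peak traced Python memory in MiB)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / 2**20


def build_overlay(n_bytes):
    # Hypothetical stand-in for a visualization step that materializes
    # a large temporary buffer (for example, a mask overlay for a volume).
    overlay = bytearray(n_bytes)
    return len(overlay)


def skip_overlay(n_bytes):
    # Same interface, but without allocating the buffer.
    return n_bytes


if __name__ == "__main__":
    size = 32 * 2**20  # 32 MiB
    _, with_peak = peak_memory_mib(build_overlay, size)
    _, without_peak = peak_memory_mib(skip_overlay, size)
    print(f"with overlay: {with_peak:.1f} MiB, without: {without_peak:.1f} MiB")
```

Wrapping one pipeline stage at a time in `peak_memory_mib` narrows down which call dominates before reaching for a heavier profiler.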
Goal: Reduce model size and runtime while keeping the accuracy
TODO:
Export to ONNX: accuracy remains the same; runtime is slightly better for 2D images but a bit worse for 3D images.
Optimize the ONNX model by fusing some operators: online graph optimization doesn't improve runtime, but offline optimization does.
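For the offline optimization step, a minimal sketch with ONNX Runtime might look like the following. It is not the script used in this issue: the file names and the `optimized_path` helper are hypothetical, and the optimization level is an assumption. ONNX Runtime writes the optimized graph to `optimized_model_filepath` when the session is created, so the optimization cost is paid once instead of at every startup.

```python
from pathlib import Path


def optimized_path(model_path):
    """Hypothetical naming helper: 'model.onnx' -> 'model.opt.onnx'."""
    return str(Path(model_path).with_suffix(".opt.onnx"))


def optimize_offline(model_path, providers=("CPUExecutionProvider",)):
    """Run ONNX Runtime graph optimizations once and save the result."""
    # Imported here so the helpers above work without onnxruntime installed.
    import onnxruntime as ort

    out_path = optimized_path(model_path)
    opts = ort.SessionOptions()
    # Assumption: extended optimizations (includes node fusions).
    opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_EXTENDED
    # Creating the session serializes the optimized graph to this path.
    opts.optimized_model_filepath = out_path
    ort.InferenceSession(model_path, opts, providers=list(providers))
    return out_path


def load_optimized(opt_model_path, providers=("CPUExecutionProvider",)):
    """Load an already-optimized model, skipping online optimization."""
    import onnxruntime as ort

    opts = ort.SessionOptions()
    opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
    return ort.InferenceSession(opt_model_path, opts, providers=list(providers))
```

The optimized file should be generated with the same execution provider and on hardware comparable to where it will run, since some optimizations depend on both.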