vye16 / slahmr


CUDA out of memory #9

Closed nicolasugrinovic closed 1 year ago

nicolasugrinovic commented 1 year ago

Hi, I get a CUDA out-of-memory error when running your model. Specifically, it happens when running PHALP_plus with the following command on a GTX 1080 Ti GPU with 11GB of memory:

cd slahmr/third-party/PHALP_plus; CUDA_VISIBLE_DEVICES=0 python run_phalp.py --base_path slahmr/videos/demo/images/ --video_seq 022691_mpii_test --sample '' --storage_folder slahmr/videos/demo/slahmr/phalp_out --track_dataset posetrack-val --predict TPL --distance_type EQ_010 --encode_type 4c --detect_shots True --track_history 7 --past_lookback 1 --max_age_track 50 --n_init 5 --low_th_c 0.8 --alpha 0.1 --hungarian_th 100 --render_type HUMAN_FULL_FAST --render True --store_mask True --res 256 --render_up_scale 2 --verbose False --overwrite False --use_gt False --batch_id -1 --detection_type mask --start_frame -1

I am using the sample video 022691_mpii_test.mp4. The problem seems to be in the detector; the model name is GeneralizedRCNN. Is it normal for PHALP to require this much memory?

Best,

geopavlakos commented 1 year ago

Thanks for pointing that out. PHALP itself tends to be fairly lightweight; however, for the best performance on challenging videos we use a more accurate object detection model, and since we combine this step with 2D keypoint extraction, the memory requirements for preprocessing are higher (around 12GB).

We have made a few edits so that you can use a more lightweight detection model (check here). You only need to change the flag in this line from mask_vitdet to mask_regnety. This version should require only about 6.5GB of memory. For the best performance, though, we still recommend running with the default setting.
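For illustration only, a minimal sketch of what that switch could look like, assuming the detector choice is exposed through the same --detection_type flag used in the command above (this is an assumption; the authoritative place to edit is the line linked above, and the flag name there may differ):

# hypothetical sketch: default, more accurate detector (~12GB of GPU memory)
CUDA_VISIBLE_DEVICES=0 python run_phalp.py ... --detection_type mask_vitdet ...
# hypothetical sketch: lighter detector (~6.5GB of GPU memory)
CUDA_VISIBLE_DEVICES=0 python run_phalp.py ... --detection_type mask_regnety ...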

nicolasugrinovic commented 1 year ago

Thanks so much for the update and for making the change! I'll try the version with mask_regnety that you refer to.