danbider / lightning-pose

Accelerated pose estimation and tracking using semi-supervised convolutional networks.
MIT License

How to select resize_dim? #170

Closed Wulin-Tan closed 3 months ago

Wulin-Tan commented 3 months ago

Hi, LP team: based on your tutorial, resize_dim can be one of {64, 128, 256, 384, 512}. In my experiment I found that 256 performs similarly to 512, or even better on some frames (though not 100% perfect): some predicted keypoints that collapse together at 512 are correctly separated and close to the ground truth at 256. And 256 consumes less GPU memory. Can you give more details about how to choose resize_dim?

themattinthehatt commented 3 months ago

@Wulin-Tan I believe you have quite large images, correct? Something like 2000x2000? We haven't spent much time with datasets that have frames this large. It's great that you're finding good results with 256x256. The main thing to keep in mind is that the larger the resize dims, the larger the heatmaps, and the more memory is consumed during training/inference. If the heatmaps are too large, we've also seen cases where the model has trouble learning, which might be why you're not seeing better performance with 512x512. You could try 384x384 and see if that improves over 256x256. My advice is probably not too insightful: try some different values and see what works for you!
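The memory trade-off above can be sketched with a back-of-the-envelope calculation: heatmap size grows with the square of the resize dim, so doubling the resize dim roughly quadruples heatmap memory. The snippet below is only a rough illustration, not lightning-pose's actual memory model; the downsample factor, keypoint count, and batch size are assumed values for the sake of the example.

```python
def heatmap_memory_mb(resize_dim, num_keypoints=17, batch_size=16,
                      downsample=2, bytes_per_float=4):
    """Rough float32 memory (MB) for one batch of target heatmaps.

    Assumptions (illustrative only): heatmaps are the input resized to
    resize_dim x resize_dim then downsampled by `downsample`; one heatmap
    per keypoint per image.
    """
    h = w = resize_dim // downsample
    return batch_size * num_keypoints * h * w * bytes_per_float / 1e6

for dim in (256, 384, 512):
    print(f"resize_dim={dim}: ~{heatmap_memory_mb(dim):.1f} MB per batch")
```

Because the cost is quadratic in the resize dim, 512 needs about 4x the heatmap memory of 256 (and activations throughout the network scale similarly), which is why 256 is so much lighter on hardware.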

Wulin-Tan commented 3 months ago

@themattinthehatt Yes, my frame size is big, about 2000x2000, but the animal occupies only a small part of the frame (a mouse in a large open-field arena). I tried 256, 384, and 512 as you suggested, and found that 256 is less hardware-intensive with results similar to 384/512, so I settled on 256.