kennethwdk / LocLLM

Code for "LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model", CVPR 2024 Highlight
MIT License

Restricted rounds? #2

Closed Junhojuno closed 4 months ago

Junhojuno commented 4 months ago

Hi :) First of all, thank you for the nice work!

While reviewing the code, I found something curious that is not explained in the paper. In coco.py, the number of rounds appears to be restricted to five, but COCO has 17 keypoints (rounds). Is there any reason for using 5 rounds instead of all 17?

for idx in range(5):
    if idx >= len(kpt_des): break
    if self.conv_format == 'keypoint':
        q1 = "Where is the {} of this person in this image? Please provide its coordinates.".format(kpt_name[idx])

If I have misunderstood anything, please tell me! Thanks 😄

Junhojuno commented 4 months ago

I think it results from a VRAM limitation. Five keypoint descriptions seem to be the maximum prompt length that fits on an RTX 3090 (24 GB): if the descriptions are made much longer than the COCO ones, the training script fails with a CUDA out-of-memory error.
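To illustrate the trade-off, here is a minimal sketch of how capping the number of rounds bounds the prompt length. It is not the code from coco.py: the names COCO_KEYPOINTS, build_rounds, max_rounds, and prompt_chars are hypothetical, the random sampling of keypoints is an assumption, and character count is only a rough stand-in for token count (and therefore for memory use).

```python
import random

# Hypothetical names throughout; none of this is taken from coco.py.
COCO_KEYPOINTS = [
    "nose", "left eye", "right eye", "left ear", "right ear",
    "left shoulder", "right shoulder", "left elbow", "right elbow",
    "left wrist", "right wrist", "left hip", "right hip",
    "left knee", "right knee", "left ankle", "right ankle",
]

# Question template quoted in the issue above.
QUESTION = ("Where is the {} of this person in this image? "
            "Please provide its coordinates.")


def build_rounds(kpt_names, max_rounds=5, seed=None):
    """Return at most max_rounds questions, one per sampled keypoint."""
    rng = random.Random(seed)
    # Assumption: keypoints are sampled without replacement, so every
    # keypoint can still appear across many training iterations.
    chosen = rng.sample(kpt_names, k=min(max_rounds, len(kpt_names)))
    return [QUESTION.format(name) for name in chosen]


def prompt_chars(rounds):
    """Rough proxy for prompt length: total characters over all rounds."""
    return sum(len(question) for question in rounds)


capped = build_rounds(COCO_KEYPOINTS, max_rounds=5, seed=0)
full = build_rounds(COCO_KEYPOINTS, max_rounds=17, seed=0)
print(len(capped), prompt_chars(capped))
print(len(full), prompt_chars(full))
```

With all 17 rounds the conversation is more than three times as long as with 5, and that is before adding the per-keypoint descriptions, which is consistent with the out-of-memory behavior on a 24 GB card.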

ni123die commented 3 weeks ago

Hi! Can you run the command bash scripts/valid_h36m.sh?
