NVIDIA-AI-IOT / trt_pose

Real-time pose estimation accelerated with NVIDIA TensorRT
MIT License
972 stars 287 forks

How to increase Model accuracy #59

Open AmoghBharadwaj opened 4 years ago

AmoghBharadwaj commented 4 years ago

Hello Team,

I am trying to run human pose estimation on video data, and the model seems to miss detections, for example when the person stretches their arms or tilts slightly, and it often gives wrong detections. Could you please let me know how to improve the model's prediction accuracy? Thank you.

jaybdub commented 4 years ago

Hi AmoghBharadwaj,

Thanks for reaching out.

Which model are you running? Typically the densenet121 model yields better accuracy than the resnet18 model.

You could also experiment with adjusting the image resolution.

Aside from that, training or experimenting with model architectures is likely necessary.

Please let me know if this helps or you have any questions.

Best, John

AmoghBharadwaj commented 4 years ago

Hello @jaybdub ,

I am using the resnet model and will definitely try out densenet.

Also, could you please shed more light on training and experimenting with model architectures, and how to do it?

Thank you.

jaybdub commented 4 years ago

Sure,

Training a Model

  1. Download COCO

    cd tasks/human_pose
    source download_coco.sh
    unzip train2017.zip
    unzip val2017.zip
    unzip annotations_trainval2017.zip
  2. Pre-process the coco annotations. This adds the "Neck" keypoint (midpoint of shoulders).

    python3 preprocess_coco_person.py annotations/person_keypoints_train2017.json annotations/person_keypoints_train2017_modified.json
  3. Add your model to this list to register it https://github.com/NVIDIA-AI-IOT/trt_pose/blob/master/trt_pose/models/__init__.py#L7

  4. Create a model / training configuration. Easiest to start from an existing one.

    cp experiments/resnet18_baseline_att_224x224_A.json experiments/my_model.json
  5. Set the model and arguments corresponding to how you defined / registered your model (see https://github.com/NVIDIA-AI-IOT/trt_pose/blob/master/tasks/human_pose/experiments/resnet18_baseline_att_224x224_A.json#L48 for example)
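The "Neck" keypoint added in step 2 can be illustrated with a small sketch. This is a hypothetical helper, not the actual code in preprocess_coco_person.py, assuming COCO-style keypoints as (x, y, visibility) triples with the left and right shoulders at indices 5 and 6:

```python
def add_neck_keypoint(keypoints):
    """Append a synthetic 'neck' keypoint at the midpoint of the
    left (index 5) and right (index 6) shoulders. keypoints is a
    list of (x, y, visibility) triples in COCO order."""
    lx, ly, lv = keypoints[5]
    rx, ry, rv = keypoints[6]
    if lv > 0 and rv > 0:
        # both shoulders annotated: neck is their midpoint
        neck = ((lx + rx) / 2.0, (ly + ry) / 2.0, min(lv, rv))
    else:
        # either shoulder missing: mark the neck as not annotated
        neck = (0.0, 0.0, 0)
    return keypoints + [neck]
```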

When you're defining your model, the main requirements are:

  1. It takes an input image
  2. It returns cmap and paf feature maps. The spatial shape should match the target you set when training (https://github.com/NVIDIA-AI-IOT/trt_pose/blob/master/tasks/human_pose/experiments/resnet18_baseline_att_224x224_A.json#L7)
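The two requirements above can be sketched as a minimal PyTorch module. This is a hypothetical example, not one of the repo's actual models; it assumes the registration pattern in trt_pose/models/__init__.py, where a factory function taking the cmap and paf channel counts is added to the module's model list. The backbone here downsamples a 224x224 input by a factor of 4, matching a target_shape of [56, 56] in the training config:

```python
import torch
import torch.nn as nn

class MyPoseModel(nn.Module):
    def __init__(self, cmap_channels, paf_channels):
        super().__init__()
        # toy backbone: two stride-2 convs, so 224x224 -> 56x56 (stride 4)
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # 1x1 heads producing the part-confidence and part-affinity maps
        self.cmap_head = nn.Conv2d(128, cmap_channels, 1)
        self.paf_head = nn.Conv2d(128, paf_channels, 1)

    def forward(self, x):
        features = self.backbone(x)
        # requirement 2: return cmap and paf feature maps
        return self.cmap_head(features), self.paf_head(features)

def my_model(cmap_channels, paf_channels, **kwargs):
    """Factory function to register under trt_pose.models."""
    return MyPoseModel(cmap_channels, paf_channels)
```

The key constraint is the spatial shape of the two outputs: whatever stride your backbone has, the cmap/paf resolution must equal the target_shape in your experiment JSON.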

I'd start by trying densenet121. The models listed in the README are the product of a fair amount of experimentation, but there may well be improvements you could make. Those improvements may have more to do with other factors, such as the training script or data augmentation methods, than with the architecture itself (which would require modifying the actual code), but it could be the architecture as well.

Hope this helps, let me know if you run into any issues.

Best, John

dreamilk commented 3 years ago

> Training a Model … (quoting @jaybdub's full instructions from the comment above)

@jaybdub How can I run the training script after completing the above steps? I have configured the JSON file; how do I use it to train? Could you tell me more about it? Thank you very much!

tucachmo2202 commented 3 years ago

@dreamilk have you retrained successfully? I want to retrain on my own dataset and I am not clear on how to do it. Could you show me the way? Thanks very much!

dreamilk commented 3 years ago

> @dreamilk have you retrained successfully? I want to retrain with my dataset and I am not clear about how to do. Could you tell me the way? Thanks very much!

I have tried training, and some of the official JSON files can be used for it. If I remember correctly, train.py is the training script. It may raise errors that you need to fix yourself; for example, `from .models import models` may need to be changed to `from trt_pose.models import models`. The specifics depend on your situation. Good luck!

tucachmo2202 commented 3 years ago

@dreamilk thanks for your reply. I am building my own dataset (mostly camera images), and I'm wondering what accuracy you got, and whether this model is suitable for an action detection problem?

spacewalk01 commented 3 years ago

@jaybdub how can I train a larger model, e.g. 416x416 or 608x608? The current model input size is quite small and it cannot detect small objects.

chh7411898 commented 2 years ago

Can you share the model? Thanks.

tucachmo2202 commented 2 years ago

> @jaybdub how to train a larger model with 416x416 or 608x608. The current model size is quiet small and cannot detect small objects.

Hi, hope you still need this. You can retrain the model following the guide above. Change "image_shape" in the config file to 416 or 608, and set "target_shape" to "image_shape" / 4. For example, if image_shape is [416, 416], target_shape is [104, 104].
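Under that assumption, the relevant fields in the experiment JSON would look roughly like this (a sketch of only the resolution-related fields; the field names follow the existing experiment configs, and the rest of the file is unchanged):

```json
{
    "image_shape": [416, 416],
    "target_shape": [104, 104]
}
```

The factor of 4 comes from the backbone's output stride, so if you change the architecture's stride, target_shape must change accordingly.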