NVlabs / Deep_Object_Pose

Deep Object Pose Estimation (DOPE) – ROS inference (CoRL 2018)

Unable to attain detection on custom object #370

Open phsilvarepo opened 3 months ago

phsilvarepo commented 3 months ago

Hello there,

I have been working on a custom DOPE detector for a drill, but I am stuck with bad inference results and am not sure why. I generated the synthetic data using Isaac Sim, more specifically Synthetic Pose Data Generation. For training, I generated 20K images and trained for 60 epochs. At inference I keep getting "incomplete cuboid detection", even though the images I used for inference were taken from the training set. The config file is shown below. I would really appreciate some help. I could generate a new dataset with fewer flying distractors, since the problem could be due to scene complexity, but I am unsure where to go from here. Thanks.

Example of synthetic images:

000014 000048 000044 000045

Example of synthetic label:

{
    "camera_data": {},
    "objects": [
        {
            "class": "macho",
            "visibility": 0.005800008773803711,
            "location": [ 26.552387237548828, -18.970972061157227, 135.05587768554688 ],
            "quaternion_wxyz": [ 0.8684391379356384, -0.344438374042511, -0.3306385278701782, 0.1336183100938797 ],
            "projected_cuboid": [
                [ 405.87103271484375, 158.63209533691406 ],
                [ 393.4239501953125, 148.6797332763672 ],
                [ 421.2604064941406, 296.8967590332031 ],
                [ 435.88494873046875, 306.78857421875 ],
                [ 420.6507568359375, 147.55694580078125 ],
                [ 408.212646484375, 137.3334197998047 ],
                [ 439.6685791015625, 285.5958251953125 ],
                [ 454.2653503417969, 295.814208984375 ],
                [ 407.0226745605469, 148.0983428955078 ]
            ]
        }
    ]
}
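One quick sanity check on a label like this is to print, for each object, its visibility and the pixel extent of its projected cuboid. The sketch below is only an illustration: `summarize_label` is a hypothetical helper, not part of DOPE or Isaac Sim, and it assumes the field names shown in the label above. How `visibility` is defined depends on the data writer; a 0-to-1 fraction is assumed here.

```python
import json

# The label from this post, copied verbatim.
LABEL_JSON = """
{ "camera_data": {}, "objects": [ { "class": "macho",
  "visibility": 0.005800008773803711,
  "location": [ 26.552387237548828, -18.970972061157227, 135.05587768554688 ],
  "quaternion_wxyz": [ 0.8684391379356384, -0.344438374042511,
                       -0.3306385278701782, 0.1336183100938797 ],
  "projected_cuboid": [
    [ 405.87103271484375, 158.63209533691406 ],
    [ 393.4239501953125, 148.6797332763672 ],
    [ 421.2604064941406, 296.8967590332031 ],
    [ 435.88494873046875, 306.78857421875 ],
    [ 420.6507568359375, 147.55694580078125 ],
    [ 408.212646484375, 137.3334197998047 ],
    [ 439.6685791015625, 285.5958251953125 ],
    [ 454.2653503417969, 295.814208984375 ],
    [ 407.0226745605469, 148.0983428955078 ] ] } ] }
"""


def summarize_label(label):
    """Return one summary dict per object in a DOPE-style label.

    Hypothetical helper: assumes each object has "class", "visibility"
    and "projected_cuboid" (a list of [x, y] pixel coordinates).
    """
    rows = []
    for obj in label.get("objects", []):
        pts = obj["projected_cuboid"]
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        rows.append({
            "class": obj["class"],
            "visibility": obj["visibility"],
            "num_points": len(pts),          # 8 corners + centroid expected
            "width_px": max(xs) - min(xs),   # horizontal extent of the cuboid
            "height_px": max(ys) - min(ys),  # vertical extent of the cuboid
        })
    return rows


summary = summarize_label(json.loads(LABEL_JSON))
for row in summary:
    print(row)
```

Running the same helper over every label file in the dataset would show whether low visibility values such as the one above are the exception or the norm.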

Training output: Screenshot from 2024-06-25 12-07-52

INFERENCE

Screenshot from 2024-06-25 12-07-36(1)

config:

topic_camera: "/dope/rgb"
topic_camera_info: "/dope/camera_info"
topic_publishing: "dope"
input_is_rectified: True   # Whether the input image is rectified (strongly suggested!)
downscale_height: 400      # if the input image is larger than this, scale it down to this pixel height

# Comment any of these lines to prevent detection / pose estimation of that object
weights: {
    "macho":"/home/rics/catkin_ws/src/Deep_Object_Pose/train/output/weights/net_epoch_60.pth",
    # "gelatin":"package://dope/weights/gelatin_60.pth",
    # "meat":"package://dope/weights/meat_20.pth",
    # "mustard":"package://dope/weights/mustard_60.pth",
    #"soup":"package://dope/weights/soup_60.pth",
    #"sugar":"package://dope/weights/sugar_60.pth",
    # "bleach":"package://dope/weights/bleach_28_dr.pth"

    # NEW OBJECTS - HOPE
    # "AlphabetSoup":"package://dope/weights/AlphabetSoup.pth", 
    # "BBQSauce":"package://dope/weights/BBQSauce.pth", 
    # "Butter":"package://dope/weights/Butter.pth", 
    # "Cherries":"package://dope/weights/Cherries.pth", 
    # "ChocolatePudding":"package://dope/weights/ChocolatePudding.pth", 
    # "Cookies":"package://dope/weights/Cookies.pth", 
    # "Corn":"package://dope/weights/Corn.pth", 
    # "CreamCheese":"package://dope/weights/CreamCheese.pth", 
    # "GreenBeans":"package://dope/weights/GreenBeans.pth", 
    # "GranolaBars":"package://dope/weights/GranolaBars.pth", 
    # "Ketchup":"package://dope/weights/Ketchup.pth", 
    # "MacaroniAndCheese":"package://dope/weights/MacaroniAndCheese.pth", 
    # "Mayo":"package://dope/weights/Mayo.pth", 
    # "Milk":"package://dope/weights/Milk.pth", 
    # "Mushrooms":"package://dope/weights/Mushrooms.pth", 
    # "Mustard":"package://dope/weights/Mustard.pth", 
    # "Parmesan":"package://dope/weights/Parmesan.pth", 
    # "PeasAndCarrots":"package://dope/weights/PeasAndCarrots.pth",
    # "Peaches":"package://dope/weights/Peaches.pth",
    # "Pineapple":"package://dope/weights/Pineapple.pth",
    # "Popcorn":"package://dope/weights/Popcorn.pth",
    # "OrangeJuice":"package://dope/weights/OrangeJuice.pth", 
    # "Raisins":"package://dope/weights/Raisins.pth",
    # "SaladDressing":"package://dope/weights/SaladDressing.pth",
    # "Spaghetti":"package://dope/weights/Spaghetti.pth",
    # "TomatoSauce":"package://dope/weights/TomatoSauce.pth",
    # "Tuna":"package://dope/weights/Tuna.pth",
    # "Yogurt":"package://dope/weights/Yogurt.pth",

}

# Cuboid dimension in cm x,y,z
dimensions: {
    "macho": [1.2,1.2,11.4],
}

class_ids: {
    "macho": 1,
}

draw_colors: {
    "macho": [13, 255, 128],  # green
}

# optional: provide a transform that is applied to the pose returned by DOPE
model_transforms: {
#    "cracker": [[ 0,  0,  1,  0],
#                [ 0, -1,  0,  0],
#                [ 1,  0,  0,  0],
#                [ 0,  0,  0,  1]]
}

# optional: if you provide a mesh of the object here, a mesh marker will be
# published for visualization in RViz
# You can use the nvdu_ycb tool to download the meshes: https://github.com/NVIDIA/Dataset_Utilities#nvdu_ycb
meshes: {
#    "cracker": "file://path/to/Dataset_Utilities/nvdu/data/ycb/aligned_cm/003_cracker_box/google_16k/textured.obj",
#    "gelatin": "file://path/to/Dataset_Utilities/nvdu/data/ycb/aligned_cm/009_gelatin_box/google_16k/textured.obj",
#    "meat":    "file://path/to/Dataset_Utilities/nvdu/data/ycb/aligned_cm/010_potted_meat_can/google_16k/textured.obj",
#    "mustard": "file://path/to/Dataset_Utilities/nvdu/data/ycb/aligned_cm/006_mustard_bottle/google_16k/textured.obj",
#    "soup":    "file://path/to/Dataset_Utilities/nvdu/data/ycb/aligned_cm/005_tomato_soup_can/google_16k/textured.obj",
#    "sugar":   "file://path/to/Dataset_Utilities/nvdu/data/ycb/aligned_cm/004_sugar_box/google_16k/textured.obj",
#    "bleach":  "file://path/to/Dataset_Utilities/nvdu/data/ycb/aligned_cm/021_bleach_cleanser/google_16k/textured.obj",
}

# optional: If the specified meshes are not in meters, provide a scale here (e.g. if the mesh is in centimeters, scale should be 0.01). default scale: 1.0.
mesh_scales: {
    "macho": 0.01,
}

overlay_belief_images: True   # Whether to overlay the input image on the belief images published on /dope/belief_[obj_name]

# Config params for DOPE
thresh_angle: 0.5
thresh_map: 0.01
sigma: 3
thresh_points: 0.1

The inference results are empty with no detections.
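A config like the one above can also be checked for internal consistency before running inference. The sketch below is a minimal example under stated assumptions: `check_dope_config` is a hypothetical helper, the config is written out as a plain Python dict rather than loaded from the YAML file, and it assumes that every object listed under `weights` needs a matching entry in `dimensions`, `class_ids` and `draw_colors`.

```python
def check_dope_config(cfg):
    """Return a list of human-readable problems found in a DOPE-style config.

    Hypothetical helper: the required sections are an assumption based on
    the config shown in this thread, not a documented DOPE contract.
    """
    problems = []
    required_sections = ("dimensions", "class_ids", "draw_colors")

    # Every object with weights should appear in each per-object section.
    for name in cfg.get("weights", {}):
        for section in required_sections:
            if name not in cfg.get(section, {}):
                problems.append(f"{name!r} has weights but no entry in {section}")

    # Cuboid dimensions should be three positive numbers (x, y, z in cm).
    for name, dims in cfg.get("dimensions", {}).items():
        if len(dims) != 3 or any(d <= 0 for d in dims):
            problems.append(f"{name!r} has invalid dimensions {dims}")

    return problems


# The active (uncommented) per-object entries from the config above.
cfg = {
    "weights": {
        "macho": "/home/rics/catkin_ws/src/Deep_Object_Pose/train/output/weights/net_epoch_60.pth",
    },
    "dimensions": {"macho": [1.2, 1.2, 11.4]},
    "class_ids": {"macho": 1},
    "draw_colors": {"macho": [13, 255, 128]},
}

print(check_dope_config(cfg))
```

This only catches missing or malformed entries. It cannot tell whether the dimensions match the real object or whether the weights were trained correctly.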

intelligencestreamlabs commented 3 months ago

I have a comment about the scaling of the generated dataset. Can you share the command you used for data generation, and which inference script you used to get these results? There are two scripts, one in the inference folder and the other in the training folder.

Note from other issues: you should use the train2 folder to train.

phsilvarepo commented 3 months ago

Hello @intelligencestreamlabs,

To generate the dataset, as mentioned, I used this package. In the terminal I ran:

./python.sh standalone_examples/replicator/offline_pose_generation/offline_pose_generation.py --writer DOPE --num_mesh 0 --num_dome 20000

This package uses Isaac Sim 2023.1.1 to generate the data for DOPE training. For inference I ran inference/inference.py. I also noted from other issues that the current train/train.py script seems to be broken, so I have started training with an older version of the repository. I have not tried the train2 version; does that script work in the current repo? I can try it later, but for now I will continue with the older version.

Thanks for your time.

Edit: I have also tried simplifying the generated data to only 5 flying distractors, but this did not change the inference results.