doruksonmez opened 3 weeks ago
@nv-jeff maybe Jeff could help you; I never used Isaac Replicator, sorry.
@TontonTremblay Thanks for your answer; I would really appreciate it if @nv-jeff could also chime in. In the meantime, would you have a comment on the ongoing training (loss/maps etc.)?
I think you would want a loss around 0.001, but I'm not sure; you would need to check other issue threads. Also, maybe your object has symmetries. Check the section on generating data with symmetries.
Ah, you are doing the YCB wood block; yeah, there are symmetries on that object. You will have to define them.
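For reference, a minimal sketch of what such a definition could look like, following the BOP-style model_info.json that the repo's symmetry section describes; the single 180° z-rotation below is only an illustrative guess for a rectangular block, so verify the axes and add every symmetry your object really has:

import json

model_info = {
    "symmetries_discrete": [
        # 4x4 transform, flattened row-major: 180-degree rotation about z
        [-1, 0, 0, 0,
          0, -1, 0, 0,
          0, 0, 1, 0,
          0, 0, 0, 1],
    ]
}

with open("model_info.json", "w") as f:
    json.dump(model_info, f, indent=2)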
I got you, but this object is not generated by NVISII, so where am I supposed to define model_info.json? I just have .pngs and .jsons in my dataset, and I don't see any option for it in train.py either. Is this supposed to be in Isaac Sim's data generation config? This is my dope_config.yml file within the Isaac Sim Replicator directory:
---
# Default rendering parameters
CONFIG:
  renderer: RayTracedLighting
  headless: false
  width: 512
  height: 512
# prim_type is determined by the usd file.
# To determine, open the usd file in Isaac Sim and see the prim path. If you load it in /World, the path will be /World/<prim_type>
OBJECTS_TO_GENERATE:
#- { part_name: 003_cracker_box, num: 1, prim_type: _03_cracker_box }
#- { part_name: 035_power_drill, num: 1, prim_type: _35_power_drill }
- { part_name: 036_wood_block, num: 1, prim_type: _36_wood_block }
# Maximum force component to apply to objects to keep them in motion
FORCE_RANGE: 30
# Camera Intrinsics
WIDTH: 512
HEIGHT: 512
F_X: 768
F_Y: 768
pixel_size: 0.04 # in mm
# Number of sphere lights added to the scene
NUM_LIGHTS: 6
# Minimum and maximum distances of objects away from the camera (along the optical axis)
# MIN_DISTANCE: 1.0
MIN_DISTANCE: 0.4
# MAX_DISTANCE: 2.0
MAX_DISTANCE: 1.4
# Rotation of camera rig with respect to world frame, expressed as XYZ euler angles
CAMERA_RIG_ROTATION:
- 0
- 0
- 0
# Rotation of camera with respect to camera rig, expressed as XYZ euler angles. Please note that in this example, we
# define poses with respect to the camera rig instead of the camera prim. By using the rig's frame as a surrogate for
# the camera's frame, we effectively change the coordinate system of the camera. When
# CAMERA_RIG_ROTATION = np.array([0, 0, 0]) and CAMERA_ROTATION = np.array([0, 0, 0]), this corresponds to the default
# Isaac-Sim camera coordinate system of -z out the face of the camera, +x to the right, and +y up. When
# CAMERA_RIG_ROTATION = np.array([0, 0, 0]) and CAMERA_ROTATION = np.array([180, 0, 0]), this corresponds to
# the YCB Video Dataset camera coordinate system of +z out the face of the camera, +x to the right, and +y down.
CAMERA_ROTATION:
- 180
- 0
- 0
# Minimum and maximum XYZ euler angles for the part being trained on to be rotated, with respect to the camera rig
MIN_ROTATION_RANGE:
- -180
- -90
- -180
# Minimum and maximum XYZ euler angles for the part being trained on to be rotated, with respect to the camera rig
MAX_ROTATION_RANGE:
- 180
- 90
- 180
# How close the center of the part being trained on is allowed to be to the edge of the screen
FRACTION_TO_SCREEN_EDGE: 0.9
# MESH and DOME datasets
SHAPE_SCALE:
- 0.05
- 0.05
- 0.05
SHAPE_MASS: 1
OBJECT_SCALE:
- 1
- 1
- 1
OBJECT_MASS: 1
TRAIN_PART_SCALE: # Scale for the training objects
- 1
- 1
- 1
# Asset paths
DISTRACTOR_ASSET_PATH: /Isaac/Props/YCB/Axis_Aligned/
TRAIN_ASSET_PATH: /Isaac/Props/YCB/Axis_Aligned/
DOME_TEXTURE_PATH: /NVIDIA/Assets/Skies/
# MESH dataset
NUM_MESH_SHAPES: 400
NUM_MESH_OBJECTS: 150
MESH_FRACTION_GLASS: 0.15
MESH_FILENAMES:
- 002_master_chef_can
- 004_sugar_box
- 005_tomato_soup_can
- 006_mustard_bottle
- 007_tuna_fish_can
- 008_pudding_box
- 009_gelatin_box
- 010_potted_meat_can
- 011_banana
- 019_pitcher_base
- 021_bleach_cleanser
- 024_bowl
- 025_mug
- 035_power_drill
- 036_wood_block
- 037_scissors
- 040_large_marker
- 051_large_clamp
- 052_extra_large_clamp
- 061_foam_brick
# DOME dataset
NUM_DOME_SHAPES: 30
NUM_DOME_OBJECTS: 20
DOME_FRACTION_GLASS: 0.2
DOME_TEXTURES:
- Clear/evening_road_01_4k
- Clear/kloppenheim_02_4k
- Clear/mealie_road_4k
- Clear/noon_grass_4k
- Clear/qwantani_4k
- Clear/signal_hill_sunrise_4k
- Clear/sunflowers_4k
- Clear/syferfontein_18d_clear_4k
- Clear/venice_sunset_4k
- Clear/white_cliff_top_4k
- Cloudy/abandoned_parking_4k
- Cloudy/champagne_castle_1_4k
- Cloudy/evening_road_01_4k
- Cloudy/kloofendal_48d_partly_cloudy_4k
- Cloudy/lakeside_4k
- Cloudy/sunflowers_4k
- Cloudy/table_mountain_1_4k
- Evening/evening_road_01_4k
- Indoor/adams_place_bridge_4k
- Indoor/autoshop_01_4k
- Indoor/bathroom_4k
- Indoor/carpentry_shop_01_4k
- Indoor/en_suite_4k
- Indoor/entrance_hall_4k
- Indoor/hospital_room_4k
- Indoor/hotel_room_4k
- Indoor/lebombo_4k
- Indoor/old_bus_depot_4k
- Indoor/small_empty_house_4k
- Indoor/studio_small_04_4k
- Indoor/surgery_4k
- Indoor/vulture_hide_4k
- Indoor/wooden_lounge_4k
- Night/kloppenheim_02_4k
- Night/moonlit_golf_4k
- Storm/approaching_storm_4k
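As a side note, for inference these intrinsics end up in a 3x3 camera matrix; a minimal sketch, assuming the principal point sits at the image center (an assumption, so check your generator's camera settings):

import numpy as np

F_X, F_Y = 768.0, 768.0   # focal lengths in pixels, from the config above
WIDTH, HEIGHT = 512, 512  # image size, from the config above

# cx, cy = image center is assumed here
K = np.array([
    [F_X, 0.0, WIDTH / 2.0],
    [0.0, F_Y, HEIGHT / 2.0],
    [0.0, 0.0, 1.0],
])
print(K)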
@TontonTremblay @nv-jeff, my training just completed. Even though the loss value looks okay, I cannot get any inference results with inference/inference.py. What would be the reason for this?
Training output:
TensorBoard:
Inference:
Can you share belief maps? I think it is show_beliefs or something like that.
The only thing I could find related to show_beliefs was the --showbelief flag in train2/inference.py. I tried to run it like:
python3 inference.py --data ../isaac_data/test/ --showbelief
but it throws:
Traceback (most recent call last):
File "/isaac_dope_retrain/Deep_Object_Pose/train2/inference.py", line 22, in <module>
from cuboid import Cuboid3d
ModuleNotFoundError: No module named 'cuboid'
So I have changed
import sys
sys.path.append("inference")
to
import sys
sys.path.append("../common")
but this time it throws:
I also set the following fields in train2/config_inference/config_pose.yaml:
weights: {
'_36_wood_block':"/isaac_dope_retrain/Deep_Object_Pose/train/output_wood_block/weights/net_epoch_99.pth"
}
architectures: {
'_36_wood_block':"dope"
}
dimensions: {
"_36_wood_block": [9.6024150848388672,19.130100250244141,5.824894905090332]
}
class_ids: {
'_36_wood_block': 37
}
draw_colors: {
"_36_wood_block": [217,12, 232] # magenta
}
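If it helps, the dimensions entry can be sanity-checked against the mesh itself. A sketch, assuming trimesh is installed, the mesh is modeled in meters, and the config expects centimeters (the units the YCB-based DOPE configs use); the path is only an example:

import trimesh

# force="mesh" flattens a multi-part scene into a single mesh
mesh = trimesh.load("036_wood_block/google_16k/textured.obj", force="mesh")
dims_cm = (mesh.extents * 100.0).tolist()  # bounding-box extents, m -> cm
print(dims_cm)  # compare with the dimensions entry above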
How can I get belief map images? Sorry, I could not find any explanation for this in the repo. Thank you for your support.
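Not official repo code, but a minimal sketch of dumping belief maps by hand, assuming the network's last-stage output has the 9 belief channels first (some DOPE variants instead return a (beliefs, affinities) tuple of per-stage lists, so adjust the indexing):

import torch
from PIL import Image

def save_belief_maps(net, img_tensor, prefix="belief"):
    # img_tensor: normalized input batch of shape (1, 3, H, W)
    with torch.no_grad():
        out = net(img_tensor)
    beliefs = out[-1][0, :9]  # assumption: last stage, beliefs first
    for i, b in enumerate(beliefs):
        # normalize each map to [0, 1] before writing it out
        b = (b - b.min()) / (b.max() - b.min() + 1e-8)
        arr = (b.cpu().numpy() * 255).astype("uint8")
        Image.fromarray(arr).save(f"{prefix}_{i}.png")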
I see this part in train/train.py, but I can't find any images related to it. It's like it doesn't save them somehow:
I have a view in TensorBoard for epoch 176. Would it be useful for you, @TontonTremblay?
Yeah, it looks like you have symmetries. See how for some points it does not know which one is which? You need to add them to your annotation. Read this: https://github.com/NVlabs/Deep_Object_Pose/blob/master/data_generation/readme.md#handling-objects-with-symmetries
@TontonTremblay thanks for your answer. I wanted to use Ketchup as a reference point to see if I could get an expected result before your answer. I have generated ~3900 images using BlenderProc. What do you think about my belief maps at around epoch 165:
Screenshot.from.2024-08-21.22-08-44.png: https://github.com/user-attachments/assets/0fb7a4b5-a2cf-4893-87a3-c8ebfbdd88fc
These look great to my eye.
I think I'm getting somewhere with this, but my inference results are not good. I was able to run train2/inference.py with small modifications to detector.py at line 496, which otherwise failed with an error (torch.min(belief) without a dim argument returns a 0-dim tensor, so indexing it with [0] raises an IndexError):
#belief -= float(torch.min(belief)[0].data.cpu().numpy())
#belief /= float(torch.max(belief)[0].data.cpu().numpy())
belief -= float(torch.min(belief).data.cpu().numpy())
belief /= float(torch.max(belief).data.cpu().numpy())
Here are a couple of belief maps from the actual train2/inference.py script, along with pose estimation results:
file_12.png: https://github.com/user-attachments/assets/e5ae3d08-3a49-4647-b4b6-6e0cd1a2e0b0
file_12_belief.png: https://github.com/user-attachments/assets/d77f5e31-dd8f-4c50-8be3-a76462b6a010
file_25.png: https://github.com/user-attachments/assets/961621ad-d6ac-4908-81af-8b07afbed3ef
file_25_belief.png: https://github.com/user-attachments/assets/a0a46dbe-640c-4659-989c-884fa2eb9ec4
In some of the images, there is no result at all:
file_28.png: https://github.com/user-attachments/assets/361ac12d-91c7-4295-8a8a-c76c92389817
file_28_belief.png: https://github.com/user-attachments/assets/eb73dc97-8022-4523-9d21-64e6dffab7ae
I've already decreased the threshold to 0.1, and still at most one result appears in the output. What might be the issue here?
Can you try on an image with a single ketchup bottle? Also, I think your cuboid keypoint order looks strange. I forgot whether you used Isaac Sim or NVISII.
I used BlenderProc for this Ketchup dataset. Here is the picture with one Ketchup bottle:
And the belief map:
And this one is without distractors:
000000.png: https://github.com/user-attachments/assets/8dfdb915-cd8b-4444-8a6d-0c6cf5521ed8
000000_belief.png: https://github.com/user-attachments/assets/b462f4ba-e810-4a49-aed8-09173ed2e53b
OK, I see the problem. The cuboid keypoint order in BlenderProc is different from NVISII. I am out this week. Ping @nv-jeff so he can look into that or fix one or the other. It might mean adding a flag for where the data comes from.
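To make that concrete (a hypothetical sketch, not a confirmed fix): one way is to rewrite the projected_cuboid order in the annotation JSONs. REORDER below is a placeholder identity permutation; the real BlenderProc-to-NVISII mapping would have to be worked out first, e.g. by drawing indexed keypoints:

import glob
import json

# placeholder identity permutation; extend it if your files also store a
# 9th centroid point
REORDER = [0, 1, 2, 3, 4, 5, 6, 7]

for path in glob.glob("dataset/*.json"):  # example path
    with open(path) as f:
        ann = json.load(f)
    for obj in ann.get("objects", []):
        pts = obj["projected_cuboid"]
        obj["projected_cuboid"] = [pts[i] for i in REORDER]
    with open(path, "w") as f:
        json.dump(ann, f, indent=2)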
@TontonTremblay Okay, many thanks for clarifying. @nv-jeff, could we please get your help with this issue? Thanks in advance.
Hi, first of all, sorry for the long post. I'm currently experimenting with training DOPE to understand the project fully along with its theoretical background. First, I generated synthetic data using BlenderProc, since NVISII is no longer supported by newer Python versions. However, I missed the point that I should run the command below 5 times, as mentioned here: https://github.com/NVlabs/Deep_Object_Pose/issues/361#issue-2322934867. Data generation took so much time that I ended up with only ~1600 images.
./run_blenderproc_datagen.py --nb_runs 5 --nb_frames 1000 --path_single_obj /isaac_dope_retrain/Popcorn/google_16k/textured.obj --nb_objects 6 --distractors_folder ../nvisii_data_gen/google_scanned_models --nb_distractors 10 --backgrounds_folder ../dome_hdri_haven/
Then I followed NVIDIA's official documentation on generating a synthetic dataset using Isaac Sim Replicator and generated ~7200 images for the 036_wood_block class, which is from the YCB dataset. Then I separated 202 images for testing and ran the debug tool on this split like the following:
python3 debug.py --data ../isaac_data/test/
Here is the debug result:
My first question is: do these debug results look correct for the object? (It looks like there is no connection to the 8th point.)
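One way to check this (and the keypoint ordering in general) is to overlay the ground-truth cuboid points with their indices; a sketch assuming DOPE-style JSONs whose objects entries hold projected_cuboid pixel coordinates, with example file names:

import json
from PIL import Image, ImageDraw

def draw_indexed_keypoints(img_path, json_path, out_path="kp_check.png"):
    img = Image.open(img_path).convert("RGB")
    draw = ImageDraw.Draw(img)
    with open(json_path) as f:
        ann = json.load(f)
    for obj in ann.get("objects", []):
        for i, (x, y) in enumerate(obj["projected_cuboid"]):
            draw.ellipse((x - 3, y - 3, x + 3, y + 3), fill=(255, 0, 255))
            draw.text((x + 5, y - 5), str(i), fill=(255, 255, 0))
    img.save(out_path)

# example usage; adjust the file names to your dataset
draw_indexed_keypoints("../isaac_data/test/000000.png",
                       "../isaac_data/test/000000.json")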
Then I started DOPE training with the train/train.py script, with a batch size of 16 and for 200 epochs:
python3 -m torch.distributed.launch --nproc_per_node=1 train.py --batchsize 16 -e 200 --data ../isaac_data/isaac_data/ --object _36_wood_block
Here is my training log, TensorBoard, and belief maps so far:
My second question is: do these loss values/TensorBoard curves/belief maps indicate a good training process? (I'm asking because training really takes time on an RTX 4070 Ti, and if something is wrong, I would like to fix it and restart the training.)
My third question is: I have noticed some differences between NVISII/BlenderProc data and Isaac Sim Replicator data, so would these differences affect my training or inference process? Are there any other steps that should be applied to Isaac data before or after training to get the expected results?
Sample data for cracker:
Wood block data from Isaac Sim Replicator:
Thanks in advance.