Nimolty opened this issue 5 months ago
Furthermore, I would like to ask where to find the raw depth images in the raw data (27 TB). Thank you very much!
Thanks, I have found the code in droid/scripts/convert/svo_to_mp4.py, where I can transform the original .svo files into RGB images, depth, and point clouds. However, I find that there are a lot of NaN values in the depth and point cloud outputs. Could you please give me some advice about this?
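For what it's worth, those NaN/inf values are where stereo matching failed (occlusions, low texture, out-of-range points), so they are usually just masked out. A minimal sketch with NumPy; the `max_depth` cutoff is a placeholder, not a DROID parameter:

```python
import numpy as np

def clean_depth(depth, max_depth=10.0):
    """Zero out invalid depth values and return a validity mask.

    Stereo depth maps (e.g. from the ZED SDK) contain NaN/inf wherever
    matching failed; downstream code should only use the valid pixels.
    """
    depth = np.asarray(depth, dtype=np.float32)
    valid = np.isfinite(depth) & (depth > 0) & (depth < max_depth)
    return np.where(valid, depth, 0.0), valid

# Example: a 2x2 depth map with a NaN and an inf hole
depth = np.array([[1.2, np.nan], [np.inf, 2.5]], dtype=np.float32)
cleaned, valid = clean_depth(depth)
```

The same mask can be reused to filter the corresponding rows of the point cloud so both stay aligned.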
Hi Nimolty:
Thanks for the questions, I created a draft PR here that should help answer most of your questions and show how you can work with the provided depth information in DROID. To summarize:
Thanks for the answer! Would you please consider including the camera intrinsics in the metadata*.json of each episode? It would be much more convenient to have direct access to both the extrinsics and intrinsics without having to download the whole 27 TB of data.
Thank you for your detailed advice! On the second point, you mention that high-quality depth estimation can be achieved with recent stereo depth models. Could you recommend some options? Are these models included in the ZED Python API, or are they only available as offline data generators?
You can try something like this. It's a lower-quality open-source version of an internal model that has been working pretty well for us at TRI.
Thanks for your great work! I want to know whether this link contains the extrinsic matrices of the three cameras in the paper. If not, how can I get the extrinsic matrices?
The extrinsics information is published as part of the metadata in the droid_raw subset; see here: https://droid-dataset.github.io/droid/the-droid-dataset.html#accessing-raw-data (in the metadata-*.json files, I believe).
@kpertsch @ashwin-balakrishna96 This is an example of the metadata of an episode in the raw dataset. If I understand correctly, wrist_cam_extrinsics is only the initial extrinsic matrix of an episode (since the wrist camera is moving), right?
And thank you for proposing #5 so we can get depth and intrinsics from the SVO files of the raw dataset!
If possible, please add the depth, intrinsics and extrinsics to the RLDS dataset!
{"uuid": "AUTOLab+5d05c5aa+2023-07-07-09h-42m-23s",
"lab": "AUTOLab",
"user": "Zehan Ma",
"user_id": "5d05c5aa",
"date": "2023-07-07",
"timestamp": "2023-07-07-09h-42m-23s",
"hdf5_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/trajectory.h5",
"building": "BAIR",
"scene_id": 5207831207,
"success": true,
"robot_serial": "fr3-295341-1326595",
"r2d2_version": "1.3",
"current_task": "Use cup to pour something granular (ex: nuts, rice, dried pasta, coffee beans)",
"trajectory_length": 472,
"wrist_cam_serial": "18026681",
"ext1_cam_serial": "22008760",
"ext2_cam_serial": "24400334",
"wrist_cam_extrinsics": [0.26852859729525597, 0.12468921693797806, 0.38842643469874216, 2.602529290663223, -0.1020245067022938, 1.8903797737369048],
"ext1_cam_extrinsics": [0.4039752945788883, 0.47318839256292644, 0.27170584157181743, -1.6827143529296786, 0.07550227077885108, -2.668292283962724],
"ext2_cam_extrinsics": [0.2596757315060087, -0.36626259649963777, 0.24849304837972613, -1.742115402153725, -0.0012127426938948194, -0.7149867215760838],
"wrist_svo_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/SVO/18026681.svo",
"wrist_mp4_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/MP4/18026681.mp4",
"ext1_svo_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/SVO/22008760.svo",
"ext1_mp4_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/MP4/22008760.mp4",
"ext2_svo_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/SVO/24400334.svo",
"ext2_mp4_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/MP4/24400334.mp4",
"left_mp4_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/MP4/22008760.mp4",
"right_mp4_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/MP4/24400334.mp4"}
Re adding things to RLDS: I will try to add the extrinsics info when I recompile the RLDS data! Depth increases dataset size a lot since it's not compressible and needs to be stored as float32 tensors (which also makes loading during training slower), so we intentionally didn't include it, but we try to provide utilities for people to compute it themselves if they'd like!
I understand! But could you also add the intrinsics to the RLDS data? @kpertsch
I want to train a policy that can adapt to any intrinsics / extrinsics. My plan is to include such info in the policy input...
Got it -- yeah we can add intrinsic info too
Great thanks!
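Once the intrinsics are available, depth maps can be back-projected into camera-frame point clouds with the standard pinhole model, which is what a policy consuming intrinsics would implicitly rely on. A minimal sketch; the intrinsic values below are placeholders, not DROID calibration:

```python
import numpy as np

def depth_to_pointcloud(depth, fx, fy, cx, cy):
    """Back-project a depth map (meters) into camera-frame 3D points.

    Uses the pinhole model; invalid (non-finite or zero) depths are dropped,
    so the output has one row per valid pixel.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = np.isfinite(depth) & (depth > 0)
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=-1)

# Example: a flat 4x4 depth map with one NaN hole (placeholder intrinsics)
depth = np.full((4, 4), 2.0, dtype=np.float32)
depth[0, 0] = np.nan
pts = depth_to_pointcloud(depth, fx=500.0, fy=500.0, cx=2.0, cy=2.0)
```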
Hi, have you finished recompiling the RLDS data? I also need it!
Hi @kpertsch, one more question: the current extrinsics are for the third-person cameras. How can I get the extrinsics of the wrist camera?
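Since the wrist camera moves with the end-effector, one common approach (sketched below under the assumption that the per-step end-effector pose is available from trajectory.h5) is to compose that pose with the fixed hand-eye transform from the initial wrist calibration. All matrices here are hypothetical placeholders, not values from the dataset:

```python
import numpy as np

# T_base_ee: end-effector pose in the robot base frame at timestep t
#            (assumed to come from the robot state in trajectory.h5)
# T_ee_cam:  fixed camera-in-end-effector offset from hand-eye calibration
# Both are placeholder 4x4 homogeneous transforms for illustration.
T_base_ee = np.eye(4)
T_base_ee[:3, 3] = [0.3, 0.0, 0.5]
T_ee_cam = np.eye(4)
T_ee_cam[:3, 3] = [0.05, 0.0, 0.02]

# Wrist camera extrinsics at timestep t: compose the two transforms.
T_base_cam = T_base_ee @ T_ee_cam
```

With non-identity rotations the same composition applies; only the placeholder values change.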
Hello! Thanks for open-sourcing this great work! I am trying to analyze the raw data (27 TB), but I was wondering where the camera intrinsics are. Could you please provide some information?