droid-dataset / droid

Distributed Robot Interaction Dataset.
https://droid-dataset.github.io/droid/

Where to find the camera intrinsics? #4

Open Nimolty opened 5 months ago

Nimolty commented 5 months ago

Hello! Thanks for open-sourcing this great work! I am trying to analyze the raw data (27 TB), but I was wondering where the camera intrinsics are. Could you please provide some information?

Nimolty commented 5 months ago

Furthermore, I would like to ask where to find the raw depth images in the raw data (27 TB). Thank you very much!

Nimolty commented 5 months ago

Thanks, I have found the code in droid/scripts/convert/svo_to_mp4.py that transforms the original .svo files into RGB images, depth maps, and point clouds. However, I find that there are a lot of NaN values in the depth and point cloud data. Could you please give me some advice about this?

ashwin-balakrishna96 commented 5 months ago

Hi Nimolty:

Thanks for the questions, I created a draft PR here that should help answer most of your questions and show how you can work with the provided depth information in DROID. To summarize:

  1. The camera intrinsics can be accessed here.
  2. We do not store raw depth data in DROID, but you can use the ZED stereo depth model to get depth estimates if you want (this is the example shown in the PR above). In my experience, these estimates are not very good, but we have found that recent stereo depth models can produce high-quality depth given the camera intrinsics and baseline, so I would recommend leveraging those instead.
  3. The NaN values in the ZED depth estimates indicate that the depth could not be estimated correctly, e.g. due to occlusion, or that the value was rejected as an outlier (see the description here). You can see how I deal with these when visualizing things in the PR, but for better results I would recommend using a more sophisticated stereo depth model. Internally at TRI, we have found very good results using the stereo model proposed in this paper, but unfortunately we are not quite ready to open-source it yet.

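A minimal sketch of the masking step described in point 3, assuming the depth comes back as a NumPy float32 array (the `max_depth` cutoff is an arbitrary choice for this sketch, not a DROID or ZED constant):

```python
import numpy as np

def clean_depth(depth, max_depth=5.0):
    """Zero out NaN/inf/out-of-range depth values and return a validity mask.
    `max_depth` (meters) is an arbitrary cutoff for this sketch."""
    valid = np.isfinite(depth) & (depth > 0) & (depth < max_depth)
    return np.where(valid, depth, 0.0), valid

depth = np.array([[1.2, np.nan], [np.inf, 3.0]], dtype=np.float32)
cleaned, valid = clean_depth(depth)
# valid == [[True, False], [False, True]]
```

The same `valid` mask can be reused to drop the corresponding rows of a point cloud derived from that depth map.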
Jay-Ye commented 5 months ago

Thanks for the answer! Would you please consider including the camera intrinsics in the metadata*.json of each episode? It would be much more convenient to have direct access to both the extrinsics and intrinsics without having to download the whole 27 TB of data.

Nimolty commented 5 months ago

Thank you for your detailed advice! On the second point, you mentioned that high-quality depth estimation can be achieved with recent stereo depth models. Could you recommend some options? Are these models included in the ZED Python API, or are they only available as offline tools?

ashwin-balakrishna96 commented 5 months ago

You can try something like this. This is a lower quality open-source version of an internal model that's been working pretty well for us at TRI.

Zhangwenyao1 commented 3 months ago

Thanks for your great work! I want to know whether this link contains the extrinsic matrices of the three cameras described in the paper. If not, how can I get the extrinsic matrices?

kpertsch commented 3 months ago

The extrinsics information is published as part of the metadata in the droid_raw subset, see here: https://droid-dataset.github.io/droid/the-droid-dataset.html#accessing-raw-data (in the metadata-*.json files I believe)
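A small sketch of reading those fields from a metadata file, using the field names from the metadata example posted later in this thread (the glob pattern for the raw-data layout is an assumption):

```python
import glob
import json

def load_extrinsics(metadata_path):
    """Return the 6-DoF extrinsics list for each camera in a metadata JSON.
    Field names follow the metadata example shown later in this thread."""
    with open(metadata_path) as f:
        meta = json.load(f)
    return {cam: meta[f"{cam}_cam_extrinsics"]
            for cam in ("wrist", "ext1", "ext2")}

# Hypothetical usage over a local copy of the raw dataset:
# for path in glob.glob("droid_raw/**/metadata_*.json", recursive=True):
#     print(path, load_extrinsics(path)["ext1"])
```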

StarCycle commented 3 months ago

@kpertsch @ashwin-balakrishna96 This is an example of the metadata of an episode in the raw dataset. If I understand correctly, wrist_cam_extrinsics is only the initial extrinsic matrix of an episode (since the wrist camera is moving), right?

And thank you for proposing #5 so we can get depth and intrinsics from the SVO files of the raw dataset!

If possible, please add the depth, intrinsics and extrinsics to the RLDS dataset!

```json
{
  "uuid": "AUTOLab+5d05c5aa+2023-07-07-09h-42m-23s",
  "lab": "AUTOLab",
  "user": "Zehan Ma",
  "user_id": "5d05c5aa",
  "date": "2023-07-07",
  "timestamp": "2023-07-07-09h-42m-23s",
  "hdf5_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/trajectory.h5",
  "building": "BAIR",
  "scene_id": 5207831207,
  "success": true,
  "robot_serial": "fr3-295341-1326595",
  "r2d2_version": "1.3",
  "current_task": "Use cup to pour something granular (ex: nuts, rice, dried pasta, coffee beans)",
  "trajectory_length": 472,
  "wrist_cam_serial": "18026681",
  "ext1_cam_serial": "22008760",
  "ext2_cam_serial": "24400334",
  "wrist_cam_extrinsics": [0.26852859729525597, 0.12468921693797806, 0.38842643469874216, 2.602529290663223, -0.1020245067022938, 1.8903797737369048],
  "ext1_cam_extrinsics": [0.4039752945788883, 0.47318839256292644, 0.27170584157181743, -1.6827143529296786, 0.07550227077885108, -2.668292283962724],
  "ext2_cam_extrinsics": [0.2596757315060087, -0.36626259649963777, 0.24849304837972613, -1.742115402153725, -0.0012127426938948194, -0.7149867215760838],
  "wrist_svo_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/SVO/18026681.svo",
  "wrist_mp4_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/MP4/18026681.mp4",
  "ext1_svo_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/SVO/22008760.svo",
  "ext1_mp4_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/MP4/22008760.mp4",
  "ext2_svo_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/SVO/24400334.svo",
  "ext2_mp4_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/MP4/24400334.mp4",
  "left_mp4_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/MP4/22008760.mp4",
  "right_mp4_path": "success/2023-07-07/Fri_Jul__7_09:42:23_2023/recordings/MP4/24400334.mp4"
}
```

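For anyone consuming these 6-element extrinsics, here is a sketch that expands one into a 4x4 homogeneous transform. The `[x, y, z, roll, pitch, yaw]` layout and the Rz @ Ry @ Rx composition are assumptions, not something confirmed in this thread; verify the convention against the DROID documentation before relying on it:

```python
import numpy as np

def pose6_to_matrix(pose):
    """Convert a 6-vector [x, y, z, roll, pitch, yaw] (assumed convention)
    into a 4x4 homogeneous transform."""
    x, y, z, roll, pitch, yaw = pose
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    # Compose rotation as Rz(yaw) @ Ry(pitch) @ Rx(roll) -- an assumption.
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx
    T[:3, 3] = [x, y, z]
    return T

# ext1_cam_extrinsics from the metadata example above
ext1 = [0.4039752945788883, 0.47318839256292644, 0.27170584157181743,
        -1.6827143529296786, 0.07550227077885108, -2.668292283962724]
T = pose6_to_matrix(ext1)
```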
kpertsch commented 3 months ago

Re adding things to RLDS: I will try to add the extrinsics info when I recompile the RLDS data! Depth increases the dataset size a lot since it's not compressible and needs to be stored as float32 tensors (which also makes loading during training slower), so we intentionally didn't include it; instead, we try to provide utilities for people to compute it themselves if they would like to!
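To see why uncompressed float32 depth adds up, a back-of-envelope sketch (the 720x1280 resolution is an assumption; the frame count is the trajectory_length from the metadata example above):

```python
# Rough size of raw float32 depth for one episode (assumed resolution).
height, width = 720, 1280   # assumed per-camera resolution
bytes_per_px = 4            # float32
frames = 472                # trajectory_length from the metadata above
cams = 3                    # wrist + two external cameras

per_frame = height * width * bytes_per_px   # 3,686,400 bytes (~3.7 MB)
per_episode = per_frame * frames * cams     # ~5.2 GB uncompressed
print(per_frame, per_episode)
```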

StarCycle commented 3 months ago

I can understand! But would you also consider adding the intrinsics to the RLDS data? @kpertsch

I want to train a policy that can adapt to any intrinsics / extrinsics. My plan is to include such info in the policy input...

kpertsch commented 3 months ago

Got it -- yeah we can add intrinsic info too

StarCycle commented 3 months ago

Great thanks!

oym1994 commented 1 month ago

> Re adding things to RLDS: I will try to add the extrinsic info when I recompile the RLDS data! Depth increases dataset size a lot since it's not compressible and needs to be stored as float32 tensors (and thereby makes loading during training slower), so we intentionally didn't include it, but try to provide utilities for people to compute it themselves if they would like to!

Hi, have you finished this RLDS recompiling? I also need it!

StarCycle commented 1 month ago

Hi @kpertsch, one more question: the current extrinsics are for the third-person cameras. How can I get the extrinsics of the wrist camera?