FlyCole / Dream2Real

[ICRA 2024] Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models
https://www.robot-learning.uk/dream2real

RuntimeError: CUDA error: invalid resource handle #2

Open arenxxxe opened 2 weeks ago

arenxxxe commented 2 weeks ago

Hello Author.

  1. Very good paper! Why is there no code for actually operating the robot?
  2. How can I solve the following error? I have already tried setting os.environ["CUDA_VISIBLE_DEVICES"] = "0".

(dream2real) user@user:~/code1/chk23/mulitimodal/metaphor/Dream2Real$ python demo.py dataset/shopping method_out/shopping configs/shopping_demo.json "put the apple inside the blue and white bowl"
pybullet build time: Aug 26 2024 17:52:27
Running with config: configs/shopping_demo.json
Hyperparameters read from the model weights: C^k=64, C^v=512, C^h=64
Single object mode: False
XMem_inference initialized
Overriding torch_dtype=None with torch_dtype=torch.float16 due to requirements of bitsandbytes to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning.
bin /home/user/miniforge3/envs/dream2real/lib/python3.7/site-packages/bitsandbytes/libbitsandbytes_cuda113.so
Loading checkpoint shards: 100%|██████████| 2/2 [00:30<00:00, 15.09s/it]
Building scene model...
Loading RGBD data...
100%|██████████| 34/34 [00:04<00:00, 8.42it/s]
Generating dynamic masks for backgrond...
Loading cached dynamic masks...
100%|██████████| 34/34 [00:00<00:00, 180.49it/s]

23:31:14 SUCCESS Initialized CUDA 11.3. Active GPU is #2: NVIDIA GeForce RTX 4090 [89]
23:31:14 SUCCESS Detected auxiliary GPUs:
23:31:14 SUCCESS #0: NVIDIA GeForce RTX 4090 [89]
23:31:14 SUCCESS #1: NVIDIA GeForce RTX 4090 [89]
23:31:14 SUCCESS #3: NVIDIA GeForce RTX 4090 [89]
23:31:14 INFO Loading network snapshot from: method_out/shopping/fg_base.ingp
23:31:15 INFO GridEncoding: Nmin=16 b=2.20818 F=4 T=2^19 L=8
23:31:15 INFO Density model: 3--[HashGrid]-->32--[FullyFusedMLP(neurons=64,layers=3)]-->1
23:31:15 INFO Color model: 3--[Composite]-->16+16--[FullyFusedMLP(neurons=64,layers=4)]-->3
23:31:15 INFO total_encoding_params=12660928 total_network_params=10240
Using cached visual model for task background
Using cached fg model for movable object
23:31:16 SUCCESS Initialized CUDA 11.3. Active GPU is #2: NVIDIA GeForce RTX 4090 [89]
23:31:16 SUCCESS Detected auxiliary GPUs:
23:31:16 SUCCESS #0: NVIDIA GeForce RTX 4090 [89]
23:31:16 SUCCESS #1: NVIDIA GeForce RTX 4090 [89]
23:31:16 SUCCESS #3: NVIDIA GeForce RTX 4090 [89]
23:31:16 INFO Loading network snapshot from: method_out/shopping/bg_base.ingp
23:31:17 INFO GridEncoding: Nmin=16 b=2.20818 F=4 T=2^19 L=8
23:31:17 INFO Density model: 3--[HashGrid]-->32--[FullyFusedMLP(neurons=64,layers=3)]-->1
23:31:17 INFO Color model: 3--[Composite]-->16+16--[FullyFusedMLP(neurons=64,layers=4)]-->3
23:31:17 INFO total_encoding_params=12660928 total_network_params=10240

Traceback (most recent call last)

/home/user/code1/chk23/mulitimodal/metaphor/Dream2Real/demo.py:55 in

   52     imagination = ImaginationEngine(cfg)
   53     imagination.build_scene_model()
   54     task_model = imagination.interpret_user_instr(user_instr, goal_caption=goal_caption,
 ❱ 55     movable_best_pose = imagination.dream_best_pose(task_model)
   56     print(colored("Predicted pose for movable object:", "green"))
   57     print(movable_best_pose)
   58

/home/user/code1/chk23/mulitimodal/metaphor/Dream2Real/dream2real.py:379 in dream_best_pose

   376             else:
   377                 movable_geom = task_model.movable_obj.phys_model
   378             if not self.use_vis_pcds:
 ❱ 379                 vis_cost_volume(pose_scores, self.sample_res, pose_batch, bground_geoms)
   380                 if not tsdf_vis:
   381                     vis_multiverse(pose_scores, self.sample_res, pose_batch, bground_geo
   382                 best_ori = best_pose.view(4, 4)[:3, :3].cpu().numpy()

/home/user/code1/chk23/mulitimodal/metaphor/Dream2Real/vision_3d/geometry_utils.py:139 in vis_cost_volume

   136
   137 def vis_cost_volume(pose_scores, sample_res, pose_batch, bground_geoms, use_phys_mods=Tr
   138     # Normalise scores for visualisation.
 ❱ 139     pose_scores = pose_scores.clone().double()
   140     nonzero_idxs = torch.nonzero(pose_scores, as_tuple=True)
   141     zero_idxs = torch.nonzero(pose_scores == 0, as_tuple=True)
   142     if exp:

RuntimeError: CUDA error: invalid resource handle

ivan-kapelyukh commented 2 weeks ago

Hello, thank you for your interest in this paper!

Regarding Question 1: indeed, the output of this Dream2Real code is the goal pose for the object. You can then use this goal pose with your own robot system, e.g. as the goal state for motion planning, or for a goal-conditioned policy, or something else. In our robot demos, we use a low-level control library for our robot which is internal to Dyson, and which we do not currently have permission to release. Please accept our apologies for this. However, personally I recommend using cuRobo to compute the motion plan to reach the goal pose (https://curobo.org/). It is faster than the library we were using, and we are using it for some of our future work.
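
As a rough illustration of handing the goal pose to your own planner, here is a minimal sketch. It assumes the predicted pose is a 4x4 homogeneous transform (as the `best_pose.view(4, 4)[:3, :3]` line in the traceback above suggests) and that your planner accepts a position plus a unit quaternion. The function name `pose_to_position_quaternion`, the (w, x, y, z) ordering, and the example values are all hypothetical, so please check the convention your planner expects.

```python
import math


def pose_to_position_quaternion(pose):
    """Split a 4x4 homogeneous transform (row-major nested lists) into
    a position (x, y, z) and a unit quaternion (w, x, y, z).

    The (w, x, y, z) ordering is an assumption; some libraries use (x, y, z, w).
    """
    if len(pose) != 4 or any(len(row) != 4 for row in pose):
        raise ValueError("expected a 4x4 matrix")

    # Translation is the last column of the top three rows.
    position = (pose[0][3], pose[1][3], pose[2][3])

    # Rotation is the top-left 3x3 block.
    (m00, m01, m02), (m10, m11, m12), (m20, m21, m22) = (row[:3] for row in pose[:3])

    # Standard branch-on-largest-diagonal conversion, which avoids
    # dividing by a value close to zero.
    trace = m00 + m11 + m22
    if trace > 0.0:
        s = 0.5 / math.sqrt(trace + 1.0)
        w, x, y, z = 0.25 / s, (m21 - m12) * s, (m02 - m20) * s, (m10 - m01) * s
    elif m00 > m11 and m00 > m22:
        s = 2.0 * math.sqrt(1.0 + m00 - m11 - m22)
        w, x, y, z = (m21 - m12) / s, 0.25 * s, (m01 + m10) / s, (m02 + m20) / s
    elif m11 > m22:
        s = 2.0 * math.sqrt(1.0 + m11 - m00 - m22)
        w, x, y, z = (m02 - m20) / s, (m01 + m10) / s, 0.25 * s, (m12 + m21) / s
    else:
        s = 2.0 * math.sqrt(1.0 + m22 - m00 - m11)
        w, x, y, z = (m10 - m01) / s, (m02 + m20) / s, (m12 + m21) / s, 0.25 * s

    # Renormalise to absorb small numerical drift in the rotation block.
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    return position, (w / norm, x / norm, y / norm, z / norm)


# Made-up example: a 90 degree rotation about z with an arbitrary translation.
example_pose = [
    [0.0, -1.0, 0.0, 0.4],
    [1.0, 0.0, 0.0, -0.1],
    [0.0, 0.0, 1.0, 0.25],
    [0.0, 0.0, 0.0, 1.0],
]
position, quaternion = pose_to_position_quaternion(example_pose)
print("position:", position)
print("quaternion (w, x, y, z):", quaternion)
```

If the pose is a torch tensor or NumPy array, calling `.tolist()` on the 4x4 matrix first should be enough to use it with this sketch.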

Regarding Question 2: it seems that the crash happens when simply cloning a tensor and converting it to a double float type. Therefore, this is very likely to be a torch / CUDA installation issue, rather than an issue with the Dream2Real code itself. Although we are not able to help much with debugging a specific CUDA installation, one thing we can suggest is updating from CUDA 11.3 to at least 11.7. Hopefully this will help.
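
To check whether the problem reproduces outside Dream2Real, a small standalone script along these lines may help. This is only a sketch: the function name `cuda_report` is hypothetical, and it simply prints what PyTorch reports about the installation and then repeats the same `clone().double()` operation on a small GPU tensor. Comparing the compiled architectures with the device capability (the log above shows `[89]` for the RTX 4090) may also be informative.

```python
def cuda_report():
    """Collect basic facts about the local PyTorch / CUDA setup."""
    report = {"torch_installed": False}
    try:
        import torch
    except (ImportError, OSError):
        # torch is missing or its shared libraries failed to load.
        return report

    report.update(
        torch_installed=True,
        torch_version=torch.__version__,
        built_for_cuda=torch.version.cuda,  # CUDA version torch was built against
        cuda_available=torch.cuda.is_available(),
    )
    if not report["cuda_available"]:
        return report

    # Architectures this torch build ships kernels for, e.g. "sm_80".
    report["compiled_archs"] = torch.cuda.get_arch_list()
    report["devices"] = [
        (i, torch.cuda.get_device_name(i), torch.cuda.get_device_capability(i))
        for i in range(torch.cuda.device_count())
    ]

    # Repeat the operation that crashed, on a tiny tensor.
    try:
        x = torch.rand(8, device="cuda")
        y = x.clone().double()
        torch.cuda.synchronize()  # surface asynchronous CUDA errors here
        report["clone_double_ok"] = y.dtype == torch.float64
    except RuntimeError as err:
        report["clone_double_ok"] = False
        report["error"] = str(err)
    return report


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```

If this script fails in the same way, the issue is in the torch / CUDA installation rather than in Dream2Real. Also note that `CUDA_VISIBLE_DEVICES` is only read when CUDA is initialised, so setting it in the shell (for example `CUDA_VISIBLE_DEVICES=0 python demo.py ...`) is more reliable than setting `os.environ` inside the script. The log above still lists four GPUs with GPU #2 active, which suggests the setting may not have taken effect.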