Closed houyaokun closed 10 months ago
The robotic arm's actions look strange in some tasks, and I suspect the problem may lie with GCBC. The arm has even moved outside the field of view.
Sometimes it moves at peculiar angles. However, it performs noticeably better on the task of pulling the handle to open the drawer.
During evaluation, the following messages appeared. Are these messages the cause of the problem?
2023-11-24 14:33:49.090058: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2023-11-24 14:33:49.113779: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-11-24 14:33:50.157969: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-11-24 14:33:50.173970: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-11-24 14:33:50.174105: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
pybullet build time: Nov 9 2023 10:51:31
Global seed set to 0
Initialized persistent compilation cache at /home/zs/.jax_compilation_cache
The config attributes {'pretrained': 'instruct-pix2pix'} were passed to FlaxUNet2DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
WARNING:absl:FlaxUNet2DConditionModel unused kwargs: {'pretrained': 'instruct-pix2pix'}
/home/zs/anaconda3/envs/susie/lib/python3.10/site-packages/diffusers/configuration_utils.py:217: FutureWarning: It is deprecated to pass a pretrained model name or path to from_config.
deprecate("config-passed-as-path", "1.0.0", deprecation_message, standard_warn=False)
WARNING:absl:Extra kwargs passed to CalvinDataset: {'num_devices': 1, 'prefetch_num_batches': 20}
2023-11-24 14:34:05.136133: I tensorflow/core/grappler/optimizers/data/replicate_on_split.cc:32] Running replicate on split optimization
2023-11-24 14:34:05.541715: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:693] Error in PredictCost() for the op: op: "CropAndResize" attr { key: "T" value { type: DT_FLOAT } } attr { key: "extrapolation_value" value { f: 0 } } attr { key: "method" value { s: "bilinear" } } inputs { dtype: DT_FLOAT shape { dim { size: -14 } dim { size: -15 } dim { size: -16 } dim { size: -17 } } } inputs { dtype: DT_FLOAT shape { dim { size: -2 } dim { size: 4 } } } inputs { dtype: DT_INT32 shape { dim { size: -2 } } } inputs { dtype: DT_INT32 shape { dim { size: 2 } } } device { type: "CPU" vendor: "GenuineIntel" model: "103" frequency: 2112 num_cores: 20 environment { key: "cpu_instruction_set" value: "AVX SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2" } environment { key: "eigen" value: "3.4.90" } l1_cache_size: 49152 l2_cache_size: 1310720 l3_cache_size: 26214400 memory_size: 268435456 } outputs { dtype: DT_FLOAT shape { dim { size: -2 } dim { size: -44 } dim { size: -45 } dim { size: -17 } } }
2023-11-24 14:34:05.541775: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:693] Error in PredictCost() for the op: op: "CropAndResize" attr { key: "T" value { type: DT_FLOAT } } attr { key: "extrapolation_value" value { f: 0 } } attr { key: "method" value { s: "bilinear" } } inputs { dtype: DT_FLOAT shape { dim { size: -18 } dim { size: -19 } dim { size: -20 } dim { size: -21 } } } inputs { dtype: DT_FLOAT shape { dim { size: -3 } dim { size: 4 } } } inputs { dtype: DT_INT32 shape { dim { size: -3 } } } inputs { dtype: DT_INT32 shape { dim { size: 2 } } } device { type: "CPU" vendor: "GenuineIntel" model: "103" frequency: 2112 num_cores: 20 environment { key: "cpu_instruction_set" value: "AVX SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2" } environment { key: "eigen" value: "3.4.90" } l1_cache_size: 49152 l2_cache_size: 1310720 l3_cache_size: 26214400 memory_size: 268435456 } outputs { dtype: DT_FLOAT shape { dim { size: -3 } dim { size: -46 } dim { size: -47 } dim { size: -21 } } }
2023-11-24 14:34:05.541811: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:693] Error in PredictCost() for the op: op: "CropAndResize" attr { key: "T" value { type: DT_FLOAT } } attr { key: "extrapolation_value" value { f: 0 } } attr { key: "method" value { s: "bilinear" } } inputs { dtype: DT_FLOAT shape { dim { size: -22 } dim { size: -23 } dim { size: -24 } dim { size: -25 } } } inputs { dtype: DT_FLOAT shape { dim { size: -4 } dim { size: 4 } } } inputs { dtype: DT_INT32 shape { dim { size: -4 } } } inputs { dtype: DT_INT32 shape { dim { size: 2 } } } device { type: "CPU" vendor: "GenuineIntel" model: "103" frequency: 2112 num_cores: 20 environment { key: "cpu_instruction_set" value: "AVX SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2" } environment { key: "eigen" value: "3.4.90" } l1_cache_size: 49152 l2_cache_size: 1310720 l3_cache_size: 26214400 memory_size: 268435456 } outputs { dtype: DT_FLOAT shape { dim { size: -4 } dim { size: -48 } dim { size: -49 } dim { size: -25 } } }
2023-11-24 14:34:05.542088: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:693] Error in PredictCost() for the op: op: "CropAndResize" attr { key: "T" value { type: DT_FLOAT } } attr { key: "extrapolation_value" value { f: 0 } } attr { key: "method" value { s: "bilinear" } } inputs { dtype: DT_FLOAT shape { dim { size: -32 } dim { size: -33 } dim { size: -34 } dim { size: -35 } } } inputs { dtype: DT_FLOAT shape { dim { size: -8 } dim { size: 4 } } } inputs { dtype: DT_INT32 shape { dim { size: -8 } } } inputs { dtype: DT_INT32 shape { dim { size: 2 } } } device { type: "CPU" vendor: "GenuineIntel" model: "103" frequency: 2112 num_cores: 20 environment { key: "cpu_instruction_set" value: "AVX SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2" } environment { key: "eigen" value: "3.4.90" } l1_cache_size: 49152 l2_cache_size: 1310720 l3_cache_size: 26214400 memory_size: 268435456 } outputs { dtype: DT_FLOAT shape { dim { size: -8 } dim { size: -53 } dim { size: -54 } dim { size: -35 } } }
2023-11-24 14:34:05.542120: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:693] Error in PredictCost() for the op: op: "CropAndResize" attr { key: "T" value { type: DT_FLOAT } } attr { key: "extrapolation_value" value { f: 0 } } attr { key: "method" value { s: "bilinear" } } inputs { dtype: DT_FLOAT shape { dim { size: -36 } dim { size: -37 } dim { size: -38 } dim { size: -39 } } } inputs { dtype: DT_FLOAT shape { dim { size: -10 } dim { size: 4 } } } inputs { dtype: DT_INT32 shape { dim { size: -10 } } } inputs { dtype: DT_INT32 shape { dim { size: 2 } } } device { type: "CPU" vendor: "GenuineIntel" model: "103" frequency: 2112 num_cores: 20 environment { key: "cpu_instruction_set" value: "AVX SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2" } environment { key: "eigen" value: "3.4.90" } l1_cache_size: 49152 l2_cache_size: 1310720 l3_cache_size: 26214400 memory_size: 268435456 } outputs { dtype: DT_FLOAT shape { dim { size: -10 } dim { size: -55 } dim { size: -56 } dim { size: -39 } } }
2023-11-24 14:34:05.542142: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:693] Error in PredictCost() for the op: op: "CropAndResize" attr { key: "T" value { type: DT_FLOAT } } attr { key: "extrapolation_value" value { f: 0 } } attr { key: "method" value { s: "bilinear" } } inputs { dtype: DT_FLOAT shape { dim { size: -40 } dim { size: -41 } dim { size: -42 } dim { size: -43 } } } inputs { dtype: DT_FLOAT shape { dim { size: -12 } dim { size: 4 } } } inputs { dtype: DT_INT32 shape { dim { size: -12 } } } inputs { dtype: DT_INT32 shape { dim { size: 2 } } } device { type: "CPU" vendor: "GenuineIntel" model: "103" frequency: 2112 num_cores: 20 environment { key: "cpu_instruction_set" value: "AVX SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2" } environment { key: "eigen" value: "3.4.90" } l1_cache_size: 49152 l2_cache_size: 1310720 l3_cache_size: 26214400 memory_size: 268435456 } outputs { dtype: DT_FLOAT shape { dim { size: -12 } dim { size: -57 } dim { size: -58 } dim { size: -43 } } }
Loading checkpoint...
Checkpoint successfully loaded
argv[0]=
startThreads creating 1 threads.
starting thread 0
started thread 0
argc=3
argv[0] = --unused
argv[1] =
argv[2] = --start_demo_name=Physics Server
ExampleBrowserThreadFunc started
X11 functions dynamically loaded using dlopen/dlsym OK!
X11 functions dynamically loaded using dlopen/dlsym OK!
Creating context
Created GL 3.3 context
Direct GLX rendering context obtained
Making context current
GL_VENDOR=Intel
GL_RENDERER=Mesa Intel(R) Graphics (ADL-S GT1)
GL_VERSION=4.6 (Core Profile) Mesa 21.2.6
GL_SHADING_LANGUAGE_VERSION=4.60
pthread_getconcurrency()=0
Version = 4.6 (Core Profile) Mesa 21.2.6
Vendor = Intel
Renderer = Mesa Intel(R) Graphics (ADL-S GT1)
b3Printf: Selected demo: Physics Server
startThreads creating 1 threads.
starting thread 0
started thread 0
MotionThreadFunc thread started
ven = Intel
Workaround for some crash in the Intel OpenGL driver on Linux/Ubuntu
ven = Intel
Workaround for some crash in the Intel OpenGL driver on Linux/Ubuntu
/home/zs/anaconda3/envs/susie/lib/python3.10/site-packages/urdfpy/urdf.py:2169: RuntimeWarning: invalid value encountered in divide
value = value / np.linalg.norm(value)
logging to /tmp/evaluation
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 360/360 [05:59<00:00, 1.00it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 360/360 [05:53<00:00, 1.02it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 360/360 [05:50<00:00, 1.03it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 360/360 [05:55<00:00, 1.01it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 360/360 [05:56<00:00, 1.01it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 360/360 [05:56<00:00, 1.01it/s]
13%|██████████████▌ | 47/360 [00:57<06:25, 1.23s/it]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 360/360 [05:56<00:00, 1.01it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 360/360 [05:55<00:00, 1.01it/s]
22%|████████████████████████▌ | 79/360 [01:19<04:41, 1.00s/it]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 360/360 [05:57<00:00, 1.01it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 360/360 [05:56<00:00, 1.01it/s]
25%|████████████████████████████▎ | 91/360 [01:37<04:49, 1.08s/it]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 360/360 [05:57<00:00, 1.01it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 360/360 [05:57<00:00, 1.01it/s]
23%|██████████████████████████▏ | 84/360 [01:37<05:19, 1.16s/it]
54%|███████████████████████████████████████████████████████████▌ | 193/360 [03:17<02:51, 1.03s/it]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 360/360 [05:55<00:00, 1.01it/s]
23%|██████████████████████████▏ | 84/360 [01:36<05:16, 1.15s/it]
32%|████████████████████████████████████ | 117/360 [01:57<04:03, 1.00s/it]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 360/360 [05:56<00:00, 1.01it/s]
1/5 : 33.3% | 2/5 : 13.3% | 3/5 : 0.0% | 4/5 : 0.0% | 5/5 : 0.0% ||: 100%|█████████████████████████████████████████| 15/15 [1:41:24<00:00, 405.63s/it]
Results for Epoch 0:
Average successful sequence length: 0.4666666666666667
Success rates for i instructions in a row:
1: 33.3%
2: 13.3%
3: 0.0%
4: 0.0%
5: 0.0%
turn_on_led: 2 / 2 | SR: 100.0%
open_drawer: 4 / 4 | SR: 100.0%
turn_on_lightbulb: 1 / 1 | SR: 100.0%
push_blue_block_right: 0 / 1 | SR: 0.0%
rotate_blue_block_right: 0 / 1 | SR: 0.0%
lift_blue_block_slider: 0 / 1 | SR: 0.0%
lift_blue_block_table: 0 / 1 | SR: 0.0%
push_pink_block_left: 0 / 2 | SR: 0.0%
move_slider_left: 0 / 3 | SR: 0.0%
push_blue_block_left: 0 / 2 | SR: 0.0%
lift_red_block_slider: 0 / 1 | SR: 0.0%
push_red_block_left: 0 / 1 | SR: 0.0%
rotate_red_block_left: 0 / 1 | SR: 0.0%
lift_red_block_table: 0 / 1 | SR: 0.0%
Best model: epoch 0 with average sequences length of 0.4666666666666667
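For reading the summary above: the average successful sequence length is the mean number of instructions completed in a row per sequence, which equals the sum of the "i in a row" success rates. The sketch below reproduces the reported number under the assumption that 33.3% and 13.3% correspond to 5 and 2 of the 15 evaluated sequences; the function name and the counts are illustrative, not taken from the evaluation code.

```python
def average_sequence_length(chain_success_counts, num_sequences):
    """Mean number of instructions completed in a row per sequence.

    chain_success_counts[i] is the number of sequences that completed
    at least i + 1 instructions in a row.
    """
    return sum(chain_success_counts) / num_sequences


# Assumed counts: 5 of 15 sequences completed >= 1 instruction (33.3%),
# 2 of 15 completed >= 2 (13.3%), none completed 3 or more.
num_sequences = 15
counts = [5, 2, 0, 0, 0]

rates = [c / num_sequences for c in counts]
avg = average_sequence_length(counts, num_sequences)

for i, rate in enumerate(rates, start=1):
    print(f"{i}: {rate * 100:.1f}%")
print("Average successful sequence length:", avg)  # about 0.4667
```

The same identity is consistent with the per-chain rates of the 150-sequence run (19.3% + 2.7% = 0.22).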
My Python version is 3.10, which differs from the requested version, 3.8. I also tried Python 3.8, and the results were equally bad. My TensorFlow version is 2.13.0 and my JAX version is 0.4.11.
Because I have insufficient VRAM, I set TensorFlow to use the CPU and JAX to use the GPU. Will this affect the results?
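For reference, a minimal sketch of one common way to get this split, not necessarily how it was done in this run: hide the GPUs from TensorFlow with `tf.config.set_visible_devices` and stop JAX from preallocating GPU memory with the `XLA_PYTHON_CLIENT_PREALLOCATE` environment variable. The helper name `hide_gpus_from_tensorflow` is hypothetical, and the import guard is only there so the snippet also runs where TensorFlow is not installed.

```python
import importlib.util
import os

# Ask JAX not to reserve most of the GPU memory up front.
# This must be set before jax is imported for the first time.
os.environ.setdefault("XLA_PYTHON_CLIENT_PREALLOCATE", "false")


def hide_gpus_from_tensorflow():
    """Hide all GPUs from TensorFlow so it runs on the CPU only.

    Returns TensorFlow's list of visible GPUs afterwards (expected to be
    empty), or None when TensorFlow is not installed.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf

    # Must be called before TensorFlow initializes its devices.
    tf.config.set_visible_devices([], "GPU")
    return tf.config.get_visible_devices("GPU")


visible = hide_gpus_from_tensorflow()
print("GPUs visible to TensorFlow:", visible)
```

Device placement of this kind changes where the data pipeline runs, so it is a reasonable thing to rule out, but it is not by itself evidence of a cause for the low success rates.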
When NUM_EVAL_SEQUENCES=150:
Average successful sequence length: 0.22
Success rates for i instructions in a row:
1: 19.3%
2: 2.7%
3: 0.0%
4: 0.0%
5: 0.0%
turn_on_led: 10 / 10 | SR: 100.0%
rotate_red_block_right: 2 / 5 | SR: 40.0%
open_drawer: 11 / 21 | SR: 52.4%
push_red_block_left: 2 / 11 | SR: 18.2%
move_slider_right: 1 / 14 | SR: 7.1%
turn_off_led: 4 / 8 | SR: 50.0%
push_blue_block_right: 1 / 5 | SR: 20.0%
move_slider_left: 1 / 13 | SR: 7.7%
rotate_blue_block_left: 1 / 3 | SR: 33.3%
close_drawer: 0 / 6 | SR: 0.0%
lift_pink_block_table: 0 / 7 | SR: 0.0%
turn_on_lightbulb: 0 / 8 | SR: 0.0%
push_blue_block_left: 0 / 8 | SR: 0.0%
rotate_red_block_left: 0 / 3 | SR: 0.0%
turn_off_lightbulb: 0 / 5 | SR: 0.0%
push_pink_block_right: 0 / 4 | SR: 0.0%
lift_pink_block_slider: 0 / 2 | SR: 0.0%
lift_red_block_table: 0 / 9 | SR: 0.0%
rotate_pink_block_right: 0 / 6 | SR: 0.0%
lift_blue_block_table: 0 / 5 | SR: 0.0%
lift_blue_block_slider: 0 / 5 | SR: 0.0%
lift_red_block_slider: 0 / 5 | SR: 0.0%
push_pink_block_left: 0 / 6 | SR: 0.0%
rotate_pink_block_left: 0 / 2 | SR: 0.0%
push_red_block_right: 0 / 3 | SR: 0.0%
rotate_blue_block_right: 0 / 3 | SR: 0.0%
push_into_drawer: 0 / 6 | SR: 0.0%
Best model: epoch 0 with average sequences length of 0.4
Hi @houyaokun,
Did you find the problem?
I downloaded the diffusion model and goal-conditioned policy checkpoints from https://huggingface.co/patreya/susie-calvin-checkpoints and set the environment variables in eval_susie.sh, but the results are poor:
Average successful sequence length: 0.4666666666666667
Success rates for i instructions in a row:
1: 33.3%
2: 13.3%
3: 0.0%
4: 0.0%
5: 0.0%
turn_on_led: 2 / 2 | SR: 100.0%
open_drawer: 4 / 4 | SR: 100.0%
turn_on_lightbulb: 1 / 1 | SR: 100.0%
push_blue_block_right: 0 / 1 | SR: 0.0%
rotate_blue_block_right: 0 / 1 | SR: 0.0%
lift_blue_block_slider: 0 / 1 | SR: 0.0%
lift_blue_block_table: 0 / 1 | SR: 0.0%
push_pink_block_left: 0 / 2 | SR: 0.0%
move_slider_left: 0 / 3 | SR: 0.0%
push_blue_block_left: 0 / 2 | SR: 0.0%
lift_red_block_slider: 0 / 1 | SR: 0.0%
push_red_block_left: 0 / 1 | SR: 0.0%
rotate_red_block_left: 0 / 1 | SR: 0.0%
lift_red_block_table: 0 / 1 | SR: 0.0%
What could be the possible issues? Thank you for your time.