I followed the padding procedure in Minimal_example_for_running_inference_using_RT_1_X_TF_using_tensorflow_datasets.ipynb and am using the same sentence encoder "https://tfhub.dev/google/universal-sentence-encoder-large/5".
However, after summing up the world vectors and rotation deltas for the expert and the pretrained model from `gs://gdm-robotics-open-x-embodiment/open_x_embodiment_and_rt_x_oss/rt_1_x_tf_trained_for_0022724`, it is clear that this pre-trained model sometimes overshoots the workspace by up to two meters. The "rt1main" weights from Google Research also produce similar results (top row is the ground truth from the fractal dataset):

I believe I am using tf_agents as in the colab demo above. What am I doing wrong? I am doing something like:
for each inference call, passing the returned policy state back in. (You can see the exact code I am running in this method.)
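To be concrete about the loop structure I mean: a minimal sketch with a dummy in place of the real policy (the actual code loads the SavedModel policy via tf_agents, gets the initial state with `policy.get_initial_state(batch_size=1)`, and calls `policy.action(time_step, policy_state)` each step; every name below is illustrative, not my real code):

```python
# Sketch of the inference loop, with a stand-in for the real tf_agents
# policy. The point is that the recurrent policy state returned by each
# call is fed back into the next call.
def dummy_policy_action(time_step, policy_state):
    """Stand-in for policy.action(); returns (action, next_state)."""
    action = {"world_vector": [0.0, 0.0, 0.0],
              "rotation_delta": [0.0, 0.0, 0.0]}
    # The real state is the network's memory; here just a counter.
    return action, policy_state + 1

policy_state = 0  # stand-in for policy.get_initial_state(batch_size=1)
actions = []
for _ in range(5):
    # Real observation: image + natural-language embedding, padded as in
    # the minimal-inference notebook.
    time_step = {"image": "...", "natural_language_embedding": "..."}
    action, policy_state = dummy_policy_action(time_step, policy_state)
    actions.append(action)

print(len(actions), policy_state)  # state was threaded through every call
```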
Am I missing something?
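For reference, the overshoot check is essentially the following (a sketch with made-up numbers; in my real run the per-step deltas come from the detokenized `world_vector` predictions, and the 1 m workspace radius is just an assumed bound for illustration):

```python
import numpy as np

# Hypothetical per-step world-vector predictions (xyz deltas, meters).
pred_deltas = np.array([
    [0.05, 0.00, 0.01],
    [0.04, 0.01, 0.00],
    [0.06, -0.01, 0.02],
])

# Integrate the deltas to get the cumulative end-effector displacement.
cumulative = np.cumsum(pred_deltas, axis=0)

# Rough workspace bound (assumption: ~1 m reach). Displacements far
# beyond this are the overshoot I am seeing from the pretrained model.
workspace_radius_m = 1.0
max_disp = np.linalg.norm(cumulative, axis=1).max()
print(f"max displacement: {max_disp:.3f} m")
print("overshoots workspace:", max_disp > workspace_radius_m)
```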