octo-models / octo

Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
https://octo-models.github.io/
MIT License

Loading the learned weights on another PC fails. #119

Open a510721 opened 4 months ago

a510721 commented 4 months ago

There is a problem when I load the learned weights on another PC.

The following error occurs when I save weights trained on one PC and then load them on another PC.

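Because the weights are written on one machine and read on another, one thing worth comparing is the Python environment on each side. The sketch below is only a diagnostic aid, not a confirmed fix: the thread does not establish that a version mismatch is the cause, and the distribution names in `PACKAGES` are assumptions that may need adjusting to match `pip list` on your machines.

```python
from importlib import metadata

# Distribution names are assumptions; adjust them to match `pip list`.
PACKAGES = ["orbax-checkpoint", "jax", "jaxlib", "flax", "msgpack", "numpy"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def diff_versions(local, remote):
    """Return {name: (local_version, remote_version)} for entries that differ."""
    names = sorted(set(local) | set(remote))
    return {
        name: (local.get(name), remote.get(name))
        for name in names
        if local.get(name) != remote.get(name)
    }


if __name__ == "__main__":
    # Run this on both PCs and compare the printed lines.
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}=={version}")
```

Running the script on both PCs and passing the two resulting dictionaries to `diff_versions` lists only the packages whose versions differ.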
What should I do about errors like the ones below?

I0715 17:08:55.144797 140510257677504 checkpointer.py:164] Restoring item from /home/kys/tensorflow_datasets/aloha_simul_dataset/checkpoint_another_pc/4999/default.
Traceback (most recent call last):
  File "/home/kys/workspace/service_robot/octo/octo/examples/03_eval_finetuned.py", line 149, in <module>
    app.run(main)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/home/kys/workspace/service_robot/octo/octo/examples/03_eval_finetuned.py", line 53, in main
    model = OctoModel.load_pretrained(FLAGS.finetuned_path)
  File "/home/kys/workspace/service_robot/octo/octo/octo/model/octo_model.py", line 339, in load_pretrained
    params = checkpointer.restore(step, params_shape)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/checkpoint_manager.py", line 550, in restore
    restored_items = self._restore_impl(
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/checkpoint_manager.py", line 582, in _restore_impl
    restored[item_name] = self._checkpointers[item_name].restore(
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/checkpointer.py", line 165, in restore
    restored = self._restore_with_args(directory, *args, **kwargs)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/checkpointer.py", line 103, in _restore_with_args
    restored = self._handler.restore(directory, args=ckpt_args)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/pytree_checkpoint_handler.py", line 1020, in restore
    structure = self._get_internal_metadata(directory)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/pytree_checkpoint_handler.py", line 1218, in _get_internal_metadata
    aggregate_tree = self._read_aggregate_file(directory)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/pytree_checkpoint_handler.py", line 1099, in _read_aggregate_file
    return self._aggregate_handler.deserialize(checkpoint_path)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/aggregate_handlers.py", line 87, in deserialize
    return msgpack_utils.msgpack_restore(msgpack)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/msgpack_utils.py", line 232, in msgpack_restore
    state_dict = msgpack.unpackb(
  File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: int is not allowed for map key when strict_map_key=True

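The failing frames (`_read_aggregate_file`, then `msgpack_restore`) show the error is raised while the checkpoint's aggregate file is being deserialized, so it may be worth ruling out a damaged or incomplete copy of the checkpoint directory. The sketch below is a minimal integrity check under that assumption; nothing in this thread confirms that the transfer is at fault, and the function names `checkpoint_manifest` and `manifest_mismatches` are hypothetical helpers, not part of Octo or Orbax.

```python
import hashlib
from pathlib import Path


def checkpoint_manifest(root):
    """Return {relative_path: sha256_hex} for every file under root."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as f:
                # Hash in 1 MiB chunks so large array files are not loaded at once.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def manifest_mismatches(source, copy):
    """List relative paths that are missing on one side or differ in content."""
    return sorted(
        path for path in set(source) | set(copy)
        if source.get(path) != copy.get(path)
    )
```

One way to use it: build a manifest of the checkpoint directory on the PC that saved it, build another on the PC that loads it, and pass both to `manifest_mismatches`. An empty list means the files are byte-identical, which would point away from the copy and toward the software environment.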
If I set strict_map_key=False, the following error occurs instead.

I0715 17:21:15.503768 140078019884224 checkpointer.py:164] Restoring item from /home/kys/tensorflow_datasets/aloha_simul_dataset/checkpoint_another_pc/4999/default.
Traceback (most recent call last):
  File "/home/kys/workspace/service_robot/octo/octo/examples/03_eval_finetuned.py", line 149, in <module>
    app.run(main)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/home/kys/workspace/service_robot/octo/octo/examples/03_eval_finetuned.py", line 53, in main
    model = OctoModel.load_pretrained(FLAGS.finetuned_path)
  File "/home/kys/workspace/service_robot/octo/octo/octo/model/octo_model.py", line 339, in load_pretrained
    params = checkpointer.restore(step, params_shape)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/checkpoint_manager.py", line 550, in restore
    restored_items = self._restore_impl(
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/checkpoint_manager.py", line 582, in _restore_impl
    restored[item_name] = self._checkpointers[item_name].restore(
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/checkpointer.py", line 165, in restore
    restored = self._restore_with_args(directory, *args, **kwargs)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/checkpointer.py", line 103, in _restore_with_args
    restored = self._handler.restore(directory, args=ckpt_args)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/pytree_checkpoint_handler.py", line 1020, in restore
    structure = self._get_internal_metadata(directory)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/pytree_checkpoint_handler.py", line 1218, in _get_internal_metadata
    aggregate_tree = self._read_aggregate_file(directory)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/pytree_checkpoint_handler.py", line 1099, in _read_aggregate_file
    return self._aggregate_handler.deserialize(checkpoint_path)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/aggregate_handlers.py", line 87, in deserialize
    return msgpack_utils.msgpack_restore(msgpack)
  File "/home/kys/anaconda3/envs/octo/lib/python3.10/site-packages/orbax/checkpoint/msgpack_utils.py", line 232, in msgpack_restore
    state_dict = msgpack.unpackb(
  File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
TypeError: unhashable type: 'numpy.ndarray'