ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

SAC Checkpoint Loading Error #42651

Open pietromosca1994 opened 9 months ago

pietromosca1994 commented 9 months ago

I am training a SAC agent to interact sub-optimally with a custom Gymnasium environment. During training I successfully save periodic checkpoints, one per logical training operation. When I then try to load the agent from one of these checkpoints, I get the following error:

c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\algorithms\algorithm.py:484: RayDeprecationWarning: This API is deprecated and may be removed in future Ray releases. You could suppress this warning by setting env variable PYTHONWARNINGS="ignore::DeprecationWarning"
`UnifiedLogger` will be removed in Ray 2.7.
  return UnifiedLogger(config, logdir, loggers=None)
c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\tune\logger\unified.py:53: RayDeprecationWarning: This API is deprecated and may be removed in future Ray releases. You could suppress this warning by setting env variable PYTHONWARNINGS="ignore::DeprecationWarning"
The `JsonLogger interface is deprecated in favor of the `ray.tune.json.JsonLoggerCallback` interface and will be removed in Ray 2.7.
  self._loggers.append(cls(self.config, self.logdir, self.trial))
c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\tune\logger\unified.py:53: RayDeprecationWarning: This API is deprecated and may be removed in future Ray releases. You could suppress this warning by setting env variable PYTHONWARNINGS="ignore::DeprecationWarning"
The `CSVLogger interface is deprecated in favor of the `ray.tune.csv.CSVLoggerCallback` interface and will be removed in Ray 2.7.
  self._loggers.append(cls(self.config, self.logdir, self.trial))
c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\tune\logger\unified.py:53: RayDeprecationWarning: This API is deprecated and may be removed in future Ray releases. You could suppress this warning by setting env variable PYTHONWARNINGS="ignore::DeprecationWarning"
The `TBXLogger interface is deprecated in favor of the `ray.tune.tensorboardx.TBXLoggerCallback` interface and will be removed in Ray 2.7.
  self._loggers.append(cls(self.config, self.logdir, self.trial))
(RolloutWorker pid=22716) 2024-01-09 14:14:14,174   WARNING env.py:162 -- Your env doesn't have a .spec.max_episode_steps attribute. Your horizon will default to infinity, and your environment will not be reset.
(RolloutWorker pid=22716) 2024-01-09 14:14:16,219   WARNING deprecation.py:50 -- DeprecationWarning: `ray.rllib.models.tf.fcnet.FullyConnectedNetwork` has been deprecated. This will raise an error in the future!
(RolloutWorker pid=22716) 2024-01-09 14:14:16,226   WARNING deprecation.py:50 -- DeprecationWarning: `ray.rllib.models.tf.misc.normc_initializer` has been deprecated. This will raise an error in the future!
(RolloutWorker pid=22716) c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\algorithms\sac\sac_tf_model.py:106: RuntimeWarning: divide by zero encountered in log
(RolloutWorker pid=22716)   np.log(initial_alpha), dtype=tf.float32, name="log_alpha"
c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\algorithms\sac\sac_tf_model.py:106: RuntimeWarning: divide by zero encountered in log
  np.log(initial_alpha), dtype=tf.float32, name="log_alpha"
2024-01-09 14:14:34,691 INFO trainable.py:164 -- Trainable.setup took 42.758 seconds. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2024-01-09 14:14:34,695 WARNING util.py:62 -- Install gputil for GPU system monitoring.
2024-01-09 14:14:44,168 ERROR actor_manager.py:500 -- Ray error, taking actor 1 out of service. ray::RolloutWorker.apply() (pid=22716, ip=127.0.0.1, actor_id=f9aa83d32adf799d0a04564801000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x0000016A98EF6340>)
  File "python\ray\_raylet.pyx", line 1675, in ray._raylet.execute_task
  File "python\ray\_raylet.pyx", line 1615, in ray._raylet.execute_task.function_executor
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\_private\function_manager.py", line 726, in actor_method_executor
    return method(__ray_actor, *args, **kwargs)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\util\tracing\tracing_helper.py", line 467, in _resume_span
    return method(self, *_args, **_kwargs)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\utils\actor_manager.py", line 185, in apply
    raise e
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\utils\actor_manager.py", line 176, in apply
    return func(self, *args, **kwargs)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\algorithms\algorithm.py", line 2609, in <lambda>
    lambda w: w.set_state(ray.get(remote_state)),
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\util\tracing\tracing_helper.py", line 467, in _resume_span
    return method(self, *_args, **_kwargs)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\evaluation\rollout_worker.py", line 1454, in set_state
    self.policy_map[pid].set_state(policy_state)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\policy\eager_tf_policy.py", line 783, in set_state
    self.global_timestep.assign(state["global_timestep"])
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 925, in assign
    value_tensor = ops.convert_to_tensor(value, dtype=self.dtype)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\profiler\trace.py", line 183, in wrapped
    return func(*args, **kwargs)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\framework\ops.py", line 1638, in convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\framework\constant_op.py", line 343, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\framework\constant_op.py", line 267, in constant
    return _constant_impl(value, dtype, shape, name, verify_shape=False,
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\framework\constant_op.py", line 279, in _constant_impl
    return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\framework\constant_op.py", line 304, in _constant_eager_impl
    t = convert_to_eager_tensor(value, ctx, dtype)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\framework\constant_op.py", line 102, in convert_to_eager_tensor
    return ops.EagerTensor(value, ctx.device_name, dtype)
ValueError: Attempt to convert a value ({b'nd': False, b'type': '<i8', b'data': b'$^\x00\x00\x00\x00\x00\x00'}) with an unsupported type (<class 'dict'>) to a Tensor.
2024-01-09 14:14:44,170 ERROR actor_manager.py:500 -- Ray error, taking actor 2 out of service. ray::RolloutWorker.apply() (pid=11108, ip=127.0.0.1, actor_id=f5a1187e2dfffff486a5756b01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x000001C798EE7310>)
  File "python\ray\_raylet.pyx", line 1675, in ray._raylet.execute_task
  File "python\ray\_raylet.pyx", line 1615, in ray._raylet.execute_task.function_executor
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\_private\function_manager.py", line 726, in actor_method_executor
    return method(__ray_actor, *args, **kwargs)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\util\tracing\tracing_helper.py", line 467, in _resume_span
    return method(self, *_args, **_kwargs)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\utils\actor_manager.py", line 185, in apply
    raise e
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\utils\actor_manager.py", line 176, in apply
    return func(self, *args, **kwargs)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\algorithms\algorithm.py", line 2609, in <lambda>
    lambda w: w.set_state(ray.get(remote_state)),
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\util\tracing\tracing_helper.py", line 467, in _resume_span
    return method(self, *_args, **_kwargs)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\evaluation\rollout_worker.py", line 1454, in set_state
    self.policy_map[pid].set_state(policy_state)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\policy\eager_tf_policy.py", line 783, in set_state
    self.global_timestep.assign(state["global_timestep"])
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 925, in assign
    value_tensor = ops.convert_to_tensor(value, dtype=self.dtype)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\profiler\trace.py", line 183, in wrapped
    return func(*args, **kwargs)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\framework\ops.py", line 1638, in convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\framework\constant_op.py", line 343, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\framework\constant_op.py", line 267, in constant
    return _constant_impl(value, dtype, shape, name, verify_shape=False,
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\framework\constant_op.py", line 279, in _constant_impl
    return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\framework\constant_op.py", line 304, in _constant_eager_impl
    t = convert_to_eager_tensor(value, ctx, dtype)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\tensorflow\python\framework\constant_op.py", line 102, in convert_to_eager_tensor
    return ops.EagerTensor(value, ctx.device_name, dtype)
ValueError: Attempt to convert a value ({b'nd': False, b'type': '<i8', b'data': b'$^\x00\x00\x00\x00\x00\x00'}) with an unsupported type (<class 'dict'>) to a Tensor.
2024-01-09 14:14:44,171 [ERROR] Failed to load algo converting to msgpack from C:/Users/spepe/Documents/RL/evaluations/20240104-1523SAC_NEW_RF_aftertuningCustomEnv/checkpoints/00240
ray::RolloutWorker.apply() (pid=22716, ip=127.0.0.1, actor_id=f9aa83d32adf799d0a04564801000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x0000016A98EF6340>)
  File "python\ray\_raylet.pyx", line 1675, in ray._raylet.execute_task
  File "python\ray\_raylet.pyx", line 1615, in ray._raylet.execute_task.function_executor
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\_private\function_manager.py", line 726, in actor_method_executor
    return method(__ray_actor, *args, **kwargs)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\util\tracing\tracing_helper.py", line 467, in _resume_span
    return method(self, *_args, **_kwargs)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\utils\actor_manager.py", line 185, in apply
    raise e
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\utils\actor_manager.py", line 176, in apply
    return func(self, *args, **kwargs)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\algorithms\algorithm.py", line 2609, in <lambda>
    lambda w: w.set_state(ray.get(remote_state)),
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\util\tracing\tracing_helper.py", line 467, in _resume_span
    return method(self, *_args, **_kwargs)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\evaluation\rollout_worker.py", line 1454, in set_state
    self.policy_map[pid].set_state(policy_state)
  File "c:\Users\spepe\anaconda3\envs\RayGPU_sac\lib\site-packages\ray\rllib\policy\eager_tf_policy.py", line 783, in set_st...
The error is 'ValueError: Attempt to convert a value ({b'nd': False, b'type': '<i8', b'data': b'$^\x00\x00\x00\x00\x00\x00'}) with an unsupported type (<class 'dict'>) to a Tensor.'  
It looks like something related to the serialization of the agent.
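
For illustration, the dict in the error message has the shape that the msgpack-numpy package uses for an encoded NumPy scalar (`b'nd'`, `b'type'`, `b'data'`), which would mean the value reached TensorFlow without being decoded first. Assuming that reading is right, the payload can be unpacked with the standard library alone. The helper name `decode_scalar` and the small dtype table are hypothetical and are not part of Ray or msgpack-numpy; this is a sketch for inspecting the value, not a fix.

```python
import struct

# Hypothetical mapping from NumPy dtype strings to `struct` format codes.
# Only a few little-endian scalar types are covered in this sketch.
_DTYPE_TO_STRUCT = {
    "<i8": "<q",  # 64-bit signed integer
    "<i4": "<i",  # 32-bit signed integer
    "<f8": "<d",  # 64-bit float
    "<f4": "<f",  # 32-bit float
}


def decode_scalar(encoded: dict):
    """Decode a dict shaped like {b'nd': False, b'type': ..., b'data': ...}."""
    if encoded.get(b"nd"):
        raise ValueError("this sketch only handles scalars, not arrays")
    dtype = encoded[b"type"]
    if isinstance(dtype, bytes):
        dtype = dtype.decode("ascii")
    fmt = _DTYPE_TO_STRUCT[dtype]
    # struct.unpack returns a tuple; a scalar payload holds exactly one value.
    (value,) = struct.unpack(fmt, encoded[b"data"])
    return value


# The exact dict reported in the ValueError above.
state_value = {b"nd": False, b"type": "<i8", b"data": b"$^\x00\x00\x00\x00\x00\x00"}
print(decode_scalar(state_value))
```

Under that assumption, the payload is a plain 64-bit integer, which fits the `global_timestep` assignment where the traceback ends.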

Does anyone have an idea on how to solve it?

### Versions / Dependencies

absl-py @ file:///C:/b/abs_d3cv5rzljl/croot/absl-py_1686852506854/work
aiohttp @ file:///C:/b/abs_bc6tmjiy12/croot/aiohttp_1701112585940/work
aiosignal @ file:///tmp/build/80754af9/aiosignal_1637843061372/work
anyio==4.2.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
array-record==0.4.1
arrow==1.3.0
astor==0.8.1
asttokens @ file:///home/conda/feedstock_root/build_artifacts/asttokens_1698341106958/work
astunparse==1.6.3
async-lru==2.0.4
async-timeout @ file:///C:/b/abs_c8fgiuixkq/croot/async-timeout_1703097556097/work
attrs @ file:///C:/b/abs_35n0jusce8/croot/attrs_1695717880170/work
Babel==2.14.0
beautifulsoup4==4.12.3
bleach==6.1.0
blinker @ file:///C:/b/abs_d9y2dm7cw2/croot/blinker_1696539752170/work
Bottleneck @ file:///C:/Windows/Temp/abs_3198ca53-903d-42fd-87b4-03e6d03a8381yfwsuve8/croots/recipe/bottleneck_1657175565403/work
Brotli @ file:///C:/Windows/Temp/abs_63l7912z0e/croots/recipe/brotli-split_1659616056886/work
cachetools @ file:///tmp/build/80754af9/cachetools_1619597386817/work
certifi==2023.11.17
cffi @ file:///C:/b/abs_924gv1kxzj/croot/cffi_1700254355075/work
charset-normalizer==3.3.2
click @ file:///C:/b/abs_f9ihnt72pu/croot/click_1698129847492/work
cloudpickle==3.0.0
colorama @ file:///home/conda/feedstock_root/build_artifacts/colorama_1666700638685/work
comm @ file:///home/conda/feedstock_root/build_artifacts/comm_1704278392174/work
contourpy==1.2.0
cryptography @ file:///C:/b/abs_f4do8t8jfs/croot/cryptography_1694444424531/work
cycler==0.12.1
debugpy @ file:///D:/bld/debugpy_1695534507849/work
decorator==5.1.1
defusedxml==0.7.1
dm-tree==0.1.8
etils==1.5.2
exceptiongroup @ file:///home/conda/feedstock_root/build_artifacts/exceptiongroup_1704921103267/work
executing @ file:///home/conda/feedstock_root/build_artifacts/executing_1698579936712/work
Farama-Notifications==0.0.4
fastjsonschema==2.19.1
filelock==3.13.1
flatbuffers==23.5.26
fonttools==4.47.2
fqdn==1.5.1
frozenlist @ file:///C:/b/abs_d8e__s1ys3/croot/frozenlist_1698702612014/work
fsspec==2023.5.0
gast==0.4.0
google-auth @ file:///C:/b/abs_defnokp9xd/croot/google-auth_1694152741394/work
google-auth-oauthlib==1.2.0
google-pasta==0.2.0
googleapis-common-protos==1.62.0
grpcio @ file:///C:/ci/grpcio_1637590978642/work
gym==0.26.2
gym-notices==0.0.8
gymnasium==0.28.1
h5py==3.10.0
idna==3.6
imageio==2.33.1
importlib-metadata==7.0.0
importlib-resources==6.1.1
ipykernel @ file:///D:/bld/ipykernel_1705418162861/work
ipython @ file:///D:/bld/ipython_1701831845989/work
ipython-genutils==0.2.0
ipywidgets==8.1.1
isoduration==20.11.0
jax-jumpy==1.0.0
jedi @ file:///home/conda/feedstock_root/build_artifacts/jedi_1696326070614/work
Jinja2==3.1.3
joblib==1.3.2
json5==0.9.14
jsonpointer==2.4
jsonschema==4.21.0
jsonschema-specifications==2023.12.1
jupyter-events==0.9.0
jupyter-lsp==2.2.2
jupyter_client @ file:///home/conda/feedstock_root/build_artifacts/jupyter_client_1699283905679/work
jupyter_core @ file:///D:/bld/jupyter_core_1704727196845/work
jupyter_server==2.12.5
jupyter_server_terminals==0.5.1
jupyterlab==4.0.11
jupyterlab-widgets==3.0.9
jupyterlab_pygments==0.3.0
jupyterlab_server==2.25.2
keras==2.15.0
Keras-Preprocessing==1.1.2
kiwisolver==1.4.5
lazy_loader==0.3
libclang==16.0.6
lz4==4.3.2
Markdown==3.5.1
MarkupSafe==2.1.3
matplotlib==3.7.4
matplotlib-inline @ file:///home/conda/feedstock_root/build_artifacts/matplotlib-inline_1660814786464/work
mistune==3.0.2
mkl-fft @ file:///C:/b/abs_19i1y8ykas/croot/mkl_fft_1695058226480/work
mkl-random @ file:///C:/b/abs_edwkj1_o69/croot/mkl_random_1695059866750/work
mkl-service==2.4.0
ml-dtypes==0.2.0
mpmath==1.3.0
msgpack==1.0.7
msgpack-numpy==0.4.8
multidict @ file:///C:/b/abs_44ido987fv/croot/multidict_1701097803486/work
nbclient==0.9.0
nbconvert==7.14.2
nbformat==5.9.2
nest_asyncio @ file:///home/conda/feedstock_root/build_artifacts/nest-asyncio_1705352640985/work
networkx==3.2.1
notebook==7.0.7
notebook_shim==0.2.3
numexpr @ file:///C:/b/abs_5fucrty5dc/croot/numexpr_1696515448831/work
numpy==1.26.2
oauthlib==3.2.2
opt-einsum==3.3.0
optimized_charge @ file:///C:/Users/spepe/Documents/Cloned_repo/VW%20Glassdollar/vw-optimized_charge
overrides==7.4.0
packaging==23.2
pandas==1.5.3
pandocfilters==1.5.1
parso @ file:///home/conda/feedstock_root/build_artifacts/parso_1638334955874/work
pickleshare @ file:///home/conda/feedstock_root/build_artifacts/pickleshare_1602536217715/work
pillow==10.2.0
platformdirs @ file:///home/conda/feedstock_root/build_artifacts/platformdirs_1701708255999/work
plotly==5.18.0
prometheus-client==0.19.0
promise==2.3
prompt-toolkit @ file:///home/conda/feedstock_root/build_artifacts/prompt-toolkit_1702399386289/work
protobuf==4.23.4
psutil @ file:///D:/bld/psutil_1702833254596/work
pure-eval @ file:///home/conda/feedstock_root/build_artifacts/pure_eval_1642875951954/work
pyaml==23.12.0
pyarrow==12.0.1
pyasn1==0.5.1
pyasn1-modules==0.3.0
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
Pygments @ file:///home/conda/feedstock_root/build_artifacts/pygments_1700607939962/work
PyJWT @ file:///C:/ci/pyjwt_1657511236979/work
pyOpenSSL @ file:///C:/b/abs_08f38zyck4/croot/pyopenssl_1690225407403/work
pyparsing==3.1.1
PySocks @ file:///C:/ci/pysocks_1605307512533/work
python-dateutil @ file:///home/conda/feedstock_root/build_artifacts/python-dateutil_1626286286081/work
python-json-logger==2.0.7
pytz @ file:///C:/b/abs_19q3ljkez4/croot/pytz_1695131651401/work
PyWavelets==1.5.0
pywin32==306
pywinpty==2.0.12
PyYAML==6.0.1
pyzmq @ file:///D:/bld/pyzmq_1701783320990/work
ray==2.9.0
referencing==0.32.1
requests==2.31.0
requests-oauthlib==1.3.1
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rpds-py==0.17.1
rsa==4.9
scikit-image==0.21.0
scikit-learn==1.0.2
scikit-optimize==0.9.0
scipy==1.10.1
Send2Trash==1.8.2
six==1.16.0
sniffio==1.3.0
soupsieve==2.5
stack-data @ file:///home/conda/feedstock_root/build_artifacts/stack_data_1669632077133/work
sympy==1.12
tenacity==8.2.3
tensorboard==2.15.1
tensorboard-data-server==0.7.2
tensorboard-plugin-wit==1.8.1
tensorboardX==2.6
tensorflow==2.15.0
tensorflow-datasets==4.9.0
tensorflow-estimator==2.15.0
tensorflow-intel==2.15.0
tensorflow-io-gcs-filesystem==0.31.0
tensorflow-metadata==1.13.0
tensorflow-probability==0.23.0
termcolor==2.4.0
terminado==0.18.0
threadpoolctl==3.2.0
tifffile==2023.12.9
tinycss2==1.2.1
toml==0.10.2
tomli==2.0.1
torch==2.0.1
torchdata==0.6.1
torchmetrics==0.10.3
torchtext==0.15.2
torchvision==0.15.2
tornado @ file:///D:/bld/tornado_1695373623388/work
tqdm==4.64.1
traitlets @ file:///home/conda/feedstock_root/build_artifacts/traitlets_1704212992681/work
types-python-dateutil==2.8.19.20240106
typing_extensions==4.8.0
tzdata @ file:///croot/python-tzdata_1690578112552/work
uri-template==1.3.0
urllib3 @ file:///C:/b/abs_9cmlsrm3ys/croot/urllib3_1698257595508/work
wcwidth @ file:///home/conda/feedstock_root/build_artifacts/wcwidth_1704731205417/work
webcolors==1.13
webencodings==0.5.1
websocket-client==1.7.0
Werkzeug==3.0.1
widgetsnbextension==4.0.9
win-inet-pton @ file:///C:/ci/win_inet_pton_1605306162074/work
wrapt==1.14.1
yarl @ file:///C:/b/abs_8bxwdyhjvp/croot/yarl_1701105248152/work
zipp==3.17.0

### Reproduction script

I am loading the agent with the following code:
import tempfile

from ray.rllib.algorithms.algorithm import Algorithm
from ray.rllib.utils.checkpoints import convert_to_msgpack_checkpoint


def load_algo_from_checkpoint(self, checkpoint_dir: str):
    ''' Function to reload a checkpoint from a training
    WARNING: Algorithm checkpoints created are always cloud-pickle based. There is therefore no guarantee that an algorithm
             saved with python 3.8 can be restored with python 3.9
    https://docs.ray.io/en/latest/rllib/rllib-saving-and-loading-algos-and-policies.html
    '''
    try:
        with tempfile.TemporaryDirectory() as msgpack_cp_dir:
            convert_to_msgpack_checkpoint(checkpoint_dir, msgpack_cp_dir)

            # Try recreating a new algorithm object from the msgpack checkpoint.
            # This must happen inside the `with` block: the temporary directory
            # is deleted as soon as the block exits.
            # Note: `Algorithm.from_checkpoint` now works with both pickle AND msgpack
            # type checkpoints.
            self.algo = Algorithm.from_checkpoint(msgpack_cp_dir)
        self.logger.info(f'Successfully loaded algo from msgpack from {checkpoint_dir}')
        self.print_algo_info()
    except Exception as e:
        self.logger.error(f'Failed to load algo converting to msgpack from {checkpoint_dir}\n{e}')
        self.logger.info('Trying loading from pickle')
        # Fall back to the original (pickle-based) checkpoint.
        try:
            self.algo = Algorithm.from_checkpoint(checkpoint_dir)
            self.print_algo_info()
            self.load_env()
            self.load_policy()
            self.logger.info(f'Successfully loaded algo from pickle from {checkpoint_dir}')
        except Exception as e:
            self.logger.error(f'Failed to load algo from {checkpoint_dir}\n{e}')
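
A note on the temporary directory used for the msgpack conversion: `tempfile.TemporaryDirectory()` removes the directory and its contents when its `with` block exits, so any call that reads the converted checkpoint has to run inside the block. The minimal, Ray-free sketch below shows that lifetime; the function name and the marker file are hypothetical stand-ins for a real checkpoint.

```python
import os
import tempfile


def temp_dir_lifetime():
    """Return (exists_inside, exists_after) for a TemporaryDirectory."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Stand-in for the files a checkpoint conversion would write.
        marker = os.path.join(tmp_dir, "checkpoint_marker.txt")
        with open(marker, "w") as f:
            f.write("state")
        exists_inside = os.path.exists(marker)
    # The context manager has cleaned up: the directory is gone here.
    exists_after = os.path.exists(tmp_dir)
    return exists_inside, exists_after


inside, after = temp_dir_lifetime()
print(f"inside with-block: {inside}, after with-block: {after}")
```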


### Issue Severity

None

simonsays1980 commented 7 months ago

@pietromosca1994 We need a reproducible example to investigate this further.