rail-berkeley / serl

SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning
https://serl-robot.github.io/
MIT License

Load checkpoint error #40

Closed: DigitalRowlet closed this issue 4 months ago

DigitalRowlet commented 5 months ago

Hi! I ran into a problem when loading checkpoints.

    ckpt = checkpoints.restore_checkpoint(
  File "/home/wk/anaconda3/envs/copy2/lib/python3.10/site-packages/flax/training/checkpoints.py", line 1129, in restore_checkpoint
    restored = orbax_checkpointer.restore(
  File "/home/wk/anaconda3/envs/copy2/lib/python3.10/site-packages/orbax/checkpoint/checkpointer.py", line 165, in restore
    restored = self._restore_with_args(directory, *args, **kwargs)
  File "/home/wk/anaconda3/envs/copy2/lib/python3.10/site-packages/orbax/checkpoint/checkpointer.py", line 103, in _restore_with_args
    restored = self._handler.restore(directory, args=ckpt_args)
  File "/home/wk/anaconda3/envs/copy2/lib/python3.10/site-packages/orbax/checkpoint/pytree_checkpoint_handler.py", line 1019, in restore
    byte_limiter = get_byte_limiter(self._concurrent_gb)
  File "/home/wk/anaconda3/envs/copy2/lib/python3.10/site-packages/orbax/checkpoint/pytree_checkpoint_handler.py", line 169, in get_byte_limiter
    return asyncio.run(_create_byte_limiter())
  File "/home/wk/anaconda3/envs/copy2/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/wk/anaconda3/envs/copy2/lib/python3.10/asyncio/base_events.py", line 641, in run_until_complete
    return future.result()
  File "/home/wk/anaconda3/envs/copy2/lib/python3.10/site-packages/orbax/checkpoint/pytree_checkpoint_handler.py", line 167, in _create_byte_limiter
    return LimitInFlightBytes(concurrent_bytes)  # pylint: disable=protected-access
  File "/home/wk/anaconda3/envs/copy2/lib/python3.10/site-packages/jax/experimental/array_serialization/serialization.py", line 144, in __init__
    self._cv = asyncio.Condition(lock=asyncio.Lock())
  File "/home/wk/anaconda3/envs/copy2/lib/python3.10/asyncio/locks.py", line 234, in __init__
    raise ValueError("loop argument must agree with lock")
ValueError: loop argument must agree with lock

I found a similar issue in the Orbax project. It seems to be a conflict between package versions, but the fix suggested there is to switch to Python 3.11, whereas this project uses 3.10. Could you provide the specific versions of the packages that need to be installed for this project?

I sincerely look forward to your advice and guidance.
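
In case it helps with reproducing this, here is a small sketch for printing the interpreter and package versions in the environment. The distribution names (`flax`, `jax`, `jaxlib`, `orbax-checkpoint`, `orbax`) are my guess based on the paths in the traceback, and `installed_versions` is just a helper name I made up; anything not installed under that name shows up as `None`.

```python
import sys
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed, or installed under a different distribution name.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Full patch version matters here, so print all three components.
    print("python", ".".join(str(part) for part in sys.version_info[:3]))
    # Assumed distribution names; adjust to match the actual environment.
    for name, version in installed_versions(
        ["flax", "jax", "jaxlib", "orbax-checkpoint", "orbax"]
    ).items():
        print(name, version)
```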

DigitalRowlet commented 4 months ago

I think I have solved this problem: just switch the Python version from 3.10.0 to 3.10.12.

conda create -n serl python=3.10.12
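
A quick way to check whether a given interpreter is affected, without loading a checkpoint at all. This is only a sketch: it assumes the failure comes solely from the `asyncio.Condition(lock=asyncio.Lock())` call shown in the last frames of the traceback above, and the function names are made up for illustration.

```python
import asyncio
import sys


async def _make_condition() -> None:
    # Same construction as the failing line in the traceback:
    # a Condition wrapping a freshly created Lock, inside a running loop.
    asyncio.Condition(lock=asyncio.Lock())


def condition_with_lock_works() -> bool:
    """Return True if the interpreter can build a Condition around a new Lock."""
    try:
        asyncio.run(_make_condition())
    except ValueError:
        # e.g. "loop argument must agree with lock"
        return False
    return True


if __name__ == "__main__":
    version = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if condition_with_lock_works() else "affected"
    print(f"Python {version}: {status}")
```

If this reports "affected" in the environment, recreating the environment with a newer 3.10 patch release, as in the conda command above, should be worth trying before changing any package versions.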