DLR-RM / stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
https://stable-baselines3.readthedocs.io
MIT License
9.08k stars 1.7k forks source link

Python 3.8: "an integer is required (got type bytes)" when loading models saved under older Python versions #171

Closed ManifoldFR closed 4 years ago

ManifoldFR commented 4 years ago

Describe the bug

There is an issue loading trained agent files in Python 3.8.

Code example

I checked this on the rl-baselines3-zoo repository.

(base) manifold$ python enjoy.py --algo sac --env Pendulum-v0
Loading latest experiment, id=1
Traceback (most recent call last):
  File "enjoy.py", line 225, in <module>
    main()
  File "enjoy.py", line 141, in main
    model = ALGOS[algo].load(model_path, env=env, **kwargs)
  File "/home/manifold/stable-baselines3/stable_baselines3/common/base_class.py", line 561, in load
    data, params, pytorch_variables = load_from_zip_file(path, device=device)
  File "/home/manifold/stable-baselines3/stable_baselines3/common/save_util.py", line 390, in load_from_zip_file
    data = json_to_data(json_data)
  File "/home/manifold/stable-baselines3/stable_baselines3/common/save_util.py", line 164, in json_to_data
    deserialized_object = cloudpickle.loads(base64_object)
TypeError: an integer is required (got type bytes)

System Info Describe the characteristic of your environment:

Additional context

I printed out info and saw that it occurs when loading the data key learning_rate on the SAC example above. For policies trained using recent versions of SB3 it happens for lr_schedule.

ManifoldFR commented 4 years ago

After further investigation, this also applies to clip_range in PPO.

I think the common point is that these can all be callables, right? They were also created using common.utils.constant_fn from what I see in the bytestring.

Miffyli commented 4 years ago

Does the same happen with saving/loading models with stable-baselines3 alone without the files in zoo? The code for saving and loading has changed since those zoo models were trained so it might have broken a thing or two.

ManifoldFR commented 4 years ago

It happens for models trained using a version of SB3 pulled from Git a week ago, albeit with Python 3.7

Miffyli commented 4 years ago

Just to clarify: You train a model with older version of SB3 (week ago), save it, update to current SB3 (master) and loading fails?

araffin commented 4 years ago

Just to clarify: You train a model with older version of SB3 (week ago), save it, update to current SB3 (master) and loading fails?

the issue comes with the pretrained models (using python 3.6) that are available in the zoo.

araffin commented 4 years ago

seems like an issue coming from cloudpickle + python 3.8 https://stackoverflow.com/questions/60267477/typeerror-an-integer-is-required-got-type-bytes-when-importing-pyspark-on-p?noredirect=1&lq=1

Miffyli commented 4 years ago

Hmm either rl-zoo should provide parameters of networks which can be loaded (which then skips training parameters...) or update loading code to catch errors like this and then ask for user to provide that object as part of the custom objects (that override the loaded values), and then rl-zoo provides these values.

ManifoldFR commented 4 years ago

Rolling back to commit 00595b0 was not sufficient to get the rl-zoo examples working again. I think it's indeed a problem with pickling in 3.8. AFAIK it might not be the protocol version because pickle/cloudpickle knows to use the right protocol version given a file or byte sequence.

Miffyli commented 4 years ago

Given that pickle/cloudpickle are not meant for long term usage (the explicitly state that it should not be used for such purpose), I think this kind of errors are expected. Moving completely away from pickling does not make much sense since models can have custom agents, but I think we need to add something to make this kind of situations easier to handle (e.g. like providing the invalid values manually).

ManifoldFR commented 4 years ago

I found that it comes from this PR way, way upstream: https://github.com/python/cpython/pull/12701, referenced here. It breaks the constructor of code objects, which include functions defined from functions and saved using cloudpickle.

Seeing this I agree having a nice fallback would be more elegant until things are sorted out upstream.

pablogsal commented 4 years ago

Seeing this I agree having a nice fallback would be more elegant until things are sorted out upstream.

:hand: Hi, CPython maintainer here. Unfortunately, there is nothing to be done in our side, as the signature of code objects is not considered stable between versions. The constructor is purpously not even documented and we have this advice here:

If you instantiate any of these types, note that signatures may vary between Python versions.

(https://docs.python.org/3/library/types.html#standard-interpreter-types)

You can also use this new convenience method that we added to make the transition easier for people using code objects:

https://docs.python.org/3/library/types.html#types.CodeType.replace

ManifoldFR commented 4 years ago

Thanks for your input, I'll see about contributing a fallback for our specific problem and a more general fix for the cloudpickle dependency, though their loading code seems actually be pickle.load and this isn't really recommended/supported usage of the library.

araffin commented 4 years ago

closing this in favor of #172 (and as it is not an issue coming from SB3)