rll / rllab

rllab is a framework for developing and evaluating reinforcement learning algorithms, fully compatible with OpenAI Gym.

Can't pickle custom environment with nested class #219

Open ffnc1020 opened 6 years ago

ffnc1020 commented 6 years ago

Hi, I am new to RL, rllab, and Python. I am trying to train a policy for continuous control in a custom environment.

I implemented the environment in my_sim_env.py following the rllab guide (https://rllab.readthedocs.io/en/latest/user/implement_env.html), and the script runs fine.

But when I use pickled mode, because I want to save the results and parameters, it fails after one iteration with an AttributeError followed by a PicklingError:

Traceback (most recent call last):
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 268, in _getattribute
    obj = getattr(obj, subpath)
AttributeError: module 'pydart2.gui.trackball' has no attribute 'c_float_Array_16'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 907, in save_global
    obj2, parent = _getattribute(module, name)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 271, in _getattribute
    .format(name, obj))
AttributeError: Can't get attribute 'c_float_Array_16' on <module 'pydart2.gui.trackball' from '/Users/guy/Workspace/rllab/pydart2/pydart2/gui/trackball.py'>

During handling of the above exception, another exception occurred:
...
more error...
...
File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 911, in save_global
    (obj, module_name, name))
_pickle.PicklingError: Can't pickle <class 'pydart2.gui.trackball.c_float_Array_16'>: it's not found as pydart2.gui.trackball.c_float_Array_16

My code looks like this:

from rllab.algos.trpo import TRPO
from rllab.baselines.linear_feature_baseline import LinearFeatureBaseline
from rllab.envs.normalized_env import normalize
from rllab.misc.instrument import stub, run_experiment_lite
from rllab.policies.gaussian_mlp_policy import GaussianMLPPolicy
import sys

from my_sim_env import MySimEnv

stub(globals())
env = normalize(MySimEnv())
policy = GaussianMLPPolicy(
    env_spec=env.spec,
    hidden_sizes=(32, 32),
)
baseline = LinearFeatureBaseline(env_spec=env.spec)

algo = TRPO(
    env=env,
    policy=policy,
    baseline=baseline,
    batch_size=4000,
    max_path_length=100,
    n_itr=5,
    discount=0.99,
    step_size=0.01,
    # Uncomment both lines (this and the plot parameter below) to enable plotting
    # plot=True,
)

run_experiment_lite(
    algo.train(),
    # Number of parallel workers for sampling
    n_parallel=1,
    # Only keep the snapshot parameters for the last iteration
    snapshot_mode="last",
    # Specifies the seed for the experiment. If this is not provided, a random seed
    # will be used
    seed=1,
    # plot=True,
)

In MySimEnv, I created an instance of a custom class that uses pydart.

Is there a way to make this work without refactoring? Or am I missing something obvious?
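One general Python technique for this kind of failure is to exclude the unpicklable member from the pickled state and rebuild it on load, using `__getstate__` and `__setstate__`. The sketch below is not rllab or pydart code: `SimHandle` and `MySimEnvSketch` are hypothetical stand-ins, and a `threading.Lock` plays the role of the simulator object that pickle rejects. It assumes the simulator can be reconstructed from plain attributes such as `step_size`.

```python
import pickle
import threading


class SimHandle:
    """Hypothetical stand-in for a simulator object that cannot be pickled."""

    def __init__(self, step_size):
        self.step_size = step_size
        self._lock = threading.Lock()  # lock objects are not picklable


class MySimEnvSketch:
    """Hypothetical environment that owns an unpicklable simulator handle."""

    def __init__(self, step_size=0.01):
        self.step_size = step_size
        self.sim = SimHandle(step_size)

    def __getstate__(self):
        # Copy the instance dict and drop the member pickle cannot handle.
        state = self.__dict__.copy()
        del state["sim"]
        return state

    def __setstate__(self, state):
        # Restore the plain attributes, then rebuild the simulator from them.
        self.__dict__.update(state)
        self.sim = SimHandle(self.step_size)


env = MySimEnvSketch(step_size=0.02)
restored = pickle.loads(pickle.dumps(env))
```

The restored object gets a fresh `SimHandle`, so any simulator state that is not captured in plain attributes (current pose, velocities, and so on) would have to be saved and reapplied separately.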

ffnc1020 commented 6 years ago

If I try the new pickled mode like this:

from rllab.algos.trpo import TRPO
from rllab.baselines.linear_feature_baseline import LinearFeatureBaseline
from rllab.envs.normalized_env import normalize
from rllab.misc.instrument import stub, run_experiment_lite
from rllab.policies.gaussian_mlp_policy import GaussianMLPPolicy
import sys

from my_sim_env import MySimEnv

#stub(globals())

def run_task(*_):
    env = normalize(MySimEnv())
    policy = GaussianMLPPolicy(
        env_spec=env.spec,
        hidden_sizes=(32, 32),
    )
    baseline = LinearFeatureBaseline(env_spec=env.spec)

    algo = TRPO(
        env=env,
        policy=policy,
        baseline=baseline,
        batch_size=4000,
        max_path_length=100,
        n_itr=5,
        discount=0.99,
        step_size=0.01,
        # Uncomment both lines (this and the plot parameter below) to enable plotting
        # plot=True,
    )
    algo.train()

run_experiment_lite(
    run_task,
    # Number of parallel workers for sampling
    n_parallel=1,
    # Only keep the snapshot parameters for the last iteration
    snapshot_mode="last",
    # Specifies the seed for the experiment. If this is not provided, a random seed
    # will be used
    seed=1,
    # plot=True,
)

It generates a similar error after the first iteration:

Traceback (most recent call last):
  File "/Users/guy/Workspace/rllab/scripts/run_experiment_lite.py", line 137, in <module>
    run_experiment(sys.argv)
  File "/Users/guy/Workspace/rllab/scripts/run_experiment_lite.py", line 121, in run_experiment
    method_call(variant_data)
  File "trpo_fwmav.py", line 33, in run_task
    algo.train()
  File "/Users/guy/Workspace/rllab/rllab/algos/batch_polopt.py", line 130, in train
    logger.save_itr_params(itr, params)
  File "/Users/guy/Workspace/rllab/rllab/misc/logger.py", line 224, in save_itr_params
    joblib.dump(params, file_name, compress=3)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 480, in dump
    NumpyPickler(f, protocol=protocol).dump(value)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 408, in dump
    self.save(obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 280, in save
    return Pickler.save(self, obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 836, in _batch_setitems
    save(v)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 280, in save
    return Pickler.save(self, obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 623, in save_reduce
    save(state)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 280, in save
    return Pickler.save(self, obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 836, in _batch_setitems
    save(v)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 280, in save
    return Pickler.save(self, obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 623, in save_reduce
    save(state)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 280, in save
    return Pickler.save(self, obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 836, in _batch_setitems
    save(v)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 280, in save
    return Pickler.save(self, obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 740, in save_tuple
    save(element)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 280, in save
    return Pickler.save(self, obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 623, in save_reduce
    save(state)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 280, in save
    return Pickler.save(self, obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 841, in _batch_setitems
    save(v)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 280, in save
    return Pickler.save(self, obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 623, in save_reduce
    save(state)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 280, in save
    return Pickler.save(self, obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 836, in _batch_setitems
    save(v)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 280, in save
    return Pickler.save(self, obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 623, in save_reduce
    save(state)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 280, in save
    return Pickler.save(self, obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 836, in _batch_setitems
    save(v)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 280, in save
    return Pickler.save(self, obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 623, in save_reduce
    save(state)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 280, in save
    return Pickler.save(self, obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 841, in _batch_setitems
    save(v)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/site-packages/joblib/numpy_pickle.py", line 280, in save
    return Pickler.save(self, obj)
  File "/usr/local/miniconda3/envs/rllab3/lib/python3.5/pickle.py", line 495, in save
    rv = reduce(self.proto)
TypeError: can't pickle SwigPyObject objects

Is there any way to save parameters and logs without using pickled mode?
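The traceback above descends through many nested dicts and objects before reaching the SwigPyObject, which makes it hard to tell which attribute is responsible. A small diagnostic can narrow that down by trying to pickle each child on its own. This is a minimal sketch using only the standard library; `find_unpicklable`, `Inner`, and `Outer` are hypothetical names, and a `threading.Lock` stands in for the SWIG handle.

```python
import pickle
import threading


def find_unpicklable(obj, path="root", seen=None):
    """Return paths under obj whose values fail to pickle on their own."""
    seen = set() if seen is None else seen
    if id(obj) in seen:
        return []  # already visited; avoids infinite recursion on cycles
    seen.add(id(obj))
    try:
        pickle.dumps(obj)
        return []  # this subtree pickles fine
    except Exception:
        pass
    # Enumerate children for the container kinds we know how to walk.
    if isinstance(obj, dict):
        children = [("{}[{!r}]".format(path, k), v) for k, v in obj.items()]
    elif isinstance(obj, (list, tuple)):
        children = [("{}[{}]".format(path, i), v) for i, v in enumerate(obj)]
    elif hasattr(obj, "__dict__"):
        children = [("{}.{}".format(path, k), v) for k, v in vars(obj).items()]
    else:
        children = []
    found = []
    for child_path, child in children:
        found.extend(find_unpicklable(child, child_path, seen))
    # If no child is to blame, the object itself is the culprit.
    return found or [path]


class Inner:
    def __init__(self):
        self.handle = threading.Lock()  # stand-in for an unpicklable handle


class Outer:
    def __init__(self):
        self.gain = 1.5
        self.inner = Inner()


params = {"itr": 0, "env": Outer()}
bad_paths = find_unpicklable(params)
print(bad_paths)
```

Running something like this on the dict that is about to be saved would point at the attribute that needs to be dropped or rebuilt, rather than at the whole environment.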

drao2 commented 6 years ago

I'm running into similar issues as well. I'd even be happy to remove pickling/checkpointing entirely if that's possible, but no luck so far.

Did you make any progress on the above?

ffnc1020 commented 6 years ago

I commented out the snapshot-saving call in the algorithm code (for example, line 131 of rllab/algos/batch_polopt.py: logger.save_itr_params(itr, params)) and implemented the logging myself.
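The comment does not show what the replacement logging looked like. One possible shape is to save only the policy's flat parameter vector each iteration, which avoids pickling the environment at all. The sketch below is an assumption, not the poster's code: `PolicySketch`, `save_params`, and `load_params` are hypothetical, and it assumes the real policy exposes `get_param_values()` and `set_param_values()` that work with a flat sequence of numbers.

```python
import json
import os
import tempfile


class PolicySketch:
    """Hypothetical stand-in for a policy with a flat parameter vector."""

    def __init__(self, values):
        self._values = list(values)

    def get_param_values(self):
        return list(self._values)

    def set_param_values(self, values):
        self._values = list(values)


def save_params(policy, itr, log_dir):
    """Write the iteration number and flat parameters as JSON; return the path."""
    path = os.path.join(log_dir, "itr_{}.json".format(itr))
    payload = {
        "itr": itr,
        # float() also converts NumPy scalars into JSON-serializable numbers.
        "params": [float(v) for v in policy.get_param_values()],
    }
    with open(path, "w") as f:
        json.dump(payload, f)
    return path


def load_params(policy, path):
    """Load parameters from a JSON file into policy; return the iteration."""
    with open(path) as f:
        data = json.load(f)
    policy.set_param_values(data["params"])
    return data["itr"]


log_dir = tempfile.mkdtemp()
policy = PolicySketch([0.1, -0.2, 0.3])
saved_path = save_params(policy, 4, log_dir)

fresh = PolicySketch([0.0, 0.0, 0.0])
loaded_itr = load_params(fresh, saved_path)
```

Restoring then means rebuilding the environment and policy from the script as usual and loading the saved vector into the new policy. Baseline and optimizer state are not covered by this sketch.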