bthananjeyan / saved-rl

Implementation of safety augmented value estimation from demonstrations (SAVED)
https://sites.google.com/view/safetyaugmentedvalueestimation/
MIT License

ValueError: Must provide a value_target function. #1

Open dogged1021 opened 3 years ago

dogged1021 commented 3 years ago

Hi, I am new to imitation learning and very interested in the density model of safe states. First, I ran `scripts/mbexp.py` with the argument `-env reachersparse` and got `model.mat`, `model.nns`, and the other files. However, when I ran `scripts/render.py` with the arguments `-env reachersparse -model-dir ./log/2021-7-13 -logdir ./log`, I got this error:

Traceback (most recent call last):
  File "/home/frankchen/code/mujoco/saved-rl/scripts/render.py", line 66, in <module>
    main(args.env, "MPC", args.ctrl_arg, args.override, args.model_dir, args.logdir)
  File "/home/frankchen/code/mujoco/saved-rl/scripts/render.py", line 48, in main
    exp = MBExperiment(cfg.exp_cfg)
  File "/home/frankchen/code/mujoco/saved-rl/dmbrl/misc/MBExp.py", line 141, in __init__
    self.target = get_required_argument(params.exp_cfg, "value_target", "Must provide a value_target function.")
  File "/home/frankchen/code/mujoco/saved-rl/dmbrl/misc/DotmapUtils.py", line 9, in get_required_argument
    raise ValueError(message)
ValueError: Must provide a value_target function.
Created an MPC controller, prop mode TSinf, 20 particles. 
Controller is logging particle predictions (Note: This may be memory-intensive).

Process finished with exit code 1

(In fact, the message in the source is "Must provide a value function." — see line 140 of `dmbrl/misc/MBExp.py` — so maybe that string is a typo in the code.)

I don't know how or where to provide `value_target`, nor where it is defined. Could you please help me figure out this problem? Thanks a lot!
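From the traceback, the failure is just a missing-key check: `MBExperiment.__init__` calls `get_required_argument(params.exp_cfg, "value_target", ...)`, and `dmbrl/misc/DotmapUtils.py` raises `ValueError(message)` when the key is absent. Here is a minimal sketch of that check (reconstructed from the traceback, not copied from the repo, and using a plain dict in place of the repo's DotMap):

```python
def get_required_argument(cfg, key, message, default=None):
    """Sketch of dmbrl/misc/DotmapUtils.get_required_argument:
    return cfg[key] if present, otherwise raise ValueError(message)."""
    val = cfg.get(key, default)
    if val is None:
        raise ValueError(message)
    return val

# Without a value_target entry, construction fails exactly as in the traceback:
try:
    get_required_argument({}, "value_target", "Must provide a value_target function.")
except ValueError as e:
    print(e)  # prints: Must provide a value_target function.

# Supplying any value-function object under that key satisfies the check:
exp_cfg = {"value_target": lambda obs: 0.0}  # hypothetical placeholder
print(get_required_argument(exp_cfg, "value_target", "unused"))
```

So the fix presumably is to make sure the config that `render.py` builds puts a value function under `exp_cfg.value_target`, the same way `mbexp.py` does.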

And another question: how is the initial density model of demonstrations trained? I cannot find the code for it. I'd appreciate it if you could tell me about that!

dogged1021 commented 3 years ago

And here is my config:

{'ctrl_cfg': {'alpha_thresh': 15,
              'beta_thresh': None,
              'env': <dmbrl.env.reachersparse.ReacherSparse3DEnv object at 0x7f5e8286fb90>,
              'gym_robotics': False,
              'has_constraints': False,
              'log_cfg': {},
              'opt_cfg': {'ac_cost_fn': <function ReacherSparseConfigModule.ac_cost_fn at 0x7f5e82876e60>,
                          'cfg': {'alpha': 0.1,
                                  'max_iters': 5,
                                  'num_elites': 40,
                                  'popsize': 400},
                          'mode': 'CEM',
                          'obs_cost_fn': <bound method ReacherSparseConfigModule.obs_cost_fn of <reachersparse.ReacherSparseConfigModule object at 0x7f5e8286f850>>,
                          'plan_hor': 25},
              'prop_cfg': {'mode': 'TSinf',
                           'model_init_cfg': {'model_class': <class 'dmbrl.modeling.models.BNN.BNN'>,
                                              'model_constructor': <bound method ReacherSparseConfigModule.nn_constructor of <reachersparse.ReacherSparseConfigModule object at 0x7f5e8286f850>>,
                                              'num_nets': 5},
                           'model_train_cfg': {'epochs': 5},
                           'npart': 20,
                           'obs_postproc': <function ReacherSparseConfigModule.obs_postproc at 0x7f5e82876c20>,
                           'targ_proc': <function ReacherSparseConfigModule.targ_proc at 0x7f5e82876cb0>},
              'target_value_func': <dmbrl.values.Value.DeepValueFunction object at 0x7f5e8286f650>,
              'update_fns': [<bound method ReacherSparseConfigModule.update_goal of <reachersparse.ReacherSparseConfigModule object at 0x7f5e8286f850>>],
              'use_value': True,
              'value_func': <dmbrl.values.Value.DeepValueFunction object at 0x7f5e8286f790>},
 'exp_cfg': {'exp_cfg': {'demo_high_cost': 300,
                         'demo_load_path': '/home/frankchen/code/mujoco/saved-rl/experts/reachersparse/expert4/logs.mat',
                         'demo_low_cost': 70,
                         'gym_robotics': False,
                         'load_samples': True,
                         'nrollouts_per_iter': 1,
                         'ntrain_iters': 100,
                         'num_demos': 20,
                         'policy': <dmbrl.controllers.MPC.MPC object at 0x7f5e7c3b8090>,
                         'ss_buffer_size': 20000,
                         'use_value': True,
                         'value': <dmbrl.values.Value.DeepValueFunction object at 0x7f5e8286f790>,
                         'value_target': <dmbrl.values.Value.DeepValueFunction object at 0x7f5e8286f650>},
             'log_cfg': {'logdir': 'log', 'nrecord': 0},
             'sim_cfg': {'env': <dmbrl.env.reachersparse.ReacherSparse3DEnv object at 0x7f5e8286fb90>,
                         'task_hor': 100}},
 'val_cfg': {'env': <dmbrl.env.reachersparse.ReacherSparse3DEnv object at 0x7f5e8286fb90>,
             'gym_robotics': False,
             'log_cfg': {},
             'model_init_cfg_val': {'model_class': <class 'dmbrl.modeling.models.BNN.BNN'>,
                                    'model_constructor': <bound method ReacherSparseConfigModule.value_nn_constructor of <reachersparse.ReacherSparseConfigModule object at 0x7f5e8286f850>>,
                                    'num_nets': 5},
             'model_train_cfg': {'epochs': 5},
             'obs_postproc': <function ReacherSparseConfigModule.obs_postproc at 0x7f5e82876c20>,
             'opt_cfg': {},
             'prop_cfg': {},
             'targ_proc': <function ReacherSparseConfigModule.targ_proc at 0x7f5e82876cb0>,
             'update_fns': [<bound method ReacherSparseConfigModule.update_goal of <reachersparse.ReacherSparseConfigModule object at 0x7f5e8286f850>>],
             'val_buffer_size': 1000}}