matthieu637 / ddrl

Deep Developmental Reinforcement Learning
MIT License

np.ascontiguousarray error #6

Closed: huangjiancong1 closed this issue 5 years ago

huangjiancong1 commented 5 years ago

When I train PeNFAC on FetchReach observations, I get the error below:

(clustering) jim@jim-Inspiron-7577:~/ddrl/gym $ python run_robot.py 
{'load': None, 'save_best': False, 'config': 'config_fetch.ini', 'capture': False, 'test_only': False, 'render': False, 'view': False}
State space: [ 1.34184371e+00  7.49100477e-01  5.34717228e-01  1.89027457e-04
  7.77191143e-05  3.43749435e-06 -1.26100357e-08 -9.04671899e-08
  4.55387076e-06 -2.13287826e-06]
Action space: Box(4,)
- low: [-1. -1. -1. -1.]
- high: [1. 1. 1. 1.]
Create agent with (nb_motors, nb_sensors) :  4 10
main algo : PeNFAC(lambda)-V
episode 0 total steps 0 last perf 0
Traceback (most recent call last):
  File "run_robot.py", line 203, in <module>
    sample_step = train(env, ag, episode)
  File "run_robot.py", line 135, in train
    _, tr, _, _sample_steps = run_episode(env, ag, True, episode)
  File "run_robot.py", line 79, in run_episode
    action = ag.run(reward, observation, learning, False, False)
  File "/home/jim/ddrl/gym/agent.py", line 40, in run
    return lib.OfflineCaclaAg_run(self.obj, reward, np.ascontiguousarray(state, np.float64), learning, goal , last)
  File "/home/jim/anaconda2/envs/clustering/lib/python3.5/site-packages/numpy/core/numeric.py", line 632, in ascontiguousarray
    return array(a, dtype, copy=False, order='C', ndmin=1)
TypeError: float() argument must be a string or a number, not 'dict'

But the same call seems to succeed when I evaluate it in the debugger:

(clustering) jim@jim-Inspiron-7577:~/ddrl/gym $ python run_robot.py 
{'render': False, 'capture': False, 'view': False, 'save_best': False, 'load': None, 'config': 'config_fetch.ini', 'test_only': False}
State space: [ 1.34184371e+00  7.49100477e-01  5.34717228e-01  1.89027457e-04
  7.77191143e-05  3.43749435e-06 -1.26100357e-08 -9.04671899e-08
  4.55387076e-06 -2.13287826e-06]
Action space: Box(4,)
- low: [-1. -1. -1. -1.]
- high: [1. 1. 1. 1.]
Create agent with (nb_motors, nb_sensors) :  4 10
main algo : PeNFAC(lambda)-V
episode 0 total steps 0 last perf 0
> /home/jim/ddrl/gym/agent.py(40)run()
-> return lib.OfflineCaclaAg_run(self.obj, reward, np.ascontiguousarray(state, np.float64), learning, goal , last)
(Pdb) l
 35                     lib.OfflineCaclaAg_unique_invoke(self.obj, len(argv), select)
 36                     lib.OfflineCaclaAg_run.restype = ndpointer(ctypes.c_double, shape=(nb_motors,))
 37     
 38                 def run(self, reward, state, learning, goal, last):
 39                     pdb.set_trace()
 40  ->                 return lib.OfflineCaclaAg_run(self.obj, reward, np.ascontiguousarray(state, np.float64), learning, goal , last)
 41     
 42                 def start_ep(self, state, learning):
 43                     lib.OfflineCaclaAg_start_episode(self.obj, np.ascontiguousarray(state, np.float64), learning)
 44     
 45                 def end_ep(self, learning):
(Pdb) p lib.OfflineCaclaAg_run(self.obj, reward, np.ascontiguousarray(state, np.float64), learning, goal , last)
array([-0.23061481, -0.06233114,  0.51097662, -0.33029073])
(Pdb) s
--Call--
> /home/jim/anaconda2/envs/clustering/lib/python3.5/site-packages/numpy/core/numeric.py(594)ascontiguousarray()
-> @set_module('numpy')
(Pdb) l
589     
590         """
591         return array(a, dtype, copy=False, order=order, subok=True)
592     
593     
594  -> @set_module('numpy')
595     def ascontiguousarray(a, dtype=None):
596         """
597         Return a contiguous array (ndim >= 1) in memory (C order).
598     
599         Parameters
(Pdb) n
> /home/jim/anaconda2/envs/clustering/lib/python3.5/site-packages/numpy/core/numeric.py(632)ascontiguousarray()
-> return array(a, dtype, copy=False, order='C', ndmin=1)
(Pdb) l
627     
628         Note: This function returns an array with at least one-dimension (1-d)
629         so it will not preserve 0-d arrays.
630     
631         """
632  ->     return array(a, dtype, copy=False, order='C', ndmin=1)
633     
634     
635     @set_module('numpy')
636     def asfortranarray(a, dtype=None):
637         """
(Pdb) p array(a, dtype, copy=False, order='C', ndmin=1)
array([ 1.34184371e+00,  7.49100477e-01,  5.34717228e-01,  1.89027457e-04,
        7.77191143e-05,  3.43749435e-06, -1.26100357e-08, -9.04671899e-08,
        4.55387076e-06, -2.13287826e-06])
(Pdb) 

matthieu637 commented 5 years ago

As mentioned in this blog post, you need to transform the dict observation into a flat array: https://openai.com/blog/ingredients-for-robotics-research/.

I pushed a new commit that does this automatically: 80c4f6669605e6ce5b876a3219fbb470e8f6b1e1. With this version you only need to change the env_name inside config.ini.
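
For anyone doing the flattening by hand, here is a minimal sketch of the idea. It is not the code from the commit above. It assumes the goal-based Fetch environments return a dict with the keys `observation`, `achieved_goal` and `desired_goal`; the helper name `flatten_observation` and the default choice of keys are hypothetical.

```python
import numpy as np


def flatten_observation(obs, keys=("observation", "desired_goal")):
    """Return a 1-D contiguous float64 array from a flat or dict observation.

    Hypothetical helper: `keys` selects which dict entries to concatenate,
    in order. Non-dict observations are passed through unchanged in content.
    """
    if isinstance(obs, dict):
        parts = [np.asarray(obs[k], dtype=np.float64).ravel() for k in keys]
        obs = np.concatenate(parts)
    return np.ascontiguousarray(obs, dtype=np.float64)


# Stand-in for what a goal-based environment might return (values are made up).
fake_obs = {
    "observation": np.zeros(10),
    "achieved_goal": np.zeros(3),
    "desired_goal": np.ones(3),
}

flat = flatten_observation(fake_obs)
print(flat.shape)  # (13,)
```

The flattened array would then be passed to `ag.run(...)` in place of the raw dict. Note that the number of sensors given to the agent has to match the flattened length: 13 with the keys above, or 10 if only `observation` is kept.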