StanfordVL / GibsonEnv

Gibson Environments: Real-World Perception for Embodied Agents
http://gibsonenv.stanford.edu/
MIT License

The issue of enjoy_husky_navigate after it is trained #99

Open Berk035 opened 5 years ago

Berk035 commented 5 years ago

Hello everyone,

I am working through the Gibson navigation examples and trying to reach a target position, but something is going wrong. enjoy_husky_navigate_ppo1.py fails when I try to run my trained model. The error is shown below:

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/home/deepsrv/anaconda3/envs/py35/lib/python3.5/site-packages/gym/utils/closer.py", line 67, in close
    closeable.close()
  File "/home/deepsrv/PycharmProjects/Gibson_Env/gibson/envs/env_modalities.py", line 490, in _close
    self.r_camera_rgb._close()
AttributeError: 'NoneType' object has no attribute '_close'

I also hit this error in the fuse_policy function. My model uses 3000 timesteps per actorbatch. How can I handle this error?

Thanks in advance.

Berk035 commented 5 years ago

I fixed the problem above. It was caused by CUDA running out of memory. After I closed some other programs, it ran properly.

Unfortunately, I now have a different problem when running the model with enjoy_husky_navigate. This time I get the following error:

killing <subprocess.Popen object at 0x7fe1439ed8d0>
  File "/home/deepsrv/PycharmProjects/Gibson_Env/examples/train/enjoy_husky_navigate_ppo1.py", line 101, in <module>
  File "/home/deepsrv/PycharmProjects/Gibson_Env/examples/train/enjoy_husky_navigate_ppo1.py", line 88, in main
  File "/home/deepsrv/PycharmProjects/Gibson_Env/examples/train/enjoy_husky_navigate_ppo1.py", line 73, in train
  File "/home/deepsrv/PycharmProjects/Gibson_Env/gibson/utils/pposgd_simple.py", line 378, in enjoy
  File "/home/deepsrv/anaconda3/envs/py35/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1560, in restore
  File "/home/deepsrv/anaconda3/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 895, in run
  File "/home/deepsrv/anaconda3/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1124, in _run
  File "/home/deepsrv/anaconda3/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
  File "/home/deepsrv/anaconda3/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
NotFoundError: Key pi/polfc1/kernel not found in checkpoint
  [[Node: save/RestoreV2_19 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_19/tensor_names, save/RestoreV2_19/shape_and_slices)]]

What is wrong this time?
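
A "Key ... not found in checkpoint" error generally means the graph being restored asks for a variable name that the saved file does not contain. One way to narrow this down is to compare the two sets of names. The snippet below is only a rough sketch of that comparison in plain Python: the helper diff_checkpoint_keys and both name lists are hypothetical, made up for illustration (only pi/polfc1/kernel comes from the error above). In TensorFlow 1.x the real lists could probably be collected with tf.global_variables() and tf.train.NewCheckpointReader(path).get_variable_to_shape_map(), but that part is an assumption and is not shown here.

```python
def diff_checkpoint_keys(graph_keys, checkpoint_keys):
    """Compare variable names expected by the graph with names stored in a checkpoint.

    Returns (missing, unused):
      missing - names the graph wants but the checkpoint does not contain
      unused  - names stored in the checkpoint that the graph never asks for
    """
    graph_keys = set(graph_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(graph_keys - checkpoint_keys)
    unused = sorted(checkpoint_keys - graph_keys)
    return missing, unused


# Hypothetical name lists for illustration only. Apart from
# "pi/polfc1/kernel" (taken from the error message), these names are invented.
graph_keys = ["pi/polfc1/kernel", "pi/polfc1/bias", "pi/logstd"]
checkpoint_keys = ["pi/fc1/kernel", "pi/fc1/bias", "pi/logstd"]

missing, unused = diff_checkpoint_keys(graph_keys, checkpoint_keys)
print("missing from checkpoint:", missing)
print("unused in checkpoint:", unused)
```

If both lists come back non-empty, the model was most likely saved with a different policy or mode than the one being built at restore time.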

Berk035 commented 5 years ago

I realized that the training mode was different, which caused this error. I have another question, though: how can I continue training from a saved model file? Is this the right code?

    if reload_name:
        saver = tf.train.Saver()
        saver.restore(tf.get_default_session(), reload_name)

Thanks in advance.