dgriff777 / rl_a3c_pytorch

A3C LSTM Atari with Pytorch plus A3G design
Apache License 2.0

I just want to say your trained model has no effect #5

Closed lucasjinreal closed 7 years ago

lucasjinreal commented 7 years ago

I tried to evaluate your trained model, but it seems to have no effect:

2017-08-01 21:08:13,757 : reward sum: -21.0, reward mean: -21.0000
[2017-08-01 21:08:13,757] reward sum: -21.0, reward mean: -21.0000
[2017-08-01 21:08:13,787] Starting new video recorder writing to /Volumes/xs/CodeSpace/AISpace/rl_space/rl_a3c_pytorch/Pong-v0_monitor/openaigym.video.0.33472.video000001.mp4
2017-08-01 21:08:24,947 : reward sum: -21.0, reward mean: -21.0000
[2017-08-01 21:08:24,947] reward sum: -21.0, reward mean: -21.0000
2017-08-01 21:08:35,054 : reward sum: -21.0, reward mean: -21.0000
[2017-08-01 21:08:35,054] reward sum: -21.0, reward mean: -21.0000
2017-08-01 21:08:44,732 : reward sum: -21.0, reward mean: -21.0000
[2017-08-01 21:08:44,732] reward sum: -21.0, reward mean: -21.0000

Also, the recorded videos are black-and-white, and nothing is shown on screen.

dgriff777 commented 7 years ago

What OS are you running this on? I just ran it and it works fine for me:

[2017-08-01 11:30:30,111] Making new env: Pong-v0
[2017-08-01 11:30:30,336] Clearing 6 monitor files from previous run (because force=True was provided)
[2017-08-01 11:30:30,345] Starting new video recorder writing to /Users/dgriffis/rl_a3c_pytorch/Pong-v0_monitor/openaigym.video.0.38559.video000000.mp4
2017-08-01 11:30:42,785 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:30:42,785] reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:30:42,804] Starting new video recorder writing to /Users/dgriffis/rl_a3c_pytorch/Pong-v0_monitor/openaigym.video.0.38559.video000001.mp4
2017-08-01 11:30:55,304 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:30:55,304] reward sum: 21.0, reward mean: 21.0000
2017-08-01 11:31:07,255 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:31:07,255] reward sum: 21.0, reward mean: 21.0000
2017-08-01 11:31:19,209 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:31:19,209] reward sum: 21.0, reward mean: 21.0000
2017-08-01 11:31:31,044 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:31:31,044] reward sum: 21.0, reward mean: 21.0000
2017-08-01 11:31:43,474 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:31:43,474] reward sum: 21.0, reward mean: 21.0000
2017-08-01 11:31:55,597 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:31:55,597] reward sum: 21.0, reward mean: 21.0000
2017-08-01 11:32:07,620 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:32:07,620] reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:32:07,628] Starting new video recorder writing to /Users/dgriffis/rl_a3c_pytorch/Pong-v0_monitor/openaigym.video.0.38559.video000008.mp4
2017-08-01 11:32:20,379 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:32:20,379] reward sum: 21.0, reward mean: 21.0000

lucasjinreal commented 7 years ago

I am running macOS. Why do I get -21 all the time? Which command are you using?

lucasjinreal commented 7 years ago

➜  rl_a3c_pytorch git:(master) ✗ python gym_eval.py --env Pong-v0 --num-episodes 100
[2017-08-02 09:06:09,852] Making new env: Pong-v0
[2017-08-02 09:06:10,107] Clearing 6 monitor files from previous run (because force=True was provided)
[2017-08-02 09:06:10,145] Starting new video recorder writing to /Volumes/xs/CodeSpace/AISpace/rl_space/rl_a3c_pytorch/Pong-v0_monitor/openaigym.video.0.35879.video000000.mp4
2017-08-02 09:06:20,499 : reward sum: -21.0, reward mean: -21.0000
[2017-08-02 09:06:20,499] reward sum: -21.0, reward mean: -21.0000
[2017-08-02 09:06:20,529] Starting new video recorder writing to /Volumes/xs/CodeSpace/AISpace/rl_space/rl_a3c_pytorch/Pong-v0_monitor/openaigym.video.0.35879.video000001.mp4
2017-08-02 09:06:30,942 : reward sum: -21.0, reward mean: -21.0000
[2017-08-02 09:06:30,942] reward sum: -21.0, reward mean: -21.0000

I trained all night, but when I stopped it nothing had been saved. I cannot find any saved model.

dgriff777 commented 7 years ago

Well, first I would update the repo, because I tinkered with it a lot over the past couple of days, but I know it's working fine now. Are you seeing the models in the trained_models folder?

dgriff777 commented 7 years ago

Oh, is this your own trained model? Are you seeing a saved model in the folder, or any models at all? There should be a Pong-v0.dat file.

lucasjinreal commented 7 years ago

Yeah, I have seen it, but it seems to be the model you already trained and put in your repo, because besides Pong there are other models. Anyway, how exactly should I load my model and render the env at the same time to watch the AI play?

dgriff777 commented 7 years ago

Well, I have set it up so that models are saved to the trained_models folder and loaded from there. If you want to watch gym_eval play, you have to run:

python gym_eval.py --env Pong-v0 --num-episodes 100 --render True

lucasjinreal commented 7 years ago

Well, I got this old-fashioned black-and-white screen, and the result is still -21. Did the model not update?

dgriff777 commented 7 years ago

That looks like you are having dependency issues with gym.

Can you open a terminal, start python, and type:

import gym
import cv2

env = gym.make('Pong-v0')
frame = env.reset()
cv2.imshow('tt', frame)
cv2.waitKey(0)

Let me know what you see from that..

lucasjinreal commented 7 years ago

Well, weird. I also have gym on Python 3, and it works totally fine there. On Python 2.7 it looks like this, no matter whether I use cv2 or just env.render(). Should I update this code to Python 3?

lucasjinreal commented 7 years ago

There is a problem with saving the model. I changed the default save dir in main.py to trained_models_me, but when I stopped training, my dir had not been created.

dgriff777 commented 7 years ago

You have to create the directory first if you are not using the trained_models folder. I did not set it up to create save directories automatically.
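As a minimal sketch of that step (not code from the repo), the save directory could be created before training starts. The helper name `ensure_dir` is hypothetical, the temporary base directory is only there to keep the example self-contained, and `exist_ok=True` assumes Python 3.2 or newer:

```python
import os
import tempfile


def ensure_dir(path):
    """Create `path` (and any missing parents); do nothing if it exists."""
    # exist_ok=True (Python 3.2+) makes repeated calls harmless.
    os.makedirs(path, exist_ok=True)
    return path


# Illustrative only: a throwaway base directory stands in for the repo root.
base = tempfile.mkdtemp()
save_dir = ensure_dir(os.path.join(base, "trained_models_me"))
print(os.path.isdir(save_dir))
```

On Python 2.7, `os.makedirs` has no `exist_ok` argument, so an explicit `os.path.isdir` check would be needed before calling it.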

dgriff777 commented 7 years ago

Yeah try that same code in python3 and see if pic of Atari screen comes up

lucasjinreal commented 7 years ago

Thanks dgriff, you are a master in reinforcement learning.

dgriff777 commented 7 years ago

It's working now?! You're welcome. Happy to help 😄

lucasjinreal commented 7 years ago

Yeah, thanks a lot for your help, pal.

dgriff777 commented 7 years ago

Awesome! Have fun :thumbsup:

lucasjinreal commented 7 years ago

Hi dgriff, sorry to bother you, but I have one last question. In train.py I can't find the code that saves the model. I am new to PyTorch. Is there a way to store the weights in a specific dir and load them when I run it again?

dgriff777 commented 7 years ago

When running training command you can do:

python main.py --env Pong-v0 --workers 32 --save-dir 'example_folder/'

And to load from specific folder:

python main.py --env Pong-v0 --workers 32  --load-dir 'example_folder/' --load True

You can also specify both in the same command.

dgriff777 commented 7 years ago

Loading code in training is in main.py

    if args.load:
        saved_state = torch.load(
            '{0}{1}.dat'.format(args.load_model_dir, args.env))

Saving model code is in test.py

            if reward_sum > args.save_score_level:
                state_to_save = player.model.state_dict()
                torch.save(state_to_save, '{0}{1}.dat'.format(
                    args.save_model_dir, args.env))

And load model code in gym_eval.py

saved_state = torch.load(
    '{0}{1}.dat'.format(args.load_model_dir, args.env),
    map_location=lambda storage, loc: storage)
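One detail worth noting in the snippets above: the file path is built by plain string formatting, so the directory argument appears to need its trailing slash, as in 'example_folder/'. The sketch below only illustrates that behaviour. The function names are hypothetical, and the `os.path.join` variant is an alternative for comparison, not what the repo does:

```python
import os


def model_path_concat(model_dir, env):
    # Same pattern as the snippets above: directory and file name are
    # simply concatenated, so no separator is added automatically.
    return '{0}{1}.dat'.format(model_dir, env)


def model_path_joined(model_dir, env):
    # Alternative (an assumption, not the repo's code): os.path.join
    # inserts the separator only when it is missing.
    return os.path.join(model_dir, '{0}.dat'.format(env))


print(model_path_concat('example_folder/', 'Pong-v0'))
print(model_path_concat('example_folder', 'Pong-v0'))
print(model_path_joined('example_folder', 'Pong-v0'))
```

With the trailing slash, the first call yields a file inside the folder. Without it, the second call yields a file named after the folder and the env glued together, which would land in the current directory instead.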