When "file_name + .zip" exists, should "model.n_steps" be ep_length // 8, as small as when it does not exist?

PWhiddy / PokemonRedExperiments

Playing Pokemon Red with Reinforcement Learning

MIT License


When "file_name + .zip" exists, should "model.n_steps" be ep_length // 8, as small as when it does not exist? #148

Closed Garbage123King closed 6 months ago

Garbage123King commented 11 months ago

With the default setting num_cpu = 16, I ran out of my 40 GB of RAM and the process was killed by the system.

sudo cat /var/log/syslog | grep -i "killed"

kernel: [ 1522.255350] Out of memory: Killed process 15384 (python) total-vm:54111924kB, anon-rss:35495356kB, file-rss:72320kB, shmem-rss:14336kB, UID:0 pgtables:76780kB oom_score_adj:0

Garbage123King commented 11 months ago

I just found that if I start with a new folder, I use less memory, because training runs every 2.5k steps. But if I start from an existing folder, I use 50+ GB of memory by the last moment before training, because training only runs every 20480 steps.
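To put rough numbers on the difference described above, here is a back-of-the-envelope sketch of how much memory the stored rollout observations alone could take for the two rollout lengths. The observation shape and the 4 bytes per value are assumptions chosen for illustration and are not taken from this thread; only ep_length = 20480, ep_length // 8 and num_cpu = 16 come from the comments here.

```python
import math


def rollout_obs_bytes(n_steps, n_envs, obs_shape, bytes_per_value=4):
    """Approximate bytes needed to hold the observations of one rollout.

    One observation is stored per step and per environment, so the total
    grows linearly with n_steps.
    """
    return n_steps * n_envs * math.prod(obs_shape) * bytes_per_value


ep_length = 20480          # "training every 20480 steps" when resuming
num_cpu = 16               # default setting mentioned above
obs_shape = (128, 40, 3)   # ASSUMPTION: illustrative image observation shape

fresh = rollout_obs_bytes(ep_length // 8, num_cpu, obs_shape)
resumed = rollout_obs_bytes(ep_length, num_cpu, obs_shape)

print(f"fresh run:   {fresh / 2**30:.2f} GiB")    # 2.34 GiB
print(f"resumed run: {resumed / 2**30:.2f} GiB")  # 18.75 GiB
print(f"ratio:       {resumed / fresh:.0f}x")     # 8x
```

Whatever the real observation size is, the ratio stays at 8x, because only n_steps differs between the two cases. Any extra copies made while training on the collected rollout would come on top of this estimate.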

Garbage123King commented 11 months ago

file_name = 'session_e41c9eff/poke_38207488_steps'

if exists(file_name + '.zip'):
    print('\nloading checkpoint')
    model = PPO.load(file_name, env=env)
    model.n_steps = ep_length  # Should this be ep_length // 8? Or is ep_length on purpose?
    model.n_envs = num_cpu
    model.rollout_buffer.buffer_size = ep_length
    model.rollout_buffer.n_envs = num_cpu
    model.rollout_buffer.reset()
else:
    model = PPO(
        'CnnPolicy', env, verbose=1, n_steps=ep_length // 8, batch_size=128,
        n_epochs=3, gamma=0.998, tensorboard_log=sess_path,
    )
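Whether the larger value on resume is intentional is exactly the open question here. If it turns out not to be, one possible way to keep both branches in step is to derive the rollout length once and apply it through a small helper. The sketch below only repeats the attribute assignments from the snippet above. The helper name apply_rollout_length and the FakeModel and FakeBuffer stand-ins are hypothetical and exist only so the example runs without loading a real checkpoint.

```python
class FakeBuffer:
    """Hypothetical stand-in for a rollout buffer (not the real library class)."""

    def __init__(self):
        self.buffer_size = 0
        self.n_envs = 0
        self.reset_calls = 0

    def reset(self):
        self.reset_calls += 1


class FakeModel:
    """Hypothetical stand-in for a loaded model (not the real library class)."""

    def __init__(self):
        self.n_steps = 0
        self.n_envs = 0
        self.rollout_buffer = FakeBuffer()


def apply_rollout_length(model, n_steps, n_envs):
    """Set the model and its rollout buffer to the same rollout length."""
    model.n_steps = n_steps
    model.n_envs = n_envs
    model.rollout_buffer.buffer_size = n_steps
    model.rollout_buffer.n_envs = n_envs
    model.rollout_buffer.reset()  # re-allocate storage for the new size


ep_length = 20480
num_cpu = 16
n_steps = ep_length // 8   # the same value the fresh-start branch passes to PPO

model = FakeModel()        # in the real script this would be the loaded model
apply_rollout_length(model, n_steps, num_cpu)
print(model.n_steps, model.rollout_buffer.buffer_size)
```

With the value computed in one place, the resume branch and the fresh-start branch cannot drift apart by accident. If the longer rollout on resume is deliberate, passing ep_length instead keeps the current behaviour.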