yaserkl / RLSeq2Seq

Deep Reinforcement Learning For Sequence to Sequence Models
https://arxiv.org/abs/1805.09461
MIT License

Replay Buffer isnt Loaded Enough Yet #38

Open Fatman003 opened 4 years ago

Fatman003 commented 4 years ago

While running the Actor-Critic experiment, at the "Pre-Training Critic with fixed Actor" step, the program stops unexpectedly after saying the replay buffer isn't loaded enough yet.

The actual log output is:

W0618 13:04:57.966608 140700510074624 replay_buffer.py:156] Bucket input queue is empty when calling next_batch. Bucket queue size: 0, Input queue size: 0
I0618 13:04:57.967000 140700510074624 run_summarization.py:481] replay buffer not loaded enough yet...
^C

I have looked through the summarization file and searched online, but I really don't know how to fix this. I noticed it also occurs during training with true Q estimates. @yaserkl, can you please help me out?
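For readers unfamiliar with this kind of message: it is typically what a consumer thread prints when it asks for a batch before any producer has put data into the queue. The following is only a minimal sketch of that situation, not the repository's code; `ToyReplayBuffer`, `wait_for_batch`, and `demo` are hypothetical names, and the standard-library `queue` module stands in for whatever queues `replay_buffer.py` actually uses.

```python
import queue
import threading
import time


class ToyReplayBuffer:
    """Hypothetical stand-in for a replay buffer; not the repository's class."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self._items = queue.Queue()

    def add(self, transition):
        self._items.put(transition)

    def next_batch(self):
        # Return None instead of blocking when there is not enough data yet.
        if self._items.qsize() < self.batch_size:
            return None
        return [self._items.get() for _ in range(self.batch_size)]


def wait_for_batch(buffer, timeout_s=5.0, poll_s=0.05):
    """Poll until a full batch is available or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        batch = buffer.next_batch()
        if batch is not None:
            return batch
        time.sleep(poll_s)
    raise TimeoutError("replay buffer not loaded enough within timeout")


def demo():
    buffer = ToyReplayBuffer(batch_size=4)
    first_try = buffer.next_batch()  # the buffer is still empty here

    def producer():
        # Simulates the thread that fills the buffer a little later.
        for i in range(4):
            time.sleep(0.01)
            buffer.add(i)

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()
    batch = wait_for_batch(buffer)
    worker.join()
    return first_try, batch
```

In this sketch the first call returns nothing because the producer has not run yet, and a later call succeeds. If the message in a real run keeps repeating with both queue sizes at 0, that would suggest the producer side is not filling the buffer at all, which is worth checking separately.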

khoaiha12 commented 4 years ago

@Fatman003 I have the same error.

Fatman003 commented 4 years ago

@khoaiha12 I think it might have to do with the GPU allocation. Try editing the command to use only one GPU (one that is available), e.g. gpu_num=0.
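Separately from the gpu_num option mentioned above, a general way to make only one GPU visible to a CUDA-based process is the `CUDA_VISIBLE_DEVICES` environment variable. This is a sketch of that alternative, not something the repository is confirmed to require; `restrict_to_single_gpu` is a hypothetical helper, and it has to run before TensorFlow initializes CUDA to have any effect.

```python
import os


def restrict_to_single_gpu(device_index=0):
    """Expose only one GPU to CUDA libraries loaded later in this process."""
    # Must be set before TensorFlow (or any CUDA library) is imported and used.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(device_index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


visible = restrict_to_single_gpu(0)
```

With this set, the one visible device is numbered 0 inside the process, so it is consistent with passing gpu_num=0.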

khoaiha12 commented 4 years ago

@Fatman003 I tried it, but the error still occurred. I am running on Google Colab, which provides only one GPU per session.

khoaiha12 commented 4 years ago

INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoint to path ./src/logs/actor-critic-ddqn2/train/model.ckpt
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Starting standard services.
INFO:tensorflow:Starting queue runners.
INFO:tensorflow:Saving checkpoint to path ./src/logs/actor-critic-ddqn2/dqn/train/model.ckpt
INFO:tensorflow:current_relay_network/global_step/sec: 0
INFO:tensorflow:Preparing or waiting for session...
INFO:tensorflow:Created session.
INFO:tensorflow:Starting run_training
INFO:tensorflow:Starting DQN training thread...
WARNING:tensorflow:Bucket input queue is empty when calling next_batch. Bucket queue size: 0, Input queue size: 0
INFO:tensorflow:replay buffer not loaded enough yet...
INFO:tensorflow:Starting Seq2Seq training...
INFO:tensorflow:Saving checkpoint to path ./src/logs/actor-critic-ddqn2/train/model.ckpt
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:seconds for dqn collection: 19.2806758881
INFO:tensorflow:Q-values collection time: 43.3507909775
ReplayBatch size: 1546
ReplayBatch example queue size: 1
ReplayBatch batch queue size: 0
INFO:tensorflow:RUNNNING DQN PRETRAIN: Adding data to relplay buffer only...
INFO:tensorflow:Saving checkpoint to path ./src/logs/actor-critic-ddqn2/dqn/train/model.ckpt
WARNING:tensorflow:Bucket input queue is empty when calling next_batch. Bucket queue size: 0, Input queue size: 0
INFO:tensorflow:replay buffer not loaded enough yet...
INFO:tensorflow:seconds for dqn collection: 23.9194500446
INFO:tensorflow:Q-values collection time: 25.8006739616
^C

This is the log from when I trained with true Q estimates.
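To see whether the queues ever fill over a longer run, the warning lines in a log like the one above can be scanned for the sizes they report. This is a minimal sketch, assuming the log was saved as text and the message format matches the lines quoted in this thread; `queue_sizes` is a hypothetical helper name.

```python
import re

# Matches the size report at the end of the "Bucket input queue is empty" warning.
QUEUE_WARNING = re.compile(r"Bucket queue size: (\d+), Input queue size: (\d+)")


def queue_sizes(log_text):
    """Return (bucket_queue_size, input_queue_size) for each warning found."""
    return [(int(b), int(i)) for b, i in QUEUE_WARNING.findall(log_text)]


sample = (
    "WARNING:tensorflow:Bucket input queue is empty when calling next_batch. "
    "Bucket queue size: 0, Input queue size: 0\n"
    "INFO:tensorflow:replay buffer not loaded enough yet...\n"
)
sizes = queue_sizes(sample)
```

If every entry stays at zero across the whole run, the buffer is probably never being filled, rather than just being slow to fill.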

Fatman003 commented 4 years ago

Did you change your GPU allocation? I didn't have this issue when I changed the gpu_num to 0 and used my school cluster. It retrained but it might be an issue with Colab.