hill-a / stable-baselines

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
http://stable-baselines.readthedocs.io/
MIT License
4.14k stars 723 forks source link

TensorboardWriter keeps files open #501

Open iinc opened 5 years ago

iinc commented 5 years ago

I am training an A2C agent and I want to frequently save the model.

The issue I am having is that too many tensorboard files are being opened and never closed. This causes the program to crash as it hits the limit set by my OS.

The A2C agent sets up a TensorboardWriter using a with statement. I would expect the resource would be cleaned up after the with statement is exited.

https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/a2c/a2c.py#L225

But in the __exit__ method, the tensorboard file is not closed. https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/base_class.py#L1056

from stable_baselines import A2C

model = A2C(...)
while True:
    model.learn(total_timesteps=1000, reset_num_timesteps=False, tb_log_name="train")
    model.save(os.path.join("models", str(model.num_timesteps)))
Miffyli commented 5 years ago

A cleaner way to store mode periodically is to use callbacks, much like in this example. This way you avoid recreating many things (including the TensorboardWriter). Granted, this issue could be fixed and document should highlight the designed way of saving models periodically.

ronaldosvieira commented 4 years ago

Had this issue too. Adding the missing self.writer.close() in TensorboardWriter's __exit__ method worked for me, which seems fair. Is there any reason not to do that?

Code is:

def __exit__(self, exc_type, exc_val, exc_tb):
        if self.writer is not None:
            self.writer.add_graph(self.graph)
            self.writer.flush()
            self.writer.close()
shwang commented 4 years ago

This is a stretch because I don't know if this is even technically possible given the current data structure, but maybe one reason the close() isn't added is because the original coders wanted the same writer to be accessible later, and because instantiating a new wrapper rather than reusing the old one would necessarily overwrite or create a new TF events file.

araffin commented 4 years ago

isn't added is because the original coders wanted the same writer to be accessible later,

@hill-a could you comment on that?