Closed schrum2 closed 5 years ago
Made a commit that just saves a model for every member of the population. Of course, this commit is far from the goal of saving only "champion" agents. Another point of concern is that the saved .pt file is around 83 MB.
I think that currently the code saves one model for every "parent" but then overwrites these models with each "child" it evaluates. We should only save the parents and not save the children.
Also, expanding this issue a bit: we need an option called --allow-resume. If models have already been saved in the save directory, then instead of starting evolution from scratch, all of the models are loaded, their weights are extracted to become the genomes of the population, and evolution resumes from where it left off.
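A rough sketch of how --allow-resume could discover the point to resume from. The function name `find_resume_point` and the "genN" directory layout are assumptions, not code that exists in the repo; the actual weight extraction would go through `torch.load` and each model's `state_dict`, which is only noted in a comment here:

```python
import re
from pathlib import Path

def find_resume_point(save_dir):
    """Return (latest_gen, model_paths) for the newest genN subdir,
    or (None, []) if there is nothing to resume from."""
    gen_dirs = []
    for d in Path(save_dir).glob("gen*"):
        m = re.fullmatch(r"gen(\d+)", d.name)
        if d.is_dir() and m:
            gen_dirs.append((int(m.group(1)), d))
    if not gen_dirs:
        return None, []
    latest_gen, latest_dir = max(gen_dirs)
    # Each .pt file would then be loaded with torch.load and its weights
    # flattened back into a genome; only the file discovery is sketched here.
    return latest_gen, sorted(latest_dir.glob("*.pt"))
```

If `find_resume_point` returns `(None, [])`, evolution starts from scratch as before; otherwise the returned paths seed the initial population.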
I anticipate this being a complicated issue ... in fact, I want you to create a separate branch dev_save to handle it.
Also, at the end of each generation, when the models are definitively saved, the models from the previous generation should be deleted (otherwise, the disk will surely run out of space).
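The deletion step could be as small as this. A hedged sketch: `prune_previous_generation` is a hypothetical helper, and the "genN" directory names assume the per-generation layout proposed in the next comment:

```python
import shutil
from pathlib import Path

def prune_previous_generation(save_dir, current_gen):
    """Delete the gen(current_gen - 1) directory once generation
    current_gen has been fully written to disk."""
    prev = Path(save_dir) / f"gen{current_gen - 1}"
    if current_gen > 0 and prev.is_dir():
        shutil.rmtree(prev)
```

Calling this only after the current generation is safely saved means a crash mid-save never leaves us with zero generations on disk.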
Another comment: save each generation in a separate directory: "gen0", "gen1", etc.
These files should be saved within the subdirs described in #36 . Example:
2019-07-09-GreenHillZone.Act1-lamarck/gen0
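The layout above could be wired up like this. Everything here is hypothetical (`save_generation`, the `agentN.pt` file names, and the injected `save_fn`); in practice `save_fn` would just be `torch.save`, injected here so the directory logic can be tested without PyTorch:

```python
from pathlib import Path

def save_generation(run_dir, gen, models, save_fn):
    """Write each parent's model into run_dir/genN/agentN.pt and
    return the paths written."""
    gen_dir = Path(run_dir) / f"gen{gen}"
    gen_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, model in enumerate(models):
        p = gen_dir / f"agent{i}.pt"
        save_fn(model, p)  # torch.save(model, p) in the real code
        paths.append(p)
    return paths
```

Only parents would be passed in, which also resolves the parent/child overwriting problem noted above.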
This code still overwrites the parents with the children, but that's only because evaluate_population
is being called once more. This is an easy fix and could be done as simply as passing a parameter.
At the end of every generation we should do two things:

1) Output a single line to a log file indicating the following: generation number, average fitness of the population, maximum (champion) fitness of the population, average novelty score of the population, maximum novelty score of the population.
2) Save the weights of the champion agent. However, instead of simply saving the raw numbers themselves, it would probably be more useful to load the champion weights back into the PPO network, and then save the network model. The benefit of doing this is that we already have code that can load a PPO model and execute it in PyTorch.
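The log line in point 1 could look something like this. The function name, tab-separated format, and column order are all assumptions, and the champion-saving step in point 2 is only noted in a comment since it needs the actual PPO network class:

```python
def log_generation(log_path, gen, fitnesses, novelties):
    """Append one tab-separated line per generation:
    gen, avg fitness, max fitness, avg novelty, max novelty."""
    line = (f"{gen}\t"
            f"{sum(fitnesses) / len(fitnesses):.4f}\t{max(fitnesses):.4f}\t"
            f"{sum(novelties) / len(novelties):.4f}\t{max(novelties):.4f}\n")
    with open(log_path, "a") as f:
        f.write(line)
    # Point 2 would go here: copy the champion genome into the PPO network's
    # state_dict, then torch.save the whole model so existing loading code works.
    return line
```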