Narsil / alphagozero

Unofficial attempt to rebuild AlphaGo Zero
MIT License

Huge number of files created #6

Open brianprichardson opened 6 years ago

brianprichardson commented 6 years ago

It ran for a couple of days and found several new best models. However, it also creates numerous files (502,586 items, totalling 5.6 GB). The models directory is large and the games directory has most of the files. Perhaps zipping would be worthwhile. In any case, I'm happy to restart it again after you have had a chance to make more improvements. Thanks again for sharing.

Narsil commented 6 years ago

Hmm, yes, it does create a lot of files. There is one file for each move of every game of every model.

The benefit is that training only has to parse a model's directory once (which is usually pretty fast) and can then open files only once for each training batch. During training, samples are drawn at random from any move of any game, and random access gets slow quickly once the data is huge.
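To illustrate the pattern described above, here is a minimal sketch, not the repository's actual code. It assumes one file per move stored somewhere under a model's directory. The function names `index_move_files` and `sample_batch` are hypothetical.

```python
import os
import random


def index_move_files(model_dir):
    """Walk a model's directory once and return every move file path."""
    paths = []
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            paths.append(os.path.join(root, name))
    paths.sort()  # stable order regardless of filesystem listing order
    return paths


def sample_batch(paths, batch_size, rng=random):
    """Pick up to batch_size distinct move files uniformly at random."""
    return rng.sample(paths, min(batch_size, len(paths)))
```

The directory listing is paid for once. After that, each batch only opens the handful of files it sampled.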

I could zip past models, as they are no longer used after some point (though DeepMind says they use the last 500k games, which would correspond to 40M files in the current architecture). But it's not really my current focus, as I feel there are still some optimizations that could be done.
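For reference, 40M files over 500k games works out to about 80 move files per game. Zipping a retired model's games could look roughly like the sketch below. It assumes a layout of one subdirectory per model under a games directory, which is a guess and not confirmed by the repository. The helper name `archive_model_games` is hypothetical.

```python
import os
import shutil


def archive_model_games(games_dir, model_name, remove_original=False):
    """Pack games_dir/model_name into games_dir/model_name.zip.

    Returns the path of the archive that was written.
    """
    source = os.path.join(games_dir, model_name)
    # make_archive appends the ".zip" suffix to the base name itself.
    archive_path = shutil.make_archive(source, "zip", root_dir=source)
    if remove_original:
        # Only delete the loose files once the archive has been written.
        shutil.rmtree(source)
    return archive_path
```

This collapses hundreds of thousands of small files into one per model. The cost is that an archived model would have to be unpacked before it could be sampled from again.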

Do you have any other ideas to make it better?

tianshuo commented 6 years ago

Could it be put in a sqlite database?
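A rough sketch of what that might look like, using Python's standard `sqlite3` module. The table layout, column names, and function names are all hypothetical, and each move sample is treated as an opaque blob because the project's on-disk format is not described in this thread.

```python
import random
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a database with one row per move sample."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS moves ("
        " id INTEGER PRIMARY KEY,"
        " model TEXT NOT NULL,"
        " game INTEGER NOT NULL,"
        " move INTEGER NOT NULL,"
        " sample BLOB NOT NULL)"
    )
    return conn


def add_move(conn, model, game, move, sample_bytes):
    """Insert one serialized move sample; the caller commits."""
    conn.execute(
        "INSERT INTO moves (model, game, move, sample) VALUES (?, ?, ?, ?)",
        (model, game, move, sample_bytes),
    )


def sample_batch_from_db(conn, batch_size, rng=random):
    """Return up to batch_size distinct samples chosen uniformly at random."""
    ids = [row[0] for row in conn.execute("SELECT id FROM moves")]
    chosen = rng.sample(ids, min(batch_size, len(ids)))
    if not chosen:
        return []
    placeholders = ",".join("?" * len(chosen))
    rows = conn.execute(
        "SELECT sample FROM moves WHERE id IN (%s)" % placeholders, chosen
    )
    return [row[0] for row in rows]
```

This keeps everything in a single file and makes random lookups by primary key cheap. A single database file is harder to split across machines than a directory tree, though.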

Narsil commented 6 years ago

It could. But for now I won't do it, as I feel a filesystem is the best option since it can be split across machines quite easily. (I'm pondering trying AWS to reach the infamous 0.4s/move claimed by AlphaGo Zero.)