ihavnoid / leelaz-ninenine

A 9x9 fork of leela-zero, which is targeted to provide a learning exercise with limited resources
GNU General Public License v3.0

Will the autotrain command update/generate new weight files #4

Open fantianwen opened 5 years ago

fantianwen commented 5 years ago

Hi, I am trying to train a network for 13x13 Go, and I have already modified some of the configuration.

So I would like to know whether the autotrain command updates an existing weight file or generates a new one, and where that file is placed.

Thank you very much!

ihavnoid commented 5 years ago

See this script: https://github.com/ihavnoid/leelaz-ninenine/blob/master/minitrain.sh

The autotrain script will pick up the initial net when it is placed in the directory training/tf/

fantianwen commented 5 years ago

Hi, ihavnoid:

Thank you very much for your reply. Following the minitrain.sh script, I succeeded in starting training from a random weight file for a 13x13 board, and I got a chunk of training data as a .gz file. My question now is how to update the original weight file from the dumped training data (the .gz file).

Many thanks.

ihavnoid commented 5 years ago

Create chunks of .gz files, and then run TensorFlow. See training/tf/trainpipe.sh for an example. The script runs TensorFlow on the data in leelaz-ninenine/traindata_*, starting from the last created .txt file.
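
The "starting from the last created .txt file" step can be pictured with a small sketch. This is not taken from trainpipe.sh: the helper name `latest_weights_file` and the idea of ranking by modification time are assumptions made for illustration only.

```python
from pathlib import Path
from typing import Optional


def latest_weights_file(directory: str, pattern: str = "*.txt") -> Optional[Path]:
    """Return the most recently modified file matching `pattern`, or None.

    Hypothetical helper: the real trainpipe.sh may select the file differently.
    """
    # Collect regular files only, so directories named *.txt are ignored.
    candidates = [p for p in Path(directory).glob(pattern) if p.is_file()]
    if not candidates:
        return None
    # Rank by modification time; the newest file wins.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Calling `latest_weights_file("training/tf")` would then give the weight file a training run could resume from, assuming the weights are written there as .txt files.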

fantianwen commented 5 years ago

Ok, thank you very much.

I will try it. Thanks for your time. Happy New Year!

buttercutter commented 4 years ago

For this Colab notebook (.ipynb), why does running trainpipe.sh throw the following error?

Restoring from initial_9x9
Traceback (most recent call last):
  File "parse.py", line 330, in <module>
    main(sys.argv[1:])
  File "parse.py", line 318, in main
    tfprocess.restore(restore_file)
  File "/content/leelaz-ninenine/training/tf/tfprocess.py", line 170, in restore
    self.saver.restore(self.session, file)
  File "/tensorflow-1.15.0/python3.6/tensorflow_core/python/training/saver.py", line 1282, in restore
    checkpoint_prefix)
ValueError: The passed save_path is not a valid checkpoint: initial_9x9
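
One way to narrow this down is to check whether anything on disk matches the checkpoint prefix that was passed to restore. TensorFlow 1.x raises this ValueError when it finds neither a `<prefix>.index` file (the newer checkpoint format) nor a file named exactly `<prefix>` (the older format). The sketch below only approximates that check with the standard library; the function name `looks_like_checkpoint` is hypothetical, and whether `initial_9x9` is meant to be a checkpoint or a plain weights file in this repository is not something the traceback shows.

```python
import os


def looks_like_checkpoint(prefix: str) -> bool:
    """Approximate TensorFlow 1.x's 'is this a valid checkpoint prefix' test.

    Hypothetical helper for debugging; it does not import TensorFlow.
    """
    # Newer format: an index file sits next to the data shards.
    if os.path.exists(prefix + ".index"):
        return True
    # Older format: the prefix itself is the checkpoint file.
    return os.path.exists(prefix)


if __name__ == "__main__":
    # Run from the directory where trainpipe.sh invokes parse.py.
    print(looks_like_checkpoint("initial_9x9"))
```

If this prints False from the directory where parse.py runs, the restore call has nothing to load under that name, which matches the error above.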