Closed: Error323 closed this issue 6 years ago
Awesome. What do you plan to do with the current networks and the current website layout? Will all the current networks be removed from the website, or maybe moved to a new "Old/deprecated networks" area? Will you also automatically upload the newly trained networks as they become available (as it works now), or will you do the full training and then manually select 10-20-50 networks of ascending strength and upload them? Will this retraining be done in private, or will we be able to follow progress and matches as we currently can every time a new network is uploaded and tested by all the clients?
You could also remove unused output channels. What I mean is: NUM_OUTPUT_POLICY = 1924 includes 66 promotion outputs for white and 66 for black. After fixing #236, we won't need the last 66.
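As a rough sketch of the arithmetic (assuming the layout described above, where the final 66 policy entries are the black promotion moves made redundant by the #236 fix):

```python
# Policy head output sizes, per the numbers quoted in this thread.
NUM_OUTPUT_POLICY = 1924    # current policy head size
PROMOTIONS_PER_SIDE = 66    # promotion outputs per colour

# After fixing #236, positions are encoded from the side to move,
# so the second (black) block of promotion outputs is never used.
trimmed_policy_size = NUM_OUTPUT_POLICY - PROMOTIONS_PER_SIDE
print(trimmed_policy_size)
```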
Awesome. What do you plan to do with the current networks and the current website layout? Will all the current networks be removed from the website, or maybe moved to a new "Old/deprecated networks" area?
We'd need to discuss, but I guess it would make sense to initiate a second run, run2: upload each net as it becomes available so that matches are played, and rebuild an Elo curve from that. @glinscott @killerducky, what's your opinion on this?
Will you also automatically upload the newly trained networks as they become available (as it works now), or will you do the full training and then manually select 10-20-50 networks of ascending strength and upload them?
It will be fully automated. The groundwork is already there; we need data conversion and a new ChunkSource that feeds new data into the pipeline as it's added to the directory.
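A minimal sketch of what such a ChunkSource might look like, assuming chunks land as files in a single directory; the class and method names here are illustrative, not the actual pipeline API:

```python
import os
import random

class ChunkSource:
    """Illustrative sketch: serves training chunks from a directory,
    picking up newly added files on every rescan (names hypothetical)."""

    def __init__(self, directory, num_chunks):
        self.directory = directory
        self.num_chunks = num_chunks  # size of the sliding training window

    def latest_chunks(self):
        # The newest num_chunks files form the current training window.
        files = [os.path.join(self.directory, f)
                 for f in os.listdir(self.directory)]
        files.sort(key=os.path.getmtime, reverse=True)
        return files[:self.num_chunks]

    def sample(self):
        # Rescan on each draw so freshly written chunks are included.
        return random.choice(self.latest_chunks())
```

The key idea is simply that the directory is rescanned as training runs, so self-play data uploaded mid-run flows into the shuffle buffer without restarting the trainer.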
Will this retraining be done in private, or will we be able to follow progress and matches as we currently can every time a new network is uploaded and tested by all the clients?
We should do everything as publicly as possible (obviously). The only reason it isn't all public yet is a lack of time getting to know the management/configuration of the new hardware. I'll make sure we have full visibility into TensorBoard and the learning-rate schedules. Neural net uploads are performed at learning-rate boundaries as training progresses. My suggestion would be to try something like the following scheme:
%YAML 1.2
---
name: 'run2-64x6'
gpu: 0

dataset:
  num_chunks: 250000
  train_ratio: 0.90
  input: '/run2/'

training:
  batch_size: 2048
  total_steps: 600000
  shuffle_size: 1048576
  lr_values:
    - 0.2
    - 0.1
    - 0.02
    - 0.01
    - 0.002
    - 0.001
    - 0.0002
    - 0.0001
  lr_boundaries:
    - 75000
    - 150000
    - 225000
    - 300000
    - 375000
    - 450000
    - 525000
  policy_loss_weight: 1.0
  value_loss_weight: 1.0
  path: '/networks/run2'

model:
  filters: 64
  residual_blocks: 6
...
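The lr_values/lr_boundaries pairing defines a piecewise-constant schedule (each value applies until its boundary step, the last value thereafter); a small sketch of how a trainer might resolve the rate at a given step, assuming that convention:

```python
# Piecewise-constant learning-rate schedule from the config above:
# lr_values[i] applies until the step count reaches lr_boundaries[i];
# the final value applies for the remainder of training.
LR_VALUES = [0.2, 0.1, 0.02, 0.01, 0.002, 0.001, 0.0002, 0.0001]
LR_BOUNDARIES = [75000, 150000, 225000, 300000, 375000, 450000, 525000]

def learning_rate(step):
    for boundary, lr in zip(LR_BOUNDARIES, LR_VALUES):
        if step < boundary:
            return lr
    return LR_VALUES[-1]
```

Since nets are uploaded at learning-rate boundaries, this schedule would yield eight uploaded networks over the 600000 total steps.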
We have two computer systems that can perform training in parallel, so another scheme can also be created for a run3 to try new stuff. Suggestions and comments are most welcome.
I'm for keeping the network size and rerunning only once the two bugs are fixed. But I'm not sure about changes to the input/output size: they will break things for the existing lczero.exe, and it's not clear how much of a performance gain there would be. On the other hand, because of the bugs the old lczero.exe won't play well with a new network anyway, so maybe that's fine.
But I expect bugs because we'll forget to change the input/output size somewhere. So I'd still keep the unused input planes and the unused policy entries for now, just to make the change easier.
I agree; I think we should become more conservative given all the YouTube attention now. Let's make sure we only fix bugs #236 and #231 and convert the data. Keep the code diff small so that we minimize the probability of introducing new bugs.
After all existing training data has been converted to V3 to remedy bugs #236 and #231, a full retraining will be performed. This will have several benefits. As all self-play data is already present, it will be much faster than before; expected training time is ~24 hours. The next step would be a forced version upgrade (v0.5) and uploading the new net. I think only after this undertaking should we start looking at bigger networks (128x8, 128x10, ...).