Open Vandertic opened 4 years ago
Yes, I also think it stalls.
If old nets beat the newest one with winrate>60%, it obviously stalls.
Matches of 50, 40 or 30 games are all too small to get the true winrate between two nets of similar strength. E.g. sai52 beat sai51 (65.45%) in a 50-game match, but sai51 beat sai52 (58.82%) in a 30-game match.
Yes, the smaller match tests whether the new net is terrible, but we need a bigger match to test whether it has stalled.
l1t1 wrote on Sunday, 3 November 2019, at 14:38:
> Matches of 50, 40 or 30 games are all too small to get the true winrate between two nets of similar strength. E.g. sai52 beat sai51 (65.45%) in a 50-game match, but sai51 beat sai52 (58.82%) in a 30-game match.
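
To make the "too small" point concrete, here is a minimal sketch that puts a Wilson score interval around each quoted winrate. It uses the percentages and nominal match sizes exactly as quoted above (the exact game counts are not given in the thread), and the 95% level is my own choice, not something the project uses.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a winrate; z = 1.96 is roughly 95% coverage."""
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half

# Winrates quoted in the thread, both from sai52's point of view.
matches = [("50-game match", 0.6545, 50), ("30-game match", 1 - 0.5882, 30)]
for label, winrate, games in matches:
    low, high = wilson_interval(winrate * games, games)
    print(f"{label}: observed {winrate:.1%}, plausible range {low:.1%} to {high:.1%}")
```

Running the same function with a few hundred games shows how much narrower the range becomes, which is the argument for a bigger match when testing for a stall.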
Visits increased to 1200
How do you judge whether it has stalled? How do you decide whether to increase visits or to enlarge the net?
There's no general rule. We try to understand what's happening and make decisions accordingly
Did the number of training games per generation increase after training with 1200 visits?
No, there was a problem with the script. Corrected this morning.
I suggest setting the number of training games to a constant times the promotion winrate. E.g. set sai 55's games to 20000 × 0.33 = 6600 and sai 54's to 20000 × 0.64 = 12800, so that the stronger nets produce more, higher-quality games.
@l1t1 It is a bit complicated to implement, but I will think about this.
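
The suggestion above amounts to one multiplication per generation. A minimal sketch, where `games_for_generation` and the base of 20000 games are only illustrative names and values taken from the example, not anything in the SAI server:

```python
def games_for_generation(base_games, promotion_winrate):
    """Scale the self-play quota by the winrate the net achieved when promoted."""
    if not 0.0 <= promotion_winrate <= 1.0:
        raise ValueError("promotion_winrate must be between 0 and 1")
    return round(base_games * promotion_winrate)

print(games_for_generation(20000, 0.33))  # 6600
print(games_for_generation(20000, 0.64))  # 12800
```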
Visits increased to 1600 from next generation
If you are wondering what happened with the present promotion, with its sudden and huge strength jump: previously there was a stupid error in the training script. The learning rate was never dropped as intended, and was still 0.05. Now we are quenching (slowly). The first change is from 0.05 to 0.01. (I thought I was training at 0.00025.)
@Vandertic would it be possible to publish the parameters used (in repository)? Because as it is nobody could have spotted this error without asking for them, while if they were in the repository, people could see and maybe spot the error.
Similarly with the webpage and mentions about change from 800->1200->1600 visits. If people do not watch this issue they don't have a chance to even know when the change happened.
You are right. Will be done ASAP. (I have a conference Tuesday, so after that.)
@sheeryjay just a quick update before I write some real documentation next week: the bug was (still is) on the repository:
The rate should be configured here: https://github.com/sai-dev/sai/blob/214a68f54e07f484086faa8ba15a6cfe91da6821/training/tf/config.py#L93
But then it is not used here: https://github.com/sai-dev/sai/blob/214a68f54e07f484086faa8ba15a6cfe91da6821/training/tf/tfprocess.py#L198
BTW, I am temporarily increasing the number of nets for promotion and the steps between each net and the following one, so as to get an updated measure of which numbers make sense with the current training rate.
Did sai 96 lower the learning rate again?
Yes. Down to 0.004. That's why we also have more promotion candidates for this generation.
We are at 0.0005 since generation 108. Almost stalled. Time to scale-up to 9x192, as soon as the training of that structure reaches current network in a couple of days.
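
For readers trying to follow the quenching, here is a rough sketch of the rates mentioned in this thread as a step schedule. It assumes each drop took effect exactly at the generation named (96 and 108). The generation at which 0.05 became 0.01 is not stated, so the sketch simply starts at 0.01 and says nothing about earlier generations. This is not the project's training code.

```python
from bisect import bisect_right

# Rate in force after the fix described above; the generation of that fix is unknown.
INITIAL_RATE = 0.01
# (first generation, learning rate) pairs as mentioned in the thread.
SCHEDULE = [(96, 0.004), (108, 0.0005)]

def learning_rate_for(generation):
    """Return the rate of the latest schedule entry at or before `generation`."""
    starts = [first_gen for first_gen, _ in SCHEDULE]
    idx = bisect_right(starts, generation)
    return INITIAL_RATE if idx == 0 else SCHEDULE[idx - 1][1]

for gen in (95, 96, 107, 108, 116):
    print(gen, learning_rate_for(gen))
```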
Heatmap of SAI net # 116 d9cf4b3795f3ca47f19d3942630b386b3781d33c8d862de83b5233f96cb47a65
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 33 24 2 0 0 1 0 0 0 1 1 1 2 23 31 0 0
0 0 26 90 1 1 0 1 1 1 1 0 1 1 1 85 24 0 0
0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 1 2 0 0
0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0
0 0 1 1 0 0 1 1 1 1 1 0 1 0 0 0 0 0 0
0 0 1 1 0 0 0 1 1 2 1 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0
0 0 0 1 0 0 1 2 1 6 1 2 1 1 0 0 0 0 0
0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0
0 0 1 0 1 0 1 1 1 2 1 1 1 0 0 0 1 0 0
0 0 1 0 0 0 1 1 1 1 1 1 1 0 0 0 1 0 0
0 0 1 1 0 0 0 0 1 1 1 0 0 0 0 1 1 0 0
0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 1 2 0 0
0 0 26 84 1 1 1 0 0 0 0 0 1 1 1 81 24 0 0
0 0 36 27 2 1 1 1 0 0 0 0 0 1 2 24 31 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Heatmap of the final 6x128 LZ net, # 91 b3b00c6d75b4e74946a97b88949307c9eae2355a88f518ebf770c7758f90e357
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 73 0 0 0 0 0 0 0 0 0 0 0 67 0 0 0
0 0 76 95 0 0 0 0 0 0 0 0 0 0 0 97 63 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 66 88 0 0 0 0 0 0 0 0 0 0 0 102 69 0 0
0 0 0 62 0 0 0 0 0 0 0 0 0 0 0 76 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Policy with --nrsymm:
net | 33+34+44 | peak 1st line |
---|---|---|
116 | 67.4 | 0.5 |
111 | 60.9 | 0.5 |
106 | 56.3 | 0.5 |
101 | 54.6 | 0.4 |
96 | 44.1 | 0.6 |
91 | 44.1 | 0.7 |
86 | 29.8 | 0.9 |
81 | 20.5 | 1 |
76 | 16.8 | 1.3 |
71 | 13.4 | 1.3 |
66 | 10.6 | 1.3 |
61 | 7.7 | 1.4 |
56 | 7.1 | 1.5 |
51 | 7.3 | 1.7 |
41 | 5.9 | 1.8 |
31 | 5.9 | 1.8 |
21 | 4.5 | 2.2 |
11 | 4.4 | 2.2 |
1 | 3.8 | 2.6 |
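
A small sketch of how the "33+34+44" figure can be read off a heatmap like the ones above: sum the entries on the 3-3, 3-4/4-3 and 4-4 points of all four corners. It assumes the heatmap entries are policy values in thousandths, which is consistent with net 116 (its corner entries sum to 669, close to the 67.4 in the table). The function names are mine, and the input is the LZ net 91 heatmap copied from above.

```python
SIZE = 19
# 0-based indices of the third and fourth lines counted from each edge.
CORNER_LINES = (2, 3, SIZE - 4, SIZE - 3)

def parse_heatmap(text):
    """Turn 19 lines of 19 whitespace-separated integers into a grid."""
    grid = [[int(tok) for tok in line.split()] for line in text.strip().splitlines()]
    if len(grid) != SIZE or any(len(row) != SIZE for row in grid):
        raise ValueError("expected a 19x19 heatmap")
    return grid

def corner_opening_mass(grid):
    """Sum of the 3-3, 3-4, 4-3 and 4-4 points over the four corners."""
    return sum(grid[r][c] for r in CORNER_LINES for c in CORNER_LINES)

def first_line_peak(grid):
    """Largest entry on the outermost line of the board."""
    last = SIZE - 1
    return max(grid[r][c] for r in range(SIZE) for c in range(SIZE)
               if r in (0, last) or c in (0, last))

ZERO_ROW = " ".join(["0"] * SIZE)
LZ91 = "\n".join(
    [ZERO_ROW] * 2
    + ["0 0 1 73 0 0 0 0 0 0 0 0 0 0 0 67 0 0 0",
       "0 0 76 95 0 0 0 0 0 0 0 0 0 0 0 97 63 0 0"]
    + [ZERO_ROW] * 11
    + ["0 0 66 88 0 0 0 0 0 0 0 0 0 0 0 102 69 0 0",
       "0 0 0 62 0 0 0 0 0 0 0 0 0 0 0 76 0 0 0"]
    + [ZERO_ROW] * 2
)

grid = parse_heatmap(LZ91)
print(corner_opening_mass(grid))  # 935
print(first_line_peak(grid))      # 0
```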
There is always the --randomvisits option, or you could do as MuZero does and reduce the temperature to 0.5 until it stalls again, then to 0.25.
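
For context on the temperature idea: in AlphaZero-style self-play the move is usually sampled with probability proportional to N^(1/T), where N is the visit count of each candidate. The sketch below assumes that is the temperature meant here. The visit counts are made up for illustration, and this is not SAI's move-selection code.

```python
def move_probabilities(visit_counts, temperature):
    """Probability of picking each move, proportional to visits ** (1 / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    peak = max(visit_counts)
    if peak <= 0:
        raise ValueError("at least one move needs a positive visit count")
    # Dividing by the peak first keeps the powers small and avoids overflow.
    weights = [(n / peak) ** (1.0 / temperature) for n in visit_counts]
    total = sum(weights)
    return [w / total for w in weights]

visits = [800, 500, 300]  # made-up visit counts for three candidate moves
for temp in (1.0, 0.5, 0.25):
    probs = move_probabilities(visits, temp)
    print(temp, [round(p, 3) for p in probs])
```

Lowering the temperature concentrates the choice on the most-visited move, so self-play games become less varied but closer to the net's best play.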
Visits raised to 2400
It seems good: 2019-11-25 00:32, 8c46b273 VS 7eb9f119, 38 : 1 : 11 (77.00%), 50 / 50, promotion
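
A quick check of the numbers in that result line. This sketch assumes the middle figure in `38 : 1 : 11` is a draw or no-result counted as half a win, which reproduces the 77.00% shown. The Elo conversion is the standard logistic formula and may not be exactly how the server computes its ratings.

```python
import math

def match_score(wins, draws, losses):
    """Score fraction with draws counted as half a win."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    return (wins + 0.5 * draws) / games

def elo_difference(score):
    """Elo gap implied by a score fraction under the logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))

score = match_score(38, 1, 11)
print(f"{score:.2%}")
print(f"{elo_difference(score):.0f} Elo")
```

With only 50 games the implied Elo gap carries a wide uncertainty, so it is best read as "clearly stronger" rather than as a precise figure.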
B&W are moving away from the 33 opening style. It reminds me that the minigoV16 weights have the same opening style, with black opening 33+44 (later white has a komoku joseki +44). In my opinion this might be a local optimum.
The first 4 moves are the same in all 3 games. Command: autogtp -g 3 -k sga --url http://sai.unich.it/ --username aaa --password aaa
Output of the three games, interleaved, one move number per line:
1 (B Q17) 1 (B Q17) 1 (B Q17)
2 (W C3) 2 (W C3) 2 (W C3)
3 (B D4) 3 (B D4) 3 (B D4)
4 (W C4) 4 (W C4) 4 (W C4)
5 (B D5) 5 (B D17) 5 (B D5)
6 (W R4) 6 (W Q3) 6 (W D17)
7 (B C5) 7 (B Q5) 7 (B C5)
8 (W E2)
In my 400-visit match tests, SAI120/LZ081 got 51/47 (100 games total) and 194/197 (400 games total). Games in which both sides passed were not counted.
Sgf files are commented in simplified Chinese.
Are reference and comparison games staying at 1600 visits?
Probably just an omission, @barrtgt, just as the panel games staying at 800 visits is probably an omission too (and the original reason I asked @Vandertic to publish the latest source for the server: I wanted to fix that by pull request, without bothering anyone from the team).
I do have some free time which I would like to use for the project (and know JavaScript well enough), but doing it on an obsolete commit is a big no-no for me.
Net 120 seems WAY better at 10k visits against humans than prior nets since the increase. Prior nets would usually estimate their midgame margin to be about 50 points up, then throw away about 20 points by the endgame. This net has usually been grabbing at least 150 and continues to climb.
my test of sai 8x-12x v400 vs lz250 v1 https://github.com/sai-dev/sai/issues/47#issue-521258076
> 3 games first 4 moves are same.
This may happen, but is a bit scary. Thank you for pointing this out.
We are considering raising cpuct to slow the convergence of the policy which seems too fast at the moment.
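
To illustrate what cpuct controls, here is a minimal sketch of the AlphaGo Zero-style PUCT selection rule, score = Q + cpuct · P · sqrt(N_parent) / (1 + N_child). The exact formula in SAI or Leela Zero may differ in details, and the two children below are made-up numbers.

```python
import math

def puct_score(q, prior, parent_visits, child_visits, cpuct):
    """Value estimate plus an exploration bonus that shrinks as the child is visited."""
    return q + cpuct * prior * math.sqrt(parent_visits) / (1 + child_visits)

def select_child(children, parent_visits, cpuct):
    """Index of the child with the highest PUCT score."""
    return max(
        range(len(children)),
        key=lambda i: puct_score(children[i]["q"], children[i]["prior"],
                                 parent_visits, children[i]["visits"], cpuct),
    )

# Hypothetical node: a heavily searched favourite and a barely explored alternative.
children = [
    {"q": 0.55, "prior": 0.70, "visits": 90},
    {"q": 0.50, "prior": 0.20, "visits": 10},
]
print(select_child(children, parent_visits=100, cpuct=0.2))  # 0: keeps the favourite
print(select_child(children, parent_visits=100, cpuct=2.0))  # 1: tries the alternative
```

A larger cpuct spreads visits over more moves, so the visit distribution used as the policy training target is less sharply peaked, which is one way to slow policy convergence.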
> Are reference and comparison games staying at 1600 visits?
Yes. And panel games stay at 800. Reference and panel matches are just for getting a better Elo estimate, so it's not right to spend too many resources on them.
Comparison games are against LZ, which was trained at a fixed 1600 visits, so it seems fairer to match that value.
> I do have some free time which I would like to use for the project (and know JavaScript well enough), but doing it on obsolete commit is a big no-no for me.
Thank you so much for asking @sheeryjay, we will publish the most recent version this evening I think. We finally managed to do the elograph in a reasonable way. (Still not online.)
As for the hyperparameters, they are on a different project and changes happen often, so there is a configuration file for them which is not on git. Maybe I can put it here somewhere so that more people can check values.
It seems sai 125 is worse than prior nets when playing against lz250: https://github.com/sai-dev/sai/issues/47#issue-521258076
It is probably an effect of the increase of puct. We will see whether it is temporary or whether we need to decrease it again. Thank you for pointing this out.
The results of reference matches don't look bad.
I think we will have to enlarge the net soon.
> I think we will have to enlarge the net soon.
I am trying, but the 9x192 nets are a bit behind for now. See recent comparison with 5 games requests.
Why are some 9x192 comparison games played against sai 119 and others against sai 122?
I used analysis mode to check the first 50 moves for sai020/030/040.../110/120/125 at 40K visits. 020 starts a game on two adjacent sides with one corner fight. Only by 080 does it finally know how important it is to take the 4 corners in the first 4 moves. Then at 090, black tries starting around 8-10 near tengen, like 060. After 100, black starts in the corner and explores real josekis, up to the current 128.
There is much less difference between 119 and 122 than between earlier weights before 100.
sai 129 is very good in my test, https://github.com/sai-dev/sai/issues/47#issue-521258076
Does anyone know which were the first nets tested (promoted) with 1200 and 1600 visits?
The first net that played at 800 visits was SAI021 or g015-0b80. The first net that played at 1200 visits was SAI053 or g035-c342. The first net that played at 1600 visits was SAI071 or g047-2974. The first net that played at 2400 visits was SAI119 or g077-7eb9.
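
The list above can be turned into a small lookup. One observation, which is mine and not confirmed in the thread: in all four examples the three characters after `g` are the SAI number in hexadecimal (0x015 = 21, 0x035 = 53, 0x047 = 71, 0x077 = 119). The sketch below relies on that assumption, and the function names are only illustrative.

```python
from bisect import bisect_right

# (first SAI net, visits) pairs from the comment above.
VISIT_CHANGES = [(21, 800), (53, 1200), (71, 1600), (119, 2400)]

def sai_number_from_tag(tag):
    """'g035-c342' -> 53, assuming the three characters after 'g' are hexadecimal."""
    if not tag.startswith("g") or len(tag) < 4:
        raise ValueError(f"unexpected tag format: {tag!r}")
    return int(tag[1:4], 16)

def visits_for_net(sai_number):
    """Visits used by a given net, or None for nets before the first listed one."""
    starts = [first for first, _ in VISIT_CHANGES]
    idx = bisect_right(starts, sai_number)
    return None if idx == 0 else VISIT_CHANGES[idx - 1][1]

print(sai_number_from_tag("g047-2974"))                  # 71
print(visits_for_net(sai_number_from_tag("g06c-3421")))  # 1600
```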
Thanks. Is there any site already available to see older matches than the latest 100 online?
suggest force promote 9x192 net
> suggest force promote 9x192 net
The real strength of the g06c-3421 network is to be confirmed. We plan to wait until we have a 9x192 network trained on the most recent data before switching. It should happen in about 10 hours.
> Thanks. Is there any site already available to see older matches than the latest 100 online?
https://www.dropbox.com/sh/s3hl0oels60r7t3/AABNDuHe5r7LZZzpfjnAijBfa?dl=0
For now this Dropbox folder should be enough. We will make these files available in a better way sometime in the future.
Some of the new nets (promotion) are 9x192.
We are finally moving to 9x192. For this generation only, there are both 6x128 and 9x192 candidates.
I just wanted to reassure everyone that if the progress stalls we are going to increase the visits, and we believe that in a few generations the upgrade will restore a good rate of improvement.