CuriosAI / sai

SAI: a fork of Leela Zero with variable komi.
GNU General Public License v3.0

Progress stalled? #34

Open Vandertic opened 4 years ago

Vandertic commented 4 years ago

I just wanted to reassure everyone that if progress stalls, we are going to increase the visits. We believe that within a few generations this upgrade will restore a good rate of improvement.

kennyfs commented 4 years ago

Yes, I also think it stalls.

kennyfs commented 4 years ago

If old nets beat the newest one with a winrate above 60%, it has obviously stalled.

l1t1 commented 4 years ago

Matches of 50, 40, or 30 games are all too small to measure the true winrate between two nets of similar strength. E.g., sai52 beat sai51 (65.45%) in a 50-game match, but sai51 beat sai52 (58.82%) in a 30-game match.
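
To put rough numbers on this, here is a minimal sketch that computes a 95% Wilson score interval for a match winrate. The function name and the 25-out-of-50 example are illustrative assumptions, not anything taken from the SAI server, and draws are ignored.

```python
import math


def winrate_interval(wins, games, z=1.96):
    """Return a (low, high) Wilson score interval for wins/games.

    z=1.96 corresponds to roughly 95% confidence. Draws are ignored.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half


# Hypothetical example: two equally strong nets splitting a 50-game match.
low, high = winrate_interval(25, 50)
print(f"50 games:   {low:.3f} .. {high:.3f}")

# The same 50% result over 2000 games gives a much tighter interval.
low_big, high_big = winrate_interval(1000, 2000)
print(f"2000 games: {low_big:.3f} .. {high_big:.3f}")
```

With only 50 games, the interval around an even result spans more than ten percentage points on either side, so a 58% or 65% result in either direction is not surprising for two nets of equal strength.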

kennyfs commented 4 years ago

Yes, the smaller match tests whether the new net is terrible, but we need a bigger match to test whether it has stalled.

Vandertic commented 4 years ago

Visits increased to 1200

kennyfs commented 4 years ago

How do you judge whether it has stalled? How do you decide whether to increase visits or to enlarge the net size?

Vandertic commented 4 years ago

There's no general rule. We try to understand what's happening and make decisions accordingly

l1t1 commented 4 years ago

Did the number of training games per generation increase after training with 1200 visits?

Vandertic commented 4 years ago

No, there was a problem with the script. Corrected this morning.

l1t1 commented 4 years ago

I suggest setting the number of training games to a constant value times the promotion winrate. E.g., set sai 55's games to 20000*0.33=6600 games and sai 54's to 20000*0.64=12800 games, so that the stronger nets produce more of the higher-quality games.
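
A minimal sketch of this suggestion, assuming a base of 20000 games as in the example; the function name is hypothetical and nothing like it is known to exist in the SAI server code.

```python
def games_for_net(base_games, promotion_winrate):
    """Scale the number of self-play games by the net's promotion winrate."""
    if not 0.0 <= promotion_winrate <= 1.0:
        raise ValueError("promotion_winrate must be between 0 and 1")
    return round(base_games * promotion_winrate)


# The two nets from the example above.
print(games_for_net(20000, 0.33))  # sai 55
print(games_for_net(20000, 0.64))  # sai 54
```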

Vandertic commented 4 years ago

@l1t1 It is a bit complicated to realize, but I will think about this.

Vandertic commented 4 years ago

Visits increased to 1600 from next generation

Vandertic commented 4 years ago

If you are wondering what happened with the present promotion, with this sudden and huge strength jump... Previously there was a stupid error in the training script. The learning rate was never dropped as intended, and was still 0.05. Now we are quenching (slowly). The first change is from 0.05 to 0.01. (I thought I was training at 0.00025.)

sheeryjay commented 4 years ago

@Vandertic would it be possible to publish the parameters used (in repository)? Because as it is nobody could have spotted this error without asking for them, while if they were in the repository, people could see and maybe spot the error.

The same goes for the webpage, which should mention the changes from 800 to 1200 to 1600 visits. People who do not watch this issue have no chance of knowing when a change happened.

Vandertic commented 4 years ago

You are right. Will be done ASAP. (I have a conference Tuesday, so after that.)

Vandertic commented 4 years ago

@sheeryjay just a quick update before I write some real documentation next week: the bug was (and still is) in the repository.

The rate should be configured here: https://github.com/sai-dev/sai/blob/214a68f54e07f484086faa8ba15a6cfe91da6821/training/tf/config.py#L93

But then it is not used here: https://github.com/sai-dev/sai/blob/214a68f54e07f484086faa8ba15a6cfe91da6821/training/tf/tfprocess.py#L198

Vandertic commented 4 years ago

BTW, I am temporarily increasing the number of nets for promotion and the steps between each net and the following one, so as to get an updated measure of which numbers make sense with the current training rate.

l1t1 commented 4 years ago

Did sai 96 update the learning rate again?

Vandertic commented 4 years ago

Yes. Down to 0.004. That's why we also have more promotion candidates for this generation.

Vandertic commented 4 years ago

We are at 0.0005 since generation 108. Almost stalled. Time to scale-up to 9x192, as soon as the training of that structure reaches current network in a couple of days.

barrtgt commented 4 years ago

Heatmap of SAI net # 116 d9cf4b3795f3ca47f19d3942630b386b3781d33c8d862de83b5233f96cb47a65

  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0  33  24   2   0   0   1   0   0   0   1   1   1   2  23  31   0   0
  0   0  26  90   1   1   0   1   1   1   1   0   1   1   1  85  24   0   0
  0   0   2   1   0   0   0   0   0   0   0   0   0   0   0   1   2   0   0
  0   0   1   1   0   0   0   0   0   1   0   0   0   0   0   1   1   0   0
  0   0   1   1   0   0   1   1   1   1   1   0   1   0   0   0   0   0   0
  0   0   1   1   0   0   0   1   1   2   1   1   0   0   0   0   1   0   0
  0   0   0   0   0   0   0   1   1   1   1   1   1   1   0   0   0   0   0
  0   0   0   1   0   0   1   2   1   6   1   2   1   1   0   0   0   0   0
  0   0   0   0   0   0   1   1   1   1   1   1   1   0   0   0   0   0   0
  0   0   1   0   1   0   1   1   1   2   1   1   1   0   0   0   1   0   0
  0   0   1   0   0   0   1   1   1   1   1   1   1   0   0   0   1   0   0
  0   0   1   1   0   0   0   0   1   1   1   0   0   0   0   1   1   0   0
  0   0   2   1   0   0   0   0   0   0   0   0   0   0   0   1   2   0   0
  0   0  26  84   1   1   1   0   0   0   0   0   1   1   1  81  24   0   0
  0   0  36  27   2   1   1   1   0   0   0   0   0   1   2  24  31   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0

Heatmap of the final 6x128 LZ net, # 91 b3b00c6d75b4e74946a97b88949307c9eae2355a88f518ebf770c7758f90e357

  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0   1  73   0   0   0   0   0   0   0   0   0   0   0  67   0   0   0
  0   0  76  95   0   0   0   0   0   0   0   0   0   0   0  97  63   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0  66  88   0   0   0   0   0   0   0   0   0   0   0 102  69   0   0
  0   0   0  62   0   0   0   0   0   0   0   0   0   0   0  76   0   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0

Policy with --nrsymm:

net   33+34+44   peak 1st line
116   67.4       0.5
111   60.9       0.5
106   56.3       0.5
101   54.6       0.4
 96   44.1       0.6
 91   44.1       0.7
 86   29.8       0.9
 81   20.5       1
 76   16.8       1.3
 71   13.4       1.3
 66   10.6       1.3
 61    7.7       1.4
 56    7.1       1.5
 51    7.3       1.7
 41    5.9       1.8
 31    5.9       1.8
 21    4.5       2.2
 11    4.4       2.2
  1    3.8       2.6

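
For anyone who wants to reproduce a number like the 33+34+44 column, here is a rough sketch. It assumes the column is the summed policy on the 3-3, 3-4 and 4-4 points of all four corners, and that the heatmap values are per-mille; both are assumptions. Only the corner values of the net 116 heatmap above are filled in, and the helper name is hypothetical.

```python
SIZE = 19
# 0-based indices of the third and fourth lines counted from each edge.
LINES = (2, 3, SIZE - 4, SIZE - 3)


def corner_mass(heatmap):
    """Sum the heatmap over the 3-3, 3-4, 4-3 and 4-4 points of all four corners."""
    return sum(heatmap[row][col] for row in LINES for col in LINES)


# Corner entries copied from the net 116 heatmap; everything else is left at 0.
corner_values = {
    2: {2: 33, 3: 24, 15: 23, 16: 31},
    3: {2: 26, 3: 90, 15: 85, 16: 24},
    15: {2: 26, 3: 84, 15: 81, 16: 24},
    16: {2: 36, 3: 27, 15: 24, 16: 31},
}
heatmap = [[0] * SIZE for _ in range(SIZE)]
for row, cols in corner_values.items():
    for col, value in cols.items():
        heatmap[row][col] = value

total = corner_mass(heatmap)
print(total, total / 10)  # per-mille sum and the same value as a percentage
```

The single heatmap gives about 66.9%, close to but not the same as the 67.4 listed for net 116. The table was produced with --nrsymm, so a small difference is to be expected.
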
barrtgt commented 4 years ago

There is always the --randomvisits option, or you could do like MuZero and reduce temp to 0.5 until it stalls again, then to 0.25.
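
To illustrate what lowering the temperature does, here is a minimal sketch of the usual AlphaZero-style rule, where a move is sampled with probability proportional to visits^(1/T). The visit counts are made up, and this is not SAI's actual move-selection code.

```python
def apply_temperature(visits, temperature):
    """Turn root visit counts into move probabilities, p_i ~ visits_i ** (1 / T)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    weights = [v ** (1.0 / temperature) for v in visits]
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one move needs a visit")
    return [w / total for w in weights]


# Hypothetical visit counts for three candidate moves.
visits = [60, 30, 10]
for temp in (1.0, 0.5, 0.25):
    probs = apply_temperature(visits, temp)
    print(temp, [round(p, 3) for p in probs])
```

As the temperature drops, the most-visited move takes a larger share of the probability, so self-play games become less random.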

Vandertic commented 4 years ago

Visits raised to 2400

l1t1 commented 4 years ago

It seems good:

2019-11-25 00:32   8c46b273 VS 7eb9f119   38 : 1 : 11 (77.00%)   50 / 50   promotion
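
A small sketch of how a 77.00% figure can be derived from 38 : 1 : 11, and what it would mean in Elo terms. It assumes the middle number counts draws or no-result games scored as half a point, and it uses the standard logistic Elo formula. The function names are illustrative.

```python
import math


def match_score(wins, draws, losses):
    """Fraction of points scored, counting each draw as half a point."""
    return (wins + 0.5 * draws) / (wins + draws + losses)


def elo_difference(score):
    """Elo gap implied by an expected score, using the logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400 * math.log10(score / (1 - score))


score = match_score(38, 1, 11)
print(f"score {score:.2%}, about {elo_difference(score):.0f} Elo")
```

With only 50 games, the implied Elo gap is a rough estimate rather than a precise measurement.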

nanzi commented 4 years ago

B&W are moving away from the 33 opening style. It reminds me that the minigoV16 weights have the same opening style, with black opening 33+44 (later white has a komoku joseki + 44). In my opinion this might be a local optimum.

l1t1 commented 4 years ago

In 3 games the first 4 moves are the same.

autogtp -g 3 -k sga --url http://sai.unich.it/ --username aaa --password aaa

1 (B Q17)  1 (B Q17)  1 (B Q17)
2 (W C3)   2 (W C3)   2 (W C3)
3 (B D4)   3 (B D4)   3 (B D4)
4 (W C4)   4 (W C4)   4 (W C4)
5 (B D5)   5 (B D17)  5 (B D5)
6 (W R4)   6 (W Q3)   6 (W D17)
7 (B C5)   7 (B Q5)   7 (B C5)
8 (W E2)

nanzi commented 4 years ago

In my 400-visit match tests, SAI120 vs LZ081 scored 51:47 (100 games total) and 194:197 (400 games total). Games in which both sides passed were not counted.

Sgf files are commented in simplified Chinese.

barrtgt commented 4 years ago

Are reference and comparison games staying at 1600 visits?

sheeryjay commented 4 years ago

@barrtgt Probably just an omission, in the same way that panel games staying at 800 visits is probably an omission as well. (That was the original reason I asked @Vandertic to publish the latest source for the server: I wanted to fix it with a pull request, without otherwise bothering anyone from the team.)

I do have some free time which I would like to use for the project (and know JavaScript well enough), but doing it on an obsolete commit is a big no-no for me.

barrtgt commented 4 years ago

Net 120 seems WAY better at 10k visits against humans than prior nets since the increase. Prior nets would usually estimate their midgame margin at about 50 points up, then throw away about 20 points by the endgame. This net has usually been grabbing at least 150 and continues to climb.

l1t1 commented 4 years ago

my test of sai 8x-12x v400 vs lz250 v1 https://github.com/sai-dev/sai/issues/47#issue-521258076

Vandertic commented 4 years ago

> In 3 games the first 4 moves are the same.

This may happen, but is a bit scary. Thank you for pointing this out.

We are considering raising cpuct to slow the convergence of the policy, which seems too fast at the moment.
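
For readers unfamiliar with cpuct, here is a simplified sketch of AlphaZero-style PUCT selection. The formula is the textbook one, not necessarily the exact expression used in SAI or Leela Zero, which have extra terms, and the two child nodes are invented for illustration.

```python
import math


def puct_score(q, prior, parent_visits, child_visits, cpuct):
    """Value estimate plus an exploration bonus weighted by cpuct."""
    return q + cpuct * prior * math.sqrt(parent_visits) / (1 + child_visits)


def select_move(children, cpuct):
    """Pick the child with the highest PUCT score."""
    parent_visits = sum(c["visits"] for c in children)
    best = max(
        children,
        key=lambda c: puct_score(c["q"], c["prior"], parent_visits, c["visits"], cpuct),
    )
    return best["move"]


# Hypothetical root: "A" is the policy favourite, "B" is barely explored.
children = [
    {"move": "A", "q": 0.55, "prior": 0.60, "visits": 80},
    {"move": "B", "q": 0.50, "prior": 0.10, "visits": 5},
]
print(select_move(children, cpuct=0.5))
print(select_move(children, cpuct=2.0))
```

With the larger cpuct, the search spends its next visit on the less-explored move. Visits then spread over more moves, which would give a flatter training target for the policy and slower convergence.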

Vandertic commented 4 years ago

> Are reference and comparison games staying at 1600 visits?

Yes. And panel games stay at 800. Reference and panel matches are just for getting a better Elo estimate, so it's not right to spend too many resources on them.

Comparison games are against LZ, which was trained at a fixed 1600 visits, so it seems fairer to match that value.

Vandertic commented 4 years ago

> I do have some free time which I would like to use for the project (and know JavaScript well enough), but doing it on an obsolete commit is a big no-no for me.

Thank you so much for asking @sheeryjay, we will publish the most recent version this evening I think. We finally managed to do the elograph in a reasonable way. (Still not online.)

As for the hyperparameters, they are on a different project and changes happen often, so there is a configuration file for them which is not on git. Maybe I can put it here somewhere so that more people can check values.

l1t1 commented 4 years ago

It seems sai 125 is worse than prior nets when playing against lz250: https://github.com/sai-dev/sai/issues/47#issue-521258076

Vandertic commented 4 years ago

It is probably an effect of the increase of puct. We will see if it is temporary, or whether we need to decrease it again. Thank you for pointing this out.

Glrr commented 4 years ago

The results of reference matches don't look bad.

kennyfs commented 4 years ago

I think we will have to enlarge the net soon.

Vandertic commented 4 years ago

> I think we will have to enlarge the net soon.

I am trying, but the 9x192 nets are a bit behind for now. See the recent comparison matches with 5-game requests.

l1t1 commented 4 years ago

Why are some 9x192 comparison games played against sai 119 and others against sai 122?

nanzi commented 4 years ago

I used analysis mode to check the first 50 moves for sai020/030/040.../110/120/125 at 40K visits. 020 starts a game on two adjacent sides with one corner battle. Only by 080 does it finally learn how important it is to take the 4 corners in the first 4 moves. Then at 090, black tries starting from 8-10 around tengen, like 060. After 100, black starts in the corner and explores real josekis, up to the current 128.

There is much less difference between 119 and 122 than between the earlier weights before 100.

l1t1 commented 4 years ago

sai 129 is very good in my test, https://github.com/sai-dev/sai/issues/47#issue-521258076

barrtgt commented 4 years ago

Does anyone know the first nets that were tested (promoted) with 1200 and 1600 visits?

Vandertic commented 4 years ago

The first net that played at 800 visits was SAI021 or g015-0b80.
The first net that played at 1200 visits was SAI053 or g035-c342.
The first net that played at 1600 visits was SAI071 or g047-2974.
The first net that played at 2400 visits was SAI119 or g077-7eb9.

barrtgt commented 4 years ago

Thanks. Is there any site already available to see older matches than the latest 100 online?

l1t1 commented 4 years ago

I suggest force-promoting a 9x192 net.

Vandertic commented 4 years ago

> suggest force promote 9x192 net

The real strength of the g06c-3421 network is to be confirmed. We plan to wait until we have a 9x192 network trained on the most recent data before switching. It should happen in about 10 hours.

Vandertic commented 4 years ago

> Thanks. Is there any site already available to see older matches than the latest 100 online?

https://www.dropbox.com/sh/s3hl0oels60r7t3/AABNDuHe5r7LZZzpfjnAijBfa?dl=0

For now this Dropbox folder should be enough. We will make these files available in a better way sometime in the future.

kennyfs commented 4 years ago

Some of the new nets (promotion candidates) are 9x192.

Vandertic commented 4 years ago

We are finally moving to 9x192. For this generation only, there are both 6x128 and 9x192 candidates.