kobanium / aobazero

Aoba Zero

A question about the current progress #13

Closed sbbdms closed 5 years ago

sbbdms commented 5 years ago

Hi!

It has been nearly two months since v1.2 came out, and more than 100 weights (No.537 to No.645) have been generated since then.

(1) Over this period, the strength of AobaZero has fluctuated a lot. According to Floodgate, the strength from No.537 to No.573 fluctuated without much improvement. No.578, which was produced one month ago, gained about 100 Elo. But over the latest month (No.578 to No.643), the strength has fluctuated again and improved little. How do you assess the progress now?

(2) Also, the strange phenomenon of the weight files gradually getting smaller is still continuing. Are the weight files still affected by the decrease in root node candidates that you mentioned before? Is it possible that an unknown mistake in the training process (such as losing parameters or losing decimal digits) causes this? (Images: WEIGHT1, WEIGHT2)

sbbdms commented 5 years ago

I wrote a program to count the number of parameters in the weight files, and confirmed that the number of parameters has not changed. So the only explanation should be that the parameters tend to have fewer decimal digits. Since the neural network is a black box, it might be very hard to explain this...
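A check along these lines could look like the sketch below. It is not the program used above, and it assumes the weight file is plain text made of whitespace-separated decimal numbers, which the thread does not state. The function name `weight_text_stats` and the sample strings are hypothetical. Under that assumption, a file can shrink while the parameter count stays the same if the average number of characters per value goes down.

```python
import re
from statistics import mean

# Assumption: weights are stored as whitespace-separated decimal text.
# Any numeric header token (e.g. a format version) would also be counted.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

def weight_text_stats(text):
    """Return (parameter count, mean characters per parameter)."""
    tokens = [t for t in text.split() if NUMBER.fullmatch(t)]
    if not tokens:
        return 0, 0.0
    return len(tokens), mean(len(t) for t in tokens)

# Hypothetical snippets standing in for an older and a newer weight file.
old = "0.123456 -0.0234567 1.5e-05"
new = "0.1234 -0.0234 1.5e-05"
print(weight_text_stats(old))
print(weight_text_stats(new))
```

Both snippets contain three values, but the second uses fewer characters per value, which is the pattern described above.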

yssaya commented 5 years ago

We are also worried about the recent slow progress. We added tree reuse from w628, so we'd like to watch the ELO change until another 100,000 or 200,000 games are added. The results against Kristallweizen are as follows.

               W-D-L     Games(DW-rep-DL) Sente WinR    WinRate 95%   ELO  Adjusted ELO
w645-Kris_100k 366-6-428 800 (0-6-5)(s=379-415,0.477) ,0.461(0.035)( -26)(+162)
w640-Kris_100k 342-3-455 800 (1-3-2)(s=397-400,0.498) ,0.429(0.034)( -49)(+139)
w635-Kris_100k 333-7-460 800 (0-7-4)(s=397-396,0.501) ,0.421(0.034)( -55)(+133)
w630-Kris_100k 382-7-411 800 (4-7-2)(s=385-408,0.485) ,0.482(0.035)( -12)(+176)
w627 is same as w622
w625-Kris_100k 356-5-439 800 (3-5-4)(s=380-415,0.478) ,0.448(0.034)( -36)(+152)
w620-Kris_100k 316-5-479 800 (1-5-1)(s=401-394,0.504) ,0.398(0.034)( -71)(+117)
w615-Kris_100k 357-12-431 800(3-11-2)( 394-394,0.500) ,0.454(0.034)( -32)(+156)
w610-Kris_100k 322-6-472 800 (1-6-3)(s=416-378,0.524) ,0.406(0.034)( -65)(+123)
w605-Kris_100k 363-5-432 800 (2-5-1)(s=382-413,0.481) ,0.457(0.035)( -30)(+158)
w600-Kris_100k 358-6-464 828 (1-6-3)(s=446-376,0.543) ,0.436(0.034)( -44)
w600-Kris_50k  561-4-243 808 (2-4-1)(s=396-408,0.493) ,0.697(0.032)(+144)(+144)
w595-Kris_50k  676-3-320 999 (0-3-2)(s=496-500,0.498) ,0.678(0.029)(+129)(+129)
w590-Kris_50k  518-7-305 830 (1-6-2)(s=417-406,0.507) ,0.628(0.033)( +91)( +91)
w585-Kris_50k  483-2-318 803 (1-2-0)(s=398-403,0.497) ,0.603(0.034)( +72)( +72)
w580-Kris_50k  512-6-294 812 (1-6-0)(s=403-403,0.500) ,0.634(0.033)( +95)( +95)
w574-Kris_50k  390-8-204 602 (1-7-5)(s=300-294,0.505) ,0.654(0.037)(+111)(+111)
w564-Kris_50k  354-3-250 607 (0-3-3)(s=294-310,0.486) ,0.585(     )( +60)( +60)
w554-Kris_50k  341-3-267 611 (0-3-2)(s=298-310,0.490) ,0.560(0.039)( +42)( +42)
w544-Kris_50k  341-4-257 602 (0-4-1)(s=300-298,0.501) ,0.569       ( +48)( +48)
w538(aobaz9)
w534-Kris_50k  307-4-291 602 (0-4-5)(s=297-301,0.497) ,0.513(0.040)(  +9)(  +9)
w524-Kris_50k  318-3-308 629 (2-3-1)(s=319-307,0.510) ,0.508(0.039)(  +5)(  +5)
w514-Kris_50k  331-7-374 712 (0-6-4)(s=352-353,0.499) ,0.470(0.037)( -21)( -21)
w504-Kris_50k  261-2-366 629 (0-2-4)(s=296-331,0.472) ,0.417(0.039)( -58)( -58)
w494-Kris_50k  199-5-397 601 (1-5-1)(s=304-292,0.510) ,0.335(0.038)(-118)(-118)
w484-Kris_50k  204-5-399 608 (0-5-4)(s=308-295,0.511) ,0.340(0.038)(-115)(-115)
w474-Kris_50k  232-3-413 648 (0-3-3)(s=317-328,0.491) ,0.360(0.037)( -99)( -99)
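The thread does not say how the WinRate, 95% and ELO columns are computed. The sketch below assumes the usual conventions: a draw counts as half a win, the 95% margin is the normal approximation for a binomial proportion, and the rating difference comes from the logistic Elo formula. The function names are hypothetical.

```python
import math

def win_rate(wins, draws, losses):
    """Score rate with a draw counted as half a win (assumed convention)."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games

def margin_95(p, games):
    """Half-width of a 95% interval, normal approximation."""
    return 1.96 * math.sqrt(p * (1.0 - p) / games)

def elo_diff(p):
    """Rating difference implied by score rate p (logistic Elo model)."""
    return -400.0 * math.log10(1.0 / p - 1.0)

# First row of the table: w645 vs Kris_100k, 366-6-428 over 800 games.
p = win_rate(366, 6, 428)
print(round(p, 3), round(margin_95(p, 800), 3), round(elo_diff(p), 1))
```

For the w645 row this gives a rate of about 0.461, a margin of about 0.035 and roughly -27 Elo, close to the listed 0.461(0.035)( -26), so the table values look consistent with these formulas if the ELO figure is truncated rather than rounded. In the Kris_100k rows the Adjusted ELO column equals the ELO column plus 188, which matches the gap between the two w600 results (+144 against Kris_50k, -44 against Kris_100k).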

"the weight files getting gradually smaller"

This is strange... I don't have any idea. Maybe old useless features are decreasing?

sbbdms commented 5 years ago

Thanks for listing the results against Kristallweizen... I have another question about the "noise", because your website says:

These are self-play games for training. It often plays blunder for the first 30 moves. And sometimes it choose a move that is not a best by adding noise on root node.

I am a bit confused about the influence of the noise... I observed that in the self-play games, when the move count>=31, AobaZero always chooses the move with the highest playout count. Does the noise still have an effect at that point?

yssaya commented 5 years ago

My web page explanation was not good... AobaZero always chooses the move with the highest playout count when the move count>=31.

Without noise, AobaZero selects 2726FU in the initial position.

$ bin/aobaz -p 800 -w weight-save/w000000000648.txt
go
 0(  0) 2726FU, 667, 0.076,bias= 0.677
 1(  1) 3938GI,  64, 0.016,bias= 0.117
 2(  2) 7776FU,  69, 0.081,bias= 0.066

With noise, AobaZero also tries other moves, and the move with the highest playout count is selected.

$ bin/aobaz -n -p 800 -w weight-save/w000000000648.txt
go
 0(  0) 2726FU, 589, 0.077,bias= 0.509
 1(  1) 3938GI,  52, 0.014,bias= 0.088
 2(  2) 7776FU,  62, 0.083,bias= 0.050
 7(  3) 9796FU,  20, 0.037,bias= 0.028
 8(  4) 5948OU,   8,-0.071,bias= 0.025
14(  5) 6958KI,  11,-0.099,bias= 0.038
17(  6) 4938KI,   7,-0.053,bias= 0.021
18(  7) 2868HI,  23,-0.057,bias= 0.063
22(  8) 2848HI,   5,-0.089,bias= 0.018
26(  9) 3736FU,  19,-0.028,bias= 0.044
29( 10) 8786FU,   4,-0.177,bias= 0.021
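The two logs differ mainly in the bias column (for 2726FU, 0.677 without noise and 0.509 with it), which is what one would expect if noise is mixed into the root priors before the search. The sketch below shows the AlphaZero-style version of that idea: Dirichlet noise blended into the priors with weight epsilon. It is an illustration only. The thread does not say which noise AobaZero uses, and `add_root_noise`, `alpha=0.15` and `epsilon=0.25` are assumptions taken from the AlphaZero shogi settings, not from AobaZero's source.

```python
import random

def add_root_noise(priors, alpha=0.15, epsilon=0.25, rng=random):
    """Blend Dirichlet(alpha) noise into root priors (AlphaZero-style sketch).

    alpha and epsilon are assumed values, not confirmed AobaZero settings.
    """
    # A Dirichlet sample is a set of Gamma(alpha, 1) draws normalised to sum to 1.
    gammas = [rng.gammavariate(alpha, 1.0) for _ in priors]
    total = sum(gammas)
    if total == 0.0:
        return list(priors)  # degenerate draw: leave the priors unchanged
    noise = [g / total for g in gammas]
    return [(1.0 - epsilon) * p + epsilon * n for p, n in zip(priors, noise)]

# Priors of the three moves from the log without noise, plus one hypothetical
# entry (0.140) that lumps all remaining moves together.
priors = [0.677, 0.117, 0.066, 0.140]
print(add_root_noise(priors, rng=random.Random(0)))
```

With epsilon at 0.25 each prior keeps at least 75% of its original value, and low-prior moves can receive a boost large enough to attract some playouts, as with the extra moves that appear in the second log.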

With visit count sampling, AobaZero selects 2726FU with a probability of 83% (667/800).

$ bin/aobaz -p -m 30 -w weight-save/w000000000648.txt
go
 0(  0) 2726FU, 667, 0.076,bias= 0.677
 1(  1) 3938GI,  64, 0.016,bias= 0.117
 2(  2) 7776FU,  69, 0.081,bias= 0.066
rand select:2726FU,667, 0.076,bias= 0.677,r=34/800
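The selection rule described in this comment can be sketched as follows. This is a minimal illustration, not AobaZero's code: `select_move` and `sample_moves` are hypothetical names, and the way the random number r is mapped to a move (a cumulative scan over the playout counts) is an assumption that happens to agree with the `r=34/800` line in the log.

```python
import random

def select_move(visits, move_number, sample_moves=30, rng=random):
    """Pick a move from (move, playouts) pairs.

    Up to sample_moves, sample in proportion to playouts; afterwards take
    the move with the most playouts. The threshold of 30 follows the thread.
    """
    if move_number > sample_moves:
        return max(visits, key=lambda mv: mv[1])[0]
    total = sum(n for _, n in visits)
    r = rng.randrange(total)  # integer in [0, total)
    for move, n in visits:
        if r < n:
            return move
        r -= n
    raise AssertionError("unreachable: r is always below total")

visits = [("2726FU", 667), ("3938GI", 64), ("7776FU", 69)]
print(select_move(visits, move_number=1, rng=random.Random(0)))
print(select_move(visits, move_number=31))
```

With these counts, any r below 667 selects 2726FU, so r=34 picks it, and the chance of that outcome is 667/800, about 83%.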

sbbdms commented 5 years ago

Thanks! I think I understand it now with your explanation.

(1) Visit count sampling (active in the first 30 moves of self-play): every visited move has a chance of being selected, with a probability proportional to its playout count.

(2) Noise (active throughout self-play): only the move with the highest playout count is selected, but the playout counts themselves are already influenced by the noise.

I hope AobaZero gains strength... It still needs more time to observe. Closing this issue.