kobanium / aobazero

Aoba Zero

Rating measurement switched from self-play to Kristallweizen #9

Open yssaya opened 5 years ago

yssaya commented 5 years ago

I have switched the rating measurement from self-play to games against Kristallweizen.

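Until now, playing strength was measured by self-play against the previous weight. Starting with w580, it is measured by the results against Kristallweizen, the runner-up at the 2019 CSA championship, fixed at 50k nodes per move.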

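The reason is that games against the previous weight fail to capture the improvement in strength against Kristallweizen. That was acceptable in the early days, when the rating rose sharply within a single day, but with progress this slow it is difficult. Self-play also carries the risk of a three-way cycle (A beats B, B beats C, and C beats A). Playing only against Kristallweizen probably has its own matchup issues, though. Once the win rate exceeds 70%, I will raise the node count to 100k, 200k, and so on. I will test every time 50,000 game records are added: w585, w590, and so on.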

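The green line in the graph, "vs Kristallweizen 50k", shows this measurement. Its Elo values are read on the right-hand axis. It uses the same scale as floodgate, assuming that "Kristallweizen 50k" is 2600 points.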

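"self-play" and "vs Kristallweizen 50k" graph the strength of the weights alone, while "floodgate" also includes the search. The sudden jump of about +300 on floodgate from w537 comes from setting the initial value in MCTS to a loss (-1) and from reusing the search tree. Reusing the search tree seems to add about +57 Elo against Krist_100k. Tree reuse is not yet included in aobaz9 (v1.2). On floodgate the whole pool shifts every time, the error is large at around 80 games, and the result depends heavily on the opponents at that moment, so I don't think small changes can be detected there.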
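As a rough illustration of what "initial value = loss (-1)" means, here is a minimal sketch of an AlphaZero-style PUCT selection step in which unvisited children start with a configurable value. This is not AobaZero's code: the `Child` class, the `select` function, the `c_puct` constant and the exact formula are assumptions made for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class Child:
    """One candidate move at a node (hypothetical structure)."""
    prior: float            # policy prior from the network
    visits: int = 0
    value_sum: float = 0.0  # sum of backed-up values in [-1, 1]

    def q(self, init_value: float) -> float:
        # Unvisited children have no statistics yet, so they use the
        # configured initial value instead (e.g. -1 = "assume a loss").
        return self.value_sum / self.visits if self.visits else init_value


def select(children, c_puct=1.0, init_value=-1.0):
    """Return the index of the child with the highest PUCT score."""
    total = sum(c.visits for c in children)
    sqrt_total = math.sqrt(max(total, 1))

    def score(c):
        exploration = c_puct * c.prior * sqrt_total / (1 + c.visits)
        return c.q(init_value) + exploration

    return max(range(len(children)), key=lambda i: score(children[i]))


# A visited move with an even result versus an unvisited low-prior move.
children = [Child(prior=0.9, visits=10, value_sum=0.0), Child(prior=0.1)]
print(select(children, init_value=0.0))   # optimistic start: tries the new move
print(select(children, init_value=-1.0))  # pessimistic start: stays with the known move
```

With a pessimistic initial value the search spends fewer visits on moves the policy considers unlikely, which is one plausible reading of why the change helped.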

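The "vs Kristallweizen 50k" rating is based on 600 to 800 games. I use YaneuraOu's even-position opening book, play out the first 24 moves, and play each position with sente and gote alternated, for about 600 games. The openings are shuffled each time. Kristallweizen is run in combination with YaneuraOu-4.83 with the options below.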
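The pairing scheme described above (shuffle the openings, then play each position once with each side as sente) could be sketched as follows. The function name, the engine labels and the opening identifiers are hypothetical; the actual match runner is not shown in this thread.

```python
import random


def make_schedule(openings, n_games, seed=None):
    """Build a list of (opening, sente_player) pairs.

    The openings are shuffled, and each opening is used for two
    consecutive games with the colours swapped, so neither engine
    benefits from a lopsided starting position.
    """
    rng = random.Random(seed)
    book = list(openings)
    rng.shuffle(book)

    schedule = []
    for position in book:
        schedule.append((position, "aobazero"))        # AobaZero moves first
        schedule.append((position, "kristallweizen"))  # colours swapped
        if len(schedule) >= n_games:
            break
    return schedule[:n_games]


# Example: 3 placeholder openings give 6 games.
for game in make_schedule(["opening_a", "opening_b", "opening_c"], 6, seed=1):
    print(game)
```

An even number of games keeps the sente/gote split balanced, which matches the roughly 50% sente share visible in the `s=` column of the results.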

setoption name BookMoves value 0
setoption NodesLimit value 50000

w595-Kris_50k 676-3-320 999 (0-3-2)(s=496-500,0.498) ,0.678(0.029)(+129)
w590-Kris_50k 518-7-305 830 (1-6-2)(s=417-406,0.507) ,0.628(0.033)( +91)( +91)
w585-Kris_50k 483-2-318 803 (1-2-0)(s=398-403,0.497) ,0.603(0.034)( +72)( +72)
w580-Kris_50k 512-6-294 812 (1-6-0)(s=403-403,0.500) ,0.634(0.033)( +95)( +95)
w574-Kris_50k 390-8-204 602 (1-7-5)(s=300-294,0.505) ,0.654(0.037)(+111)(+111)
w564-Kris_50k 354-3-250 607 (0-3-3)(s=294-310,0.486) ,0.585( )( +60)( +60)
w554-Kris_50k 341-3-267 611 (0-3-2)(s=298-310,0.490) ,0.560(0.039)( +42)( +42)
w544-Kris_50k 341-4-257 602 (0-4-1)(s=300-298,0.501) ,0.569 ( +48)( +48) aobaz9 from w538
w534-Kris_50k 307-4-291 602 (0-4-5)(s=297-301,0.497) ,0.513(0.040)( +9)( +9)
w524-Kris_50k 318-3-308 629 (2-3-1)(s=319-307,0.510) ,0.508(0.039)( +5)( +5)
w514-Kris_50k 331-7-374 712 (0-6-4)(s=352-353,0.499) ,0.470(0.037)( -21)( -21)
w504-Kris_50k 261-2-366 629 (0-2-4)(s=296-331,0.472) ,0.417(0.039)( -58)( -58)
w494-Kris_50k 199-5-397 601 (1-5-1)(s=304-292,0.510) ,0.335(0.038)(-118)(-118)
w484-Kris_50k 204-5-399 608 (0-5-4)(s=308-295,0.511) ,0.340(0.038)(-115)(-115)
w474-Kris_50k 232-3-413 648 (0-3-3)(s=317-328,0.491) ,0.360(0.037)( -99)( -99)
w474-Kris_20k 582-1-228 811 (1-1-3)(s=405-405,0.500) ,0.718(0.031)(+162) at 20k, w474 and w464 differ by +330
w474-Kris_10k 650-2- 92 744 (0-2-0)(s=371-371,0.500) ,0.875(0.024)(+338)
w464-Kris_50k 40-0-572 612 (0-0-0)(s=302-310,0.493) ,0.065(0.020)(-462) 0.001 from w465; at 50k the difference is +363
w464-Kris_20k 168-2-444 614 (0-1-3)(s=315-297,0.515) ,0.275(0.035)(-168)(-429)
w464-Kris_10k 362-1-282 645 (0-1-0)(s=330-314,0.512) ,0.562(0.038)( +43)
w464-Kris_1k 16-0- 0 16 (0-0-0)(s= 8- 8,0.500) ,1.000(0.000)( )
w454-Kris_10k 300-2-331 633 (0-2-0)(s=328-303,0.520) ,0.476(0.039)( -17)(-489)
w444-Kris_50k 89-1-540 630 (0-1-0)(s=310-319,0.493) ,0.142(0.027)(-312) up to w448, weights were updated every 2,000 games; 100,000 games skips 50 weights
w444-Kris_10k 498-0-244 742 (0-0-2)(s=359-383,0.484) ,0.671(0.034)(+123)(-349)
w434-Kris_10k 508-3-225 736 (0-3-1)(s=374-359,0.510) ,0.692(0.033)(+140)(-332)
w424-Kris_10k 475-5-242 722 (0-5-1)(s=341-376,0.476) ,0.661(0.035)(+116)(-356)
w400-Kris_10k 454-2-216 672 (0-2-1)(s=330-340,0.493) ,0.677(0.035)(+128)(-344)
w350-Kris_10k 505-2-363 870 (0-2-3)(s=430-438,0.495) ,0.582(0.033)( +57)(-415)
w300-Kris_10k 1003-9-778 1790(0-9-4)(s=923-858,0.518) ,0.563(0.023)( +43)(-429)
w250-Kris_10k 831-7-944 1782(0-7-2)(s=883-892,0.497) ,0.468(0.023)( -22)(-494)
w200-Kris_10k 320-0-395 715 (0-0-4)(s=351-364,0.491) ,0.448(0.036)( -36)(-508)
w150-Kris_10k 84-0-684 768 (0-0-2)(s=390-378,0.508) ,0.109(0.022)(-364)(-836)
w150-Kris_5k 303-1-608 912 (0-0-1)(s=447-464,0.491) ,0.333(0.031)(-120)
w150-Kris_1k 725-2-272 999 (0-1-5)(s=499-498,0.501) ,0.727(0.028)(+169)
w100-Kris_10k 34-0-965 999 (0-0-0)(s=497-502,0.497) ,0.034(0.011)(-581)
w100-Kris_5k 100-0-880 980 (0-0-0)(s=484-496,0.494) ,0.102(0.019)(-377)
w100-Kris_5k 95-1-764 860 (0-1-0)(s=421-438,0.490) ,0.111(0.021)(-361)
w100-Kris_1k 437-1-561 999 (0-1-1)(s=503-495,0.504) ,0.438(0.031)( -43)
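The win rate, 95% margin and Elo columns in the rows above appear consistent with the standard formulas: draws count as half a win, the margin is the normal-approximation interval 1.96·sqrt(p(1-p)/n), and the Elo difference is 400·log10(p/(1-p)). The thread does not state the formulas, so the sketch below is an inference from the numbers, and the function names are made up for illustration.

```python
import math


def score_rate(wins, draws, losses):
    """Win rate with each draw counted as half a win."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def margin95(p, games):
    """Half-width of the 95% interval (normal approximation)."""
    return 1.96 * math.sqrt(p * (1 - p) / games)


def elo_diff(p):
    """Elo difference implied by score p; undefined for 0% or 100%."""
    if p <= 0.0 or p >= 1.0:
        return None  # the table leaves these cells blank as well
    return 400 * math.log10(p / (1 - p))


# Row "w595-Kris_50k 676-3-320 999": table shows 0.678 (0.029) (+129).
p = score_rate(676, 3, 320)
print(f"{p:.3f} ({margin95(p, 999):.3f}) ({elo_diff(p):+.1f})")
```

Note that a margin of about 0.03 in win rate corresponds to roughly ±25 Elo near 60% to 70%, so differences between neighbouring weights are often within noise.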

sbbdms commented 5 years ago

Could you please update the results? The curve on the main page shows the win rate against Kristallweizen dropping, but I guess that is because Kristallweizen's node count was increased.

yssaya commented 5 years ago

Recent progress is very small. If we decrease the learning rate from 0.001 to 0.0001, we will get about +150 Elo. We are considering it.

wxxx-Kris_100k 567- 7-226 800 (7- 7-1)(s=387-406,0.488) ,0.713(0.031)(+158)(+346) lr=0.0001, mb=64
This weight is here.
http://www.yss-aya.com/20190722_193305log_win500k_lr00001_wd00002_m64_iter_856000.tar.bz2

vs Kristallweizen result.
W-D-L Games(DW-rep-DL) Sente WinR WinRate 95% ELO Adjusted ELO
w675-Kris_100k 331-16-453 800(4-14-3)(s=377-407,0.481) ,0.424(0.034)( -53)(+135)
w670-Kris_100k 412-8-380 800 (1-6-2)(s=415-377,0.524) ,0.520(0.035)( +13)(+201)
w665-Kris_100k 369-8-423 800 (1-6-0)(s=407-385,0.514) ,0.466(0.035)( -23)(+165)
w660-Kris_100k 358-6-436 800 (3-6-1)(s=390-404,0.491) ,0.451(0.034)( -33)(+155)
w655-Kris_100k 331-6-463 800 (1-5-0)(s=382-412,0.481) ,0.417(0.034)( -57)(+131)
w650-Kris_100k 371-8-421 800 (5-8-1)(s=389-403,0.491) ,0.469(0.035)( -21)(+167)
w645-Kris_100k 366-6-428 800 (0-6-5)(s=379-415,0.477) ,0.461(0.035)( -26)(+162)
w640-Kris_100k 342-3-455 800 (1-3-2)(s=397-400,0.498) ,0.429(0.034)( -49)(+139)
w635-Kris_100k 333-7-460 800 (0-7-4)(s=397-396,0.501) ,0.421(0.034)( -55)(+133)
w630-Kris_100k 382-7-411 800 (4-7-2)(s=385-408,0.485) ,0.482(0.035)( -12)(+176)
w625-Kris_100k 356-5-439 800 (3-5-4)(s=380-415,0.478) ,0.448(0.034)( -36)(+152)
w620-Kris_100k 316-5-479 800 (1-5-1)(s=401-394,0.504) ,0.398(0.034)( -71)(+117)
w615-Kris_100k 357-12-431 800(3-11-2)(s=394-394,0.500) ,0.454(0.034)( -32)(+156)
w610-Kris_100k 322-6-472 800 (1-6-3)(s=416-378,0.524) ,0.406(0.034)( -65)(+123)
w605-Kris_100k 363-5-432 800 (2-5-1)(s=382-413,0.481) ,0.457(0.035)( -30)(+158)
w600-Kris_100k 358-6-464 828 (1-6-3)(s=446-376,0.543) ,0.436(0.034)( -44)
w600-Kris_50k 561-4-243 808 (2-4-1)(s=396-408,0.493) ,0.697(0.032)(+144)(+144)
w595-Kris_50k 676-3-320 999 (0-3-2)(s=496-500,0.498) ,0.678(0.029)(+129)(+129)
w590-Kris_50k 518-7-305 830 (1-6-2)(s=417-406,0.507) ,0.628(0.033)( +91)( +91)
w585-Kris_50k 483-2-318 803 (1-2-0)(s=398-403,0.497) ,0.603(0.034)( +72)( +72)
w580-Kris_50k 512-6-294 812 (1-6-0)(s=403-403,0.500) ,0.634(0.033)( +95)( +95)
w574-Kris_50k 390-8-204 602 (1-7-5)(s=300-294,0.505) ,0.654(0.037)(+111)(+111)
w564-Kris_50k 354-3-250 607 (0-3-3)(s=294-310,0.486) ,0.585( )( +60)( +60)
w554-Kris_50k 341-3-267 611 (0-3-2)(s=298-310,0.490) ,0.560(0.039)( +42)( +42)
w544-Kris_50k 341-4-257 602 (0-4-1)(s=300-298,0.501) ,0.569 ( +48)( +48)
w534-Kris_50k 307-4-291 602 (0-4-5)(s=297-301,0.497) ,0.513(0.040)( +9)( +9)
w524-Kris_50k 318-3-308 629 (2-3-1)(s=319-307,0.510) ,0.508(0.039)( +5)( +5)
w514-Kris_50k 331-7-374 712 (0-6-4)(s=352-353,0.499) ,0.470(0.037)( -21)( -21)
w504-Kris_50k 261-2-366 629 (0-2-4)(s=296-331,0.472) ,0.417(0.039)( -58)( -58)
w494-Kris_50k 199-5-397 601 (1-5-1)(s=304-292,0.510) ,0.335(0.038)(-118)(-118)
w484-Kris_50k 204-5-399 608 (0-5-4)(s=308-295,0.511) ,0.340(0.038)(-115)(-115)
w474-Kris_50k 232-3-413 648 (0-3-3)(s=317-328,0.491) ,0.360(0.037)( -99)( -99)
w474-Kris_20k 582-1-228 811 (1-1-3)(s=405-405,0.500) ,0.718(0.031)(+162)
w474-Kris_10k 650-2- 92 744 (0-2-0)(s=371-371,0.500) ,0.875(0.024)(+338)
w464-Kris_50k 40-0-572 612 (0-0-0)(s=302-310,0.493) ,0.065(0.020)(-462)
w464-Kris_20k 168-2-444 614 (0-1-3)(s=315-297,0.515) ,0.275(0.035)(-168)(-429)
w464-Kris_10k 362-1-282 645 (0-1-0)(s=330-314,0.512) ,0.562(0.038)( +43)
w464-Kris_1k 16-0- 0 16 (0-0-0)(s= 8- 8,0.500) ,1.000(0.000)( )
w454-Kris_10k 300-2-331 633 (0-2-0)(s=328-303,0.520) ,0.476(0.039)( -17)(-489)
w444-Kris_50k 89-1-540 630 (0-1-0)(s=310-319,0.493) ,0.142(0.027)(-312)
w444-Kris_10k 498-0-244 742 (0-0-2)(s=359-383,0.484) ,0.671(0.034)(+123)(-349)
w434-Kris_10k 508-3-225 736 (0-3-1)(s=374-359,0.510) ,0.692(0.033)(+140)(-332)
w424-Kris_10k 475-5-242 722 (0-5-1)(s=341-376,0.476) ,0.661(0.035)(+116)(-356)
w400-Kris_10k 454-2-216 672 (0-2-1)(s=330-340,0.493) ,0.677(0.035)(+128)(-344)
w350-Kris_10k 505-2-363 870 (0-2-3)(s=430-438,0.495) ,0.582(0.033)( +57)(-415)
w300-Kris_10k 1003-9-778 1790(0-9-4)(s=923-858,0.518) ,0.563(0.023)( +43)(-429)
w250-Kris_10k 831-7-944 1782(0-7-2)(s=883-892,0.497) ,0.468(0.023)( -22)(-494)
w200-Kris_10k 320-0-395 715 (0-0-4)(s=351-364,0.491) ,0.448(0.036)( -36)(-508)
w150-Kris_10k 84-0-684 768 (0-0-2)(s=390-378,0.508) ,0.109(0.022)(-364)(-836)
w150-Kris_5k 303-1-608 912 (0-0-1)(s=447-464,0.491) ,0.333(0.031)(-120)
w150-Kris_1k 725-2-272 999 (0-1-5)(s=499-498,0.501) ,0.727(0.028)(+169)
w100-Kris_10k 34-0-965 999 (0-0-0)(s=497-502,0.497) ,0.034(0.011)(-581)
w100-Kris_5k 100-0-880 980 (0-0-0)(s=484-496,0.494) ,0.102(0.019)(-377)
w100-Kris_5k 95-1-764 860 (0-1-0)(s=421-438,0.490) ,0.111(0.021)(-361)
w100-Kris_1k 437-1-561 999 (0-1-1)(s=503-495,0.504) ,0.438(0.031)( -43)(-1048)
w075-Kris_1k 93-0-906 999 (0-0-0)(s=500-499,0.501) ,0.093(0.018)(-395)(-1400)
w050-Kris_1k 0-0-264 264 (0-0-0)(s=132-132,0.500) ,0.000(0.000)( )
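The "Adjusted ELO" column seems to place every node setting on the "Kristallweizen 50k" scale by chaining through a weight that was measured at two settings. For example, w600 scores +144 at 50k and -44 at 100k, so results at 100k are shifted by +188. The sketch below reproduces the shifts visible in the table; which weights were actually used for the chaining is a guess, and the names `ELO`, `CHAIN` and `build_offsets` are invented here.

```python
# Raw "ELO" column values copied from the table, keyed by (weight, nodes).
ELO = {
    ("w600", "50k"): 144, ("w600", "100k"): -44,
    ("w474", "50k"): -99, ("w474", "20k"): 162,
    ("w464", "20k"): -168, ("w464", "10k"): 43,
    ("w150", "10k"): -364, ("w150", "1k"): 169,
}

# (setting to calibrate, already calibrated setting, weight linking them).
# The choice of linking weights is an assumption that fits the table.
CHAIN = [
    ("100k", "50k", "w600"),
    ("20k", "50k", "w474"),
    ("10k", "20k", "w464"),
    ("1k", "10k", "w150"),
]


def build_offsets():
    """Offset to add to a raw Elo at each node setting (50k is the zero point)."""
    offsets = {"50k": 0}
    for setting, reference, weight in CHAIN:
        offsets[setting] = (
            offsets[reference] + ELO[(weight, reference)] - ELO[(weight, setting)]
        )
    return offsets


offsets = build_offsets()
print(offsets)

# w675 scored -53 against Kristallweizen 100k.
adjusted = -53 + offsets["100k"]
print(adjusted, 2600 + adjusted)  # adjusted Elo, and floodgate-scale rating
```

Adding 2600 follows the earlier statement that "Kristallweizen 50k" is assumed to be 2600 points on the floodgate scale. Each link in the chain carries its own measurement error, so the adjusted values at 1k or 10k are less reliable than those at 50k.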

sbbdms commented 5 years ago

I downloaded the weight file above and found that AobaZero 1.4 cannot run it. I then found that it is in the wrong format (the usual weight format starts with a number, usually "2"). I deleted the characters preceding the "2" at the start of the weight file, but it still would not run. Then I replaced the string "0." with "." as in the usual weight file format, and it finally ran. Next, I started a self-play game in Shogidokoro with the options "bin\aobaz -i -q -p 3200 -w weight.txt". However, I witnessed a very strange game, as in the CSA file below. Did I do something wrong?
LR0.0001 weight self-play p3200.txt

yssaya commented 5 years ago

I checked the CSA file. It is strange. I think the weight is not correct. Could you download it again and retry?

Download and unpack.
$ tar xvf 20190722_193305log_win500k_lr00001_wd00002_m64_iter_856000.tar.bz2

File size is
77934854 Byte, 20190722_193305log_win500k_lr00001_wd00002_m64_iter_856000.tar.bz2
269304438 Byte, 20190722_193305log_win500k_lr00001_wd00002_m64_iter_856000.txt
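A quick way to rule out a truncated or corrupted download is to compare the files on disk with the sizes listed above. This is a small sketch using only the Python standard library; the `check_sizes` helper is hypothetical and not part of AobaZero.

```python
import os

# Expected sizes in bytes, as listed in this comment.
EXPECTED_SIZES = {
    "20190722_193305log_win500k_lr00001_wd00002_m64_iter_856000.tar.bz2": 77934854,
    "20190722_193305log_win500k_lr00001_wd00002_m64_iter_856000.txt": 269304438,
}


def check_sizes(directory, expected=EXPECTED_SIZES):
    """Return {filename: (matches, actual_size)}; actual_size is None if missing."""
    report = {}
    for name, size in expected.items():
        path = os.path.join(directory, name)
        actual = os.path.getsize(path) if os.path.isfile(path) else None
        report[name] = (actual == size, actual)
    return report
```

Calling `check_sizes(".")` in the download directory reports, for each file, whether its size matches. A size match does not prove the content is intact, but a mismatch is a clear sign that the download should be repeated.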

$ bin/aobaz -i -p 3200 -w ~/20190722_193305log_win500k_lr00001_wd00002_m64_iter_856000.txt

First move for p3200 should be +2726FU, not +3938GI.

sbbdms commented 5 years ago

Thanks, I re-downloaded the weight and found it is in the correct format. The earlier problem may have been caused by a network issue. According to the filename, this weight was trained one month ago. Are you still considering decreasing the learning rate? Are there any side effects?

yssaya commented 5 years ago

Yes. The current learning rate is 0.001 with mini-batch = 64. We are planning learning rate = 0.02 with mini-batch = 4096, the same as in the AlphaZero paper. They dropped from 0.2 to 0.02 around 3420k games.

This is the Elo progress over the most recent 850k games. It looks like about +4 Elo per 100k games.
[Graph: w600_w685]
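One way to put a number on such a trend is an ordinary least-squares slope over the adjusted Elo values. The sketch below uses the w600 to w675 rows from the results table earlier in this thread (w680 and w685 are not listed there) and assumes one weight per 10k games, which follows from the earlier note that five weights correspond to 50,000 game records. It illustrates the method only and is not how the figure quoted above was obtained.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Adjusted Elo for w600, w605, ..., w675, copied from the table.
ADJUSTED = [144, 158, 123, 156, 117, 152, 176, 133,
            139, 162, 167, 131, 155, 165, 201, 135]

GAMES_PER_WEIGHT = 10_000  # assumption: 5 weights per 50k games

# x axis in units of 100k games since w600 (weights are 5 apart).
xs = [5 * i * GAMES_PER_WEIGHT / 100_000 for i in range(len(ADJUSTED))]

elo_per_100k = ols_slope(xs, ADJUSTED)
print(f"{elo_per_100k:+.2f} Elo per 100k games")
```

Because each point has a 95% margin of roughly ±25 Elo, the fitted slope is itself quite uncertain.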

sbbdms commented 5 years ago

> They dropped from 0.2 to 0.02 around 3420k games.

We have around 3480k games and 700 weights in total right now. Has the learning rate/mini-batch been adjusted yet? If not, roughly when will it be adjusted?

yssaya commented 5 years ago

We use Caffe and plan to use iter_size=64 and minibatch=64, which gives a pseudo mini-batch of 4096. But we have found that the loss behaves strangely, and we are checking this. So the current learning rate is unchanged: lr=0.001, minibatch=64, iter_size=1.
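The idea behind a pseudo mini-batch is gradient accumulation: gradients from iter_size consecutive mini-batches are averaged before a single weight update, which for a mean-reduced loss gives the same gradient as one batch of iter_size × minibatch samples. The toy sketch below demonstrates this with a one-parameter least-squares model in plain Python. It is not Caffe code, and the function names are illustrative.

```python
import random


def grad_mse(w, batch):
    """Gradient of the mean squared error of y ~ w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, minibatch, iter_size):
    """Average the gradients of iter_size consecutive mini-batches."""
    total = 0.0
    for i in range(iter_size):
        chunk = data[i * minibatch:(i + 1) * minibatch]
        total += grad_mse(w, chunk)
    return total / iter_size


MINIBATCH, ITER_SIZE = 64, 64
rng = random.Random(0)
data = [(rng.uniform(-1, 1), rng.uniform(-1, 1))
        for _ in range(MINIBATCH * ITER_SIZE)]

print(MINIBATCH * ITER_SIZE)  # effective batch size
print(grad_mse(0.3, data), accumulated_grad(0.3, data, MINIBATCH, ITER_SIZE))
```

The two printed gradients agree up to floating-point rounding. In a real network the equivalence can break where layers depend on per-batch statistics, so a loss that looks different under accumulation is worth investigating.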

sbbdms commented 4 years ago

It is October now. It seems that the learning rate still remains unchanged: the strength reported by floodgate and the results against Kristallweizen have shown no obvious change for 2 to 3 months. Is the cause of the strange loss still unknown?

yssaya commented 4 years ago

The problem is not solved yet. The rating against Kristallweizen shows a downward trend over the latest three data points, though it is hard to say that the improvement has stopped. But yes, maybe we should change the learning rate.