zakki / Ray

Computer Go program. Download: https://github.com/zakki/Ray/releases
BSD 2-Clause "Simplified" License

Do you have any plans or preparations for a new version of Ray now that Google has released AlphaGo Zero? #133

Closed trainewbie closed 6 years ago

trainewbie commented 6 years ago

Hello, zakki.

AlphaGo Zero was developed further with new algorithms and reinforcement learning. Is there any plan to upgrade Ray in a similar way? I am curious about Ray's development, because there is no mention of a patch here. Reinforcement learning needs a lot of computing resources, but are you preparing something?

Regards.

zakki commented 6 years ago

Many of the DCNN updates from AlphaGo Zero have already been adopted: shared policy network (PN) and value network (VN) weights, batch normalization, a deep ResNet, etc.

Could you generate 100k games using the https://github.com/zakki/Ray/tree/generate-kifu5 branch?

yes '_genkifu' | head -100 | ../ray --thread 4 --reuse-subtree --playout 2000 > log.txt 2> err.txt

I generated 10k games some months ago, but that is far from enough.
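
For readers unfamiliar with the command in this comment: yes '_genkifu' prints the same line forever, head cuts that stream off after the given number of lines, and Ray reads the result on standard input as a sequence of commands. The sketch below reproduces only the first two stages, so the effect of the head count can be checked without Ray installed. The function name genkifu_commands is made up for illustration, and the idea that each _genkifu line produces one game is an assumption drawn from this thread, not from Ray's documentation.

```shell
# Emit N copies of the _genkifu command, as the pipeline does before
# handing them to Ray. genkifu_commands is a hypothetical helper name.
genkifu_commands() {
  yes '_genkifu' | head -n "$1"
}

# Preview what Ray would receive on stdin for a 3-command run.
genkifu_commands 3
```

Under that assumption, raising the head count simply requests more games from a single Ray process.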

trainewbie commented 6 years ago

Wow, 100k games? What a heavy mission! God bless me :) I'll try to generate as many games as possible, but I can't promise anything.

Is there a simple way to use multiple GPUs and generate games in a Windows 7 environment? Should I run multiple processes (using the --device-id option) in different folders, or use --no-gpu? Do I need an emulator (Cygwin) or a script?

As an experiment, I just ran the _genkifu command (generate-kifu5) in a GTP shell, and all I got was a log.txt file with lots of information. Where is the SGF file or the saved data? Is log.txt everything?

zakki commented 6 years ago

I start multiple processes in different directories, on Ubuntu on EC2 or under Cygwin. Perhaps a friendlier interface is needed, but I don't have a strong motivation to build one yet.

time yes '_genkifu' | head -100 | ../ray --thread 2 --reuse-subtree --playout 4000 --device-id 0 > log-$i-$DATE.txt 2> err-$i-$DATE &
time yes '_genkifu' | head -100 | ../ray --thread 2 --reuse-subtree --playout 4000 --device-id 1 > log-$i-$DATE.txt 2> err-$i-$DATE &
time yes '_genkifu' | head -100 | ../ray --thread 2 --reuse-subtree --playout 4000 --device-id 2 > log-$i-$DATE.txt 2> err-$i-$DATE &
time yes '_genkifu' | head -100 | ../ray --thread 2 --reuse-subtree --playout 4000 --device-id 3 > log-$i-$DATE.txt 2> err-$i-$DATE &
...
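
The near-identical lines above can be folded into a loop. The following is a minimal sketch, not zakki's actual script. It assumes one working directory per GPU (run-0, run-1, ... are made-up names) and the ray binary one level up, as in the commands above, and it uses only flags that appear in this thread. RAY_BIN, GAMES, run_one and launch_all are illustrative names.

```shell
# Assumed defaults; override them in the environment if needed.
RAY_BIN=${RAY_BIN:-../ray}        # path to ray, relative to each run directory
GAMES=${GAMES:-100}               # number of _genkifu commands per process
DATE=$(date +%Y%m%d-%H%M%S)       # timestamp used in the output file names

# Run one self-play process on the given GPU, inside the current directory.
run_one() {
  device=$1
  yes '_genkifu' | head -n "$GAMES" |
    "$RAY_BIN" --thread 2 --reuse-subtree --playout 4000 --device-id "$device" \
      > "log-$device-$DATE.txt" 2> "err-$device-$DATE"
}

# Start one background process per GPU, each in its own directory,
# then wait until all of them have finished.
launch_all() {
  num_gpus=$1
  i=0
  while [ "$i" -lt "$num_gpus" ]; do
    mkdir -p "run-$i"
    ( cd "run-$i" && run_one "$i" ) &
    i=$((i + 1))
  done
  wait
}

# Example: launch_all 4
```

Calling launch_all 4 would mirror the four processes shown above, with each log ending up in its own run directory.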

log.txt contains one SGF game record per line:

= (;GM[1]FF[4]CA[UTF-8]RU[Chinese]SZ[19]KM[7.5]PW[Rn]PB[Rn]GN[281]RE[W+];B[qd];W[pp];B[dc];W[cp];B[eq];W[de];B[cl];W[cc];B[ci];W[db];B[ec];W[eb];B[cd];W[bc];B[dd];W[fc];B[fd];W[gd];B[gc];W[fb];B[fe];W[hd];B[bd];W[cn];B[do];W[co];B[cr];W[eo];B[gq];W[oc];B[ep];W[dl];B[dk];W[dm];B[ek];W[fo];B[ld];W[pe];B[qe];W[pf];B[qg];W[qc];B[pg];W[qf];B[pd];W[od];B[rf];W[rg];B[re];W[pc];B[rh];W[og];B[oh];W[ng];B[jc];W[ib];B[rc];W[ne];B[ie];W[id];B[jd];W[ad];B[ae];W[ac];B[bf];W[gf];B[ig];W[ff];B[ef];W[hf];B[ic];W[hc];B[jb];W[eg];B[df];W[qn];B[pr];W[qq];B[mq];W[ip];B[go];W[gn];B[ho];W[lp];B[mp];W[ln];B[mo];W[jn];B[hn];W[mn];B[lo];W[ko];B[jq];W[iq];B[bq];W[gm];B[hm];W[gl];B[qr];W[qk];B[nn];W[nm];B[nh];W[mg];B[hb];W[gb];B[ia];W[oq];B[or];W[no];B[nq];W[on];B[jp];W[jr];B[lq];W[io];B[bm];W[dr];B[hr];W[fp];B[fq];W[ir];B[hl];W[gk];B[hk];W[gj];B[km];W[jm];B[jl];W[kl];B[jk];W[gp];B[hp];W[hq];B[cq];W[dp];B[dq];W[ii];B[kk];W[ll];B[ki];W[ph];B[sg];W[qh];B[rg];W[pj];B[kg];W[mh];B[lk];W[rr];B[pq];W[rq];B[ml];W[nj];B[lm];W[kr];B[op];W[oo];B[lr];W[rb];B[kn];W[gr];B[hs];W[sc];B[rj];W[rk];B[qi];W[pi];B[se];W[lb];B[sk];W[sl];B[sj];W[qj];B[ri];W[ro];B[nl];W[ol];B[mc];W[mb];B[fh];W[fg];B[ei];W[bn];B[el];W[em];B[gh];W[hj];B[me];W[mf];B[rs];W[dg];B[cg];W[ee];B[cf];W[bl];B[bk];W[nk];B[ij];W[hh];B[hg];W[di];B[al];W[dh];B[dj];W[ch];B[bi];W[bh];B[ag];W[ah];B[bg];W[if];B[jf];W[ga];B[mm];W[nn];B[of];W[oe];B[sd];W[sb];B[bp];W[ji];B[jj];W[jh];B[jg];W[lf];B[lc];W[kb];B[kh];W[fl];B[ka];W[nc];B[bo];W[le];B[md];W[ao];B[an];W[am];B[qp];W[is];B[an];W[ks];B[po];W[am];B[fr];W[gs];B[cm];W[rp];B[pn];W[pm];B[dn];W[en];B[jo];W[in];B[im];W[lj];B[li];W[mj];B[ha];W[fs];B[es];W[an];B[ai];W[ap];B[aq];W[fj];B[ej];W[mi];B[ls];W[ke];B[je];W[kq];B[kp];W[om];B[nb];W[ob];B[cb];W[bb];B[la];W[ma];B[ja];W[ra]C[RAND];B[tt]C[0.005 0.003 0.003 0.003 0.008 0.003 0 0 0 0 0 0 0 0.008 0.005 0.003 0.003 0 0 0.003 0 0 0 0 0 0 0 0.005 0 0 0 0 0 0 0.005 0.058 0 0 0 0 0 0 0 0 0 0 0 0 0.054 0 0 0 0 0 0 0 0 0 0 0 0 0.012 0 0 0 0 0 0.005 0 0 0.003 0 0 0 
0.133 0 0 0.003 0.003 0 0 0 0.003 0.003 0 0 0 0 0 0 0 0 0 0 0 0.005 0 0 0 0 0 0 0 0 0 0.008 0 0 0 0 0 0 0 0.003 0 0 0 0 0 0 0.006 0 0 0 0 0.003 0 0 0 0 0 0 0 0 0 0 0 0.097 0 0 0 0.011 0 0 0.003 0 0 0 0 0 0 0.003 0 0 0 0 0 0.003 0.019 0.005 0 0 0 0 0 0.005 0.003 0 0 0 0.005 0.003 0.011 0.005 0 0 0 0 0 0 0 0.004 0 0 0 0.005 0 0 0 0 0.003 0 0.003 0 0 0.003 0 0 0.014 0 0 0 0.003 0 0.003 0.003 0 0 0 0 0.003 0 0 0 0 0 0 0.003 0 0.003 0.016 0 0 0 0.005 0.003 0.003 0 0 0 0 0 0 0.003 0 0 0 0 0 0 0 0 0 0 0.005 0.005 0.003 0 0 0 0.019 0 0.005 0 0 0 0 0 0 0 0 0 0 0 0.022 0.005 0 0 0 0.003 0 0 0 0 0 0 0.008 0 0 0 0 0 0.011 0 0.003 0 0 0 0 0 0 0 0 0 0 0 0.003 0 0.005 0 0.007 0 0 0.003 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.005 0 0 0 0.003 0.003 0.003 0 0 0.005 0 0 0.01 0 0 0 0 0.005 0.003 0 0 0 0 0.193 0.005 0.003 0.003 0.003 0 0 0 0.009 0 0 0 0 0.014 0.003 0.008 0.005 0.003 0 0.003 ])

= (;GM[1]FF[4]CA[UTF-8]RU[Chinese]SZ[19]KM[7.5]PW[Rn]PB[Rn]GN[271]RE[B+];B[qd];W[dc];B[dq];W[pp];B[de];W[cg];B[cc];W[gc];B[fd];W[cb];B[eb];W[cd];B[ce];W[bc];B[gd];W[hc];B[hd];W[jc];B[id];W[kc];B[be];W[dd];B[ee];W[ck];B[cm];W[di];B[fc];W[bd];B[ic];W[ib];B[jb];W[kb];B[hb];W[oc];B[ia];W[qb];B[qg];W[do];B[co];W[dp];B[cp];W[eq];B[dn];W[dr];B[cq];W[fp];B[qj];W[jq];B[qm];W[qo];B[or];W[pr];B[oq];W[pq];B[lq];W[ph];B[qh];W[no];B[lo];W[jo];B[lm];W[pl];B[ql];W[pj];B[pk];W[ok];B[qk];W[oj];B[nm];W[pe];B[pd];W[od];B[oe];W[pf];B[pg];W[of];B[og];W[nf];B[re];W[lj];B[kk];W[mh];B[cr];W[fr];B[ds];W[fn];B[kj];W[kh];B[mj];W[li];B[lk];W[ni];B[rn];W[ro];B[ps];W[qs];B[os];W[rr];B[pb];W[pc];B[rb];W[qc];B[rc];W[bl];B[jr];W[kr];B[kq];W[lr];B[jp];W[iq];B[mr];W[ir];B[ip];W[hp];B[io];W[mp];B[mq];W[qn];B[ke];W[le];B[lf];W[ld];B[qa];W[an];B[ah];W[bn];B[cn];W[fl];B[er];W[ho];B[in];W[ap];B[ar];W[fs];B[dk];W[dl];B[es];W[ek];B[ch];W[dh];B[bg];W[ci];B[bh];W[ob];B[kg];W[mf];B[jh];W[lg];B[kf];W[gh];B[ki];W[mi];B[lh];W[rm];B[rl];W[sn];B[gj];W[hl];B[mg];W[ng];B[gm];W[hk];B[hj];W[hn];B[hm];W[fm];B[il];W[ik];B[jk];W[ij];B[hi];W[hh];B[ii];W[eg];B[ls];W[js];B[np];W[op];B[nq];W[mo];B[eo];W[ep];B[en];W[fo];B[fj];W[fk];B[hf];W[pa];B[mn];W[ig];B[bm];W[am];B[ra];W[qe];B[rd];W[bi];B[cf];W[dg];B[ai];W[aj];B[oh];W[pm];B[on];W[oo];B[ol];W[pn];B[pi];W[cj];B[cl];W[aq];B[bq];W[sl];B[sk];W[sj];B[sm];W[bs];B[cs];W[sl];B[bk];W[rk];B[ej];W[dj];B[al];W[ri];B[qi];W[lg];B[rh];W[kh];B[rs];W[ss];B[lh];W[bj];B[mg];W[nh];B[rj];W[sk];B[si];W[sm];B[ka];W[mb];B[la];W[nl];B[ml];W[nk];B[om];W[nn];B[mm];W[if];B[mk];W[gf];B[ff];W[gg];B[fg];W[fh];B[ei];W[eh];B[ge];W[qf];B[rf];W[db];B[ma];W[ad];B[na];W[oa];B[nb];W[lb];B[nc];W[nd];B[mc];W[md];B[lp];W[jg]C[RAND];B[gn]C[0.008 0.003 0.007 0.096 0.008 0.005 0.008 0.001 0 0.005 0 0 0 0 0 0 0 0 0.008 0.008 0.005 0 0 0 0.007 0.008 0 0.008 0 0 0 0 0 0 0 0 0 0.006 0.008 0 0 0 0.005 0 0 0 0 0 0 0.001 0 0 0 0 0 0 0.003 0 0 0 0 0.003 0 0 0 0 0.009 0.002 0 0 0 0 0 0 0 0.002 0.028 0 0 0 0 0.006 0 0.008 
0.023 0.008 0 0 0.003 0.008 0 0 0 0 0.004 0.008 0.001 0 0.005 0.008 0 0 0 0 0.001 0 0 0 0 0 0 0 0 0.008 0.001 0 0 0 0 0 0 0.008 0 0 0 0.091 0 0 0 0 0 0.006 0.008 0 0 0 0 0 0 0 0 0.033 0 0.008 0 0 0 0 0.008 0 0 0.008 0 0 0 0 0 0.006 0.008 0 0 0.001 0 0 0 0 0.062 0 0 0.008 0 0 0 0 0 0 0 0 0 0 0.027 0 0 0 0.008 0 0 0 0 0 0.001 0 0 0 0 0 0.003 0 0 0 0 0 0 0 0 0 0 0 0 0 0.008 0 0 0.004 0 0.004 0 0 0.008 0.008 0.005 0 0 0 0 0 0 0 0 0 0 0.008 0.011 0 0 0 0.003 0.008 0.008 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.104 0 0 0.008 0.003 0.008 0 0 0 0 0 0 0 0.008 0.003 0 0 0 0 0.008 0 0 0 0.003 0 0 0 0 0 0 0 0.001 0 0.006 0 0 0 0 0.001 0 0 0 0.002 0 0 0 0 0 0.007 0.008 0.006 0 0 0 0 0 0.002 0.008 0.008 0 0 0 0 0 0 0 0 0.008 0.003 0.006 0 0.008 0 0.001 0 0 0.008 0.006 0 0 0 0 0 0.008 0 0 0.008 0 0.001 0.008 0 0 0 0 0 0.008 0.001 0.002 0 0.001 0 0.025 0.005 0 0 0 0 0 ];W[tt]C[0.003 0.003 0.003 0.003 0.006 0.006 0.006 0.028 0 0 0 0 0 0 0 0 0 0 0.006 0.003 0.003 0 0 0 0.003 0.006 0 0 0 0 0 0 0 0 0.003 0 0 0.003 0.003 0 0.003 0 0.008 0 0 0 0 0 0 0.003 0 0 0 0 0 0 0.003 0 0 0 0 0.003 0 0 0 0 0.003 0.014 0 0 0 0 0 0 0 0.003 0.101 0 0 0 0 0 0 0.003 0.003 0.003 0 0 0.009 0.003 0 0 0 0 0.003 0.003 0.003 0 0.003 0.008 0 0 0 0 0.02 0 0 0 0 0 0 0 0 0.009 0.003 0 0 0 0 0 0 0.003 0 0 0 0.148 0 0 0 0 0 0.003 0.006 0 0 0 0 0 0 0 0 0.003 0 0 0 0 0 0 0 0 0 0.003 0 0 0 0 0 0.003 0.003 0 0 0.009 0 0 0 0 0.003 0 0 0 0 0 0 0 0 0 0 0 0 0 0.003 0 0 0 0.003 0 0 0 0 0 0.003 0 0 0.003 0 0 0.003 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.003 0 0.003 0 0 0.006 0.003 0.003 0 0 0 0 0 0 0 0 0 0 0.003 0.022 0 0 0 0.003 0.017 0.003 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.006 0.003 0.003 0 0 0 0 0 0.003 0 0.003 0.006 0 0 0 0 0.324 0 0 0 0.003 0 0 0 0 0.003 0 0 0.003 0 0.003 0 0 0 0 0.003 0 0 0 0.003 0 0 0 0 0 0.003 0.003 0.003 0 0 0 0 0 0.003 0.003 0.003 0 0 0 0 0 0 0 0 0.003 0.006 0.003 0 0.003 0 0 0 0 0.003 0.003 0 0.003 0 0 0 0.006 0 0 0.009 0 0.009 0.003 0 0 0 0 0 0.003 0.003 0.003 0 0.003 0 0.006 0.003 0 0 0 0.003 0 ])

= (;GM[1]FF[4]CA[UTF-8]RU[Chinese]SZ[19]KM[7.5]PW[Rn]PB[Rn]GN[271]RE[B+];B[qd];W[dc];B[dq];W[pp];B[de];W[cg];B[cc];W[gc];B[fd];W[cb];B[eb];W[cd];B[ce];W[bc];B[gd];W[hc];B[hd];W[jc];B[id];W[kc];B[be];W[dd];B[ee];W[ck];B[cm];W[di];B[fc];W[bd];B[ic];W[ib];B[jb];W[kb];B[hb];W[oc];B[ia];W[qb];B[qg];W[do];B[co];W[dp];B[cp];W[eq];B[dn];W[dr];B[cq];W[fp];B[qj];W[jq];B[qm];W[qo];B[or];W[pr];B[oq];W[pq];B[lq];W[ph];B[qh];W[no];B[lo];W[jo];B[lm];W[pl];B[ql];W[pj];B[pk];W[ok];B[qk];W[oj];B[nm];W[pe];B[pd];W[od];B[oe];W[pf];B[pg];W[of];B[og];W[nf];B[re];W[lj];B[kk];W[mh];B[cr];W[fr];B[ds];W[fn];B[kj];W[kh];B[mj];W[li];B[lk];W[ni];B[rn];W[ro];B[ps];W[qs];B[os];W[rr];B[pb];W[pc];B[rb];W[qc];B[rc];W[bl];B[jr];W[kr];B[kq];W[lr];B[jp];W[iq];B[mr];W[ir];B[ip];W[hp];B[io];W[mp];B[mq];W[qn];B[ke];W[le];B[lf];W[ld];B[qa];W[an];B[ah];W[bn];B[cn];W[fl];B[er];W[ho];B[in];W[ap];B[ar];W[fs];B[dk];W[dl];B[es];W[ek];B[ch];W[dh];B[bg];W[ci];B[bh];W[ob];B[kg];W[mf];B[jh];W[lg];B[kf];W[gh];B[ki];W[mi];B[lh];W[rm];B[rl];W[sn];B[gj];W[hl];B[mg];W[ng];B[gm];W[hk];B[hj];W[hn];B[hm];W[fm];B[il];W[ik];B[jk];W[ij];B[hi];W[hh];B[ii];W[eg];B[ls];W[js];B[np];W[op];B[nq];W[mo];B[eo];W[ep];B[en];W[fo];B[fj];W[fk];B[hf];W[pa];B[mn];W[ig];B[bm];W[am];B[ra];W[qe];B[rd];W[bi];B[cf];W[dg];B[ai];W[aj];B[oh];W[pm];B[on];W[oo];B[ol];W[pn];B[pi];W[cj];B[cl];W[aq];B[bq];W[sl];B[sk];W[sj];B[sm];W[bs];B[cs];W[sl];B[bk];W[rk];B[ej];W[dj];B[al];W[ri];B[qi];W[lg];B[rh];W[kh];B[rs];W[ss];B[lh];W[bj];B[mg];W[nh];B[rj];W[sk];B[si];W[sm];B[ka];W[mb];B[la];W[nl];B[ml];W[nk];B[om];W[nn];B[mm];W[if];B[mk];W[gf];B[ff]; ...
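
Because the SGF records may be mixed with other output in log.txt, they need to be pulled out before they can be used for training. The following is a minimal sketch, assuming each record starts with the "= " reply prefix seen in the samples above and ends with a closing parenthesis. It also joins records that wrap onto a following line, as the first two samples do in this listing. extract_sgf and count_games are hypothetical helper names, not part of Ray.

```shell
# Print one complete SGF record per output line from a genkifu log file.
extract_sgf() {
  awk '
    # A line starting with "= (;" opens a new record; drop the "= " prefix.
    /^= \(;/            { buf = substr($0, 3); in_rec = 1 }
    # Any other line seen while a record is open is a wrapped continuation.
    in_rec && !/^= \(;/ { buf = buf $0 }
    # The record is complete once the buffer ends with ")".
    in_rec && buf ~ /\)[ \t\r]*$/ { print buf; in_rec = 0 }
  ' "$1"
}

# Count the complete records in a log file.
count_games() {
  extract_sgf "$1" | grep -c '^(;'
}

# Example: extract_sgf log.txt > games.sgf; count_games log.txt
```

A record that was cut off by a crash never reaches its closing parenthesis, so this sketch drops it rather than emitting a partial game.
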
jillybob commented 6 years ago

I only run Windows. I can help you generate many games but the kifu generator must work with Windows. I run Ray on Sabaki.

trainewbie commented 6 years ago

Following your guide above, I have just started generating games.

Is 4000 playouts the recommended setting? If I set the head count above 100 (200, 500, 1000, 2000, ...), is there any problem with the data? And what should I do if I want to suspend generation temporarily for personal reasons? Just quit?

Regards.

zakki commented 6 years ago

Judging by the strength of the SL policy network in the first AlphaGo (AlphaGo Fan), I think at least 1000 playouts are needed. 4000 playouts probably gives a shallower search tree than AlphaGo Zero's 1600 simulations without rollouts, but more is too slow for my PC. http://www.yss-aya.com/cgos/19x19/cross/Rn.4.20-4k.html

I'm trying 2000 playouts as a first prototype.

BTW, Leela has started a reimplementation of Zero. https://github.com/gcp/leela-zero

trainewbie commented 6 years ago

Whether to generate with 2k, 4k, or more playouts (po) is another serious question, because it affects both the time spent and the quality of the generated data. For now, I will continue generating data at 4k po. If you need 2k po data, I'll switch.

I have already visited the leela-zero repository on GitHub. He (gcp) seems to be preparing a distributed system. His plan does not look bad either.

Regards.

zakki commented 6 years ago

> I lost about 5k~10k games because the program randomly stopped. In some cases the program stopped without a Windows error message, and in other cases generation terminated before reaching the configured number of games. I do not know whether it was a problem with my system or with the program.

I found a freeze bug. https://github.com/zakki/Ray/commit/3512cf613a50d50feee456c84643776cdbd53cee

> I wonder if the generated data will be useful for Ray's learning. It's less than 100k games, but shall I send it?

Now I'm training PN and VN with 350k generated games. An additional 25k games would be useful. AlphaGo Zero's network with 256 filters and 40 residual blocks is too big to train on a single GPU, so I'm trying one with 128 filters and 12 blocks. If that is strong enough, I plan to generate more games and train a larger network.
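
To put the two network sizes in perspective, here is a rough back-of-the-envelope count of the weights in the residual tower. It assumes the AlphaGo Zero block layout (two 3x3 convolutions per residual block, each with F input and F output channels) and ignores biases, batch-normalization parameters, the input convolution and the output heads. Whether Ray's blocks follow exactly this layout is an assumption, and tower_weights is a made-up helper name.

```shell
# Approximate weight count of a residual tower:
# 2 convolutions per block * 3x3 kernel * F input * F output channels * blocks.
tower_weights() {
  filters=$1
  blocks=$2
  echo $((2 * 9 * filters * filters * blocks))
}

big=$(tower_weights 256 40)     # AlphaGo Zero-sized network mentioned above
small=$(tower_weights 128 12)   # the smaller network being tried
echo "256 filters, 40 blocks: $big weights"
echo "128 filters, 12 blocks: $small weights"
echo "ratio: about $((big / small))x"
```

Under these assumptions the tower shrinks from roughly 47 million weights to roughly 3.5 million, about 13 times fewer, which is consistent with the smaller network being trainable on a single GPU.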

trainewbie commented 6 years ago

OK, I'll send the data (log.txt.zip). Check your e-mail in a few minutes.

I'm sorry that I accidentally erased my comment above. And thanks for fixing the freeze bug.

trainewbie commented 6 years ago

The fixed genkifu5 version, with the freeze bug solved, works well. The number of games generated per unit of time has also increased.

Now I can easily generate games while doing other things. :)