zakki / Ray

Computer Go Program. Download: https://github.com/zakki/Ray/releases
BSD 2-Clause "Simplified" License

Some quick fixes to strengthen the new Rn #87

Open wpstmxhs opened 7 years ago

wpstmxhs commented 7 years ago

As you know, I'm running tests on CGOS under the name 'ZS-Rn49-RvX-xxx' (RvX stands for revision X).

I found a life-and-death bug in the GNU Go safety check. It makes Ray play tenuki and lets a dragon die.

I fixed it and made it more stable.

OWL thinks the two dragons are alive, but they are actually critical and need one more move to live.

I added the following code after critical[n] = safety_map[safety]; :

    if (critical[n] == 2 && color == my_color && dragon2[d].weakness > 0.7) { // FIX
        critical[n] = 3;
    }

Yes, I made GNU Go quite pessimistic about my own dangerous dragons.

In the tests I have run so far, Ray no longer seems to play tenuki and get its own dragons killed.

My Ray now prefers safe dragons more than before.
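For readers who want to try the override in isolation, here is a minimal sketch of the same condition as a standalone function. The function name, the is_my_dragon flag, and the WEAKNESS_LIMIT macro are hypothetical stand-ins for critical[n], the color == my_color check, and dragon2[d].weakness from the snippet above; what the status values 2 and 3 mean depends on safety_map, which is not shown in this thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the 0.7 literal used in the snippet above. */
#define WEAKNESS_LIMIT 0.7f

/* Returns the status that would be stored in critical[n] after the override.
 * status       : value taken from safety_map[safety]
 * is_my_dragon : true when the dragon's color equals my_color
 * weakness     : stand-in for dragon2[d].weakness */
int pessimistic_status(int status, bool is_my_dragon, float weakness)
{
    /* Only my own dragons with status 2 and weakness above the limit
     * are bumped to 3. Everything else passes through unchanged. */
    if (status == 2 && is_my_dragon && weakness > WEAKNESS_LIMIT) {
        return 3;
    }
    return status;
}
```

Opponent dragons and dragons with any other status are left alone, so the extra pessimism only applies to my own side.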

Please also note that I changed value_scale to 0.7 and EXPAND_THRESHOLD_19 to 20,

and tuned the random simulation a little (restricted it to moves whose rate >= maxrate / 5, and disabled LGR memorization of random playout moves).
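The rate restriction could look roughly like the sketch below. This is not the code from Ray: the function name, the array layout, and the use of double for rates are assumptions made for illustration, and the extra rate > 0 check (to skip moves that have no weight at all) is also an assumption.

```c
#include <stddef.h>

/* Marks which candidate moves remain eligible for random selection.
 * rate[i]     : playout weight of candidate move i (hypothetical layout)
 * eligible[i] : set to 1 if move i passes the filter, otherwise 0
 * Returns the number of eligible moves. */
size_t filter_by_rate(const double rate[], size_t n, int eligible[])
{
    double maxrate = 0.0;
    size_t kept = 0;

    /* First pass: find the highest rate among the candidates. */
    for (size_t i = 0; i < n; i++) {
        if (rate[i] > maxrate) {
            maxrate = rate[i];
        }
    }

    /* Second pass: keep only moves with rate >= maxrate / 5. */
    for (size_t i = 0; i < n; i++) {
        eligible[i] = (rate[i] > 0.0 && rate[i] >= maxrate / 5.0) ? 1 : 0;
        kept += (size_t)eligible[i];
    }
    return kept;
}
```

With rates {100, 20, 19, 0, 50}, the cutoff is 20, so the moves rated 100, 20 and 50 stay eligible and the other two are dropped.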

This is just a short report; I wanted to share my experiment results with you.

wpstmxhs commented 7 years ago

Oh my, I forgot to say one thing.

I also changed the GNU Go OWL call frequency to 300 playouts (it was 1000 playouts before). I thought 1000 playouts was too many.
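As a rough illustration of what such a setting might control, the sketch below triggers the analysis once every fixed number of playouts. This reading of "call frequency" is an assumption: the real code may instead use a per-node visit threshold. The macro and function names are hypothetical and do not come from Ray or GNU Go.

```c
/* Hypothetical name; 300 is the new value mentioned above (previously 1000). */
#define OWL_CALL_INTERVAL 300

/* Returns 1 when the safety analysis should run after this many playouts,
 * assuming it runs once every OWL_CALL_INTERVAL playouts. */
int should_call_owl(int playouts_done)
{
    return playouts_done > 0 && playouts_done % OWL_CALL_INTERVAL == 0;
}
```

Under this assumption, a smaller interval means the safety information is refreshed more often during the search, at the cost of more OWL calls.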

zakki commented 7 years ago

I tried value_scale = 0.95, and it made the program a little weaker. Rn 3.6 becomes very weak with both value_scale = 0.1 and value_scale = 0.9.

I think this means the optimal value_scale for Rn 4.9 lies somewhere in 0.5 < value_scale < 0.95.

wpstmxhs commented 7 years ago

Yes, I agree.

I think value_scale should be larger than 0.5, because the new value network is much more accurate than MC simulation. 0.7 seems good.
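To make the role of value_scale concrete, here is a minimal sketch that assumes it is a linear mixing weight between the value network output and the Monte Carlo win rate, as in AlphaGo-style evaluation. Whether Ray combines the two in exactly this way is an assumption, and the function and parameter names are hypothetical.

```c
/* Blends two win-rate estimates, both expected to lie in [0, 1].
 * value_net   : estimate from the value network
 * mc_winrate  : estimate from Monte Carlo playouts
 * value_scale : weight given to the value network, in [0, 1] */
double mixed_value(double value_net, double mc_winrate, double value_scale)
{
    return value_scale * value_net + (1.0 - value_scale) * mc_winrate;
}
```

Under this assumption, value_scale = 0.7 means the value network contributes 70% of the evaluation and the playouts 30%, so any setting above 0.5 trusts the network more than the simulations.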

EXPAND_THRESHOLD_19 should also be changed. It makes playouts faster.

a22063821 commented 7 years ago

@wpstmxhs

> I also changed GNU Go OWL call frequency to 300 playouts (It was 1000 playouts before). I thought 1000 playouts was too big.

I could not find the 1000-playout setting in OWL.C. Can you tell me where it is?

> and tuned random simulation a little(restricted to moves which rate >= maxrate / 5 and disabled LGR memorization of random playout moves).

How do I change it?