As discussed in #814 and #867, I downloaded two weight files:
72ea669da5f491458cb5dfc44a80f0e760a9df71a8d0690b185822de49e00cb3.txt
af3c6e330b932c97b0e517ac7db544d37fe1d18184f101d4b0340ff02f51ce88.txt
I then made a "hybrid" weight file, af3-72e_1-1.txt, by averaging the parameters of the two weight files above.
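For reference, here is a minimal sketch of how such an element-wise average could be computed. It assumes the weight file is plain text whose first line is a format version header and whose remaining lines each hold space-separated floats for one layer; that layout and the function names `average_weight_lines` and `average_weight_files` are assumptions for illustration, not the exact script I used.

```python
def average_weight_lines(lines_a, lines_b):
    """Average two weight files given as lists of text lines.

    Assumption: line 0 is a version header (copied unchanged) and every
    other line holds space-separated floats for one layer.
    """
    if len(lines_a) != len(lines_b):
        raise ValueError("weight files have different numbers of lines")
    if lines_a[0].strip() != lines_b[0].strip():
        raise ValueError("weight files have different version headers")

    merged = [lines_a[0].strip()]
    for row_a, row_b in zip(lines_a[1:], lines_b[1:]):
        vals_a = [float(v) for v in row_a.split()]
        vals_b = [float(v) for v in row_b.split()]
        if len(vals_a) != len(vals_b):
            raise ValueError("layer sizes differ between the two files")
        # 1:1 average of every parameter in this layer.
        merged.append(" ".join(repr((a + b) / 2) for a, b in zip(vals_a, vals_b)))
    return merged


def average_weight_files(path_a, path_b, path_out):
    """Read two weight files, average them, and write the hybrid file."""
    with open(path_a) as fa, open(path_b) as fb:
        merged = average_weight_lines(fa.read().splitlines(),
                                      fb.read().splitlines())
    with open(path_out, "w") as out:
        out.write("\n".join(merged) + "\n")
```

Calling `average_weight_files` with the two downloaded files as inputs and af3-72e_1-1.txt as the output path would produce a 1:1 hybrid, provided both networks have the same architecture.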
Then I started a new game, played a few moves, and compared the heatmaps of the three weights in the same position.
I found that the hybrid weight's output is very close to the average of the two original weights' outputs; see the results below.
So in my opinion, because the network is linear, averaging the networks' parameters is equivalent to averaging the networks' outputs. A "hybrid" weight may therefore be equivalent to ensembling many networks and averaging their outputs, a method that is widely used to make predictions more accurate.
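A toy check of this argument, as a sketch with made-up 2x2 matrices rather than real network weights: for a purely linear map, the averaged weights give exactly the averaged output. Once a ReLU is applied (the real network has ReLU layers), the equality is no longer exact, so for the real network it can only hold approximately, presumably best when the two weight sets are close to each other.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def relu(v):
    """Element-wise ReLU."""
    return [max(0.0, t) for t in v]


def average_matrices(A, B):
    """Element-wise 1:1 average of two matrices."""
    return [[(a + b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def average_vectors(u, v):
    """Element-wise 1:1 average of two vectors."""
    return [(a + b) / 2 for a, b in zip(u, v)]


# Made-up example weights and input, for illustration only.
W_a = [[1.0, 2.0], [0.0, 1.0]]
W_b = [[3.0, 0.0], [2.0, -1.0]]
x = [1.0, -2.0]
W_hybrid = average_matrices(W_a, W_b)

# Purely linear layer: hybrid output equals the mean of the two outputs.
linear_hybrid = matvec(W_hybrid, x)                               # [0.0, 1.0]
linear_mean = average_vectors(matvec(W_a, x), matvec(W_b, x))     # [0.0, 1.0]

# Same layer followed by ReLU: the two results differ.
relu_hybrid = relu(matvec(W_hybrid, x))                           # [0.0, 1.0]
relu_mean = average_vectors(relu(matvec(W_a, x)),
                            relu(matvec(W_b, x)))                 # [1.5, 2.0]

print(linear_hybrid == linear_mean)  # True
print(relu_hybrid == relu_mean)      # False
```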
https://github.com/gcp/leela-zero/issues/908