SakanaAI / evolutionary-model-merge

Official repository of Evolutionary Optimization of Model Merging Recipes
Apache License 2.0

Questions regarding DFS #8

Open bibisbar opened 4 months ago

bibisbar commented 4 months ago

I still find it hard to understand how to design the search space of DFS. Could you share an explanation or a demo? Code would be best. Thanks!

lerrytang commented 4 months ago

Thanks for the question. The following code snippet is for illustration; I hope it helps.

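def forward(...):  # LLM forward func
    ...
    # Interpret settings; `params` is the flat vector that CMA-ES optimizes.
    # First block of `params`: one entry per intermediate hop.
    # A positive entry keeps that hop in the inference path.
    ss = 0
    ee = ss + config.num_hops - 2
    layer_idx = np.argwhere(params[ss:ee] > 0).ravel()
    # Map the kept hop positions onto layer indices.
    layer_idx = layer_idx % config.num_hidden_layers
    # The path always starts at layer 0 and ends at layer 31 (hard-coded here).
    layer_idx = [0,] + layer_idx.tolist() + [31,]
    # Second block of `params`: one scale offset per (previous layer, next layer) pair.
    ss = ee
    ee = ss + config.num_hidden_layers**2
    scales = params[ss:ee].reshape([config.num_hidden_layers, -1])
    # Offsets are added to 1, so a zero parameter means "no scaling".
    scales = np.ones_like(scales) + scales

    # Pass data through layers.
    prev_layer_ix = -1
    for i, layer_ix in enumerate(layer_idx):
        if prev_layer_ix < 0:
            # The first layer receives the input unscaled.
            scale = 1
        else:
            scale = scales[prev_layer_ix][layer_ix]
        layer = self.layers[layer_ix]
        # Scale hidden_state and pass it through layer
        prev_layer_ix = layer_ix
    ...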
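For anyone who wants to run the idea end to end, here is a minimal, self-contained sketch of the same decoding logic on a toy problem. It is not the repository's code. The helper names `decode_dfs_params` and `run_path` are hypothetical, the "layers" are plain Python callables rather than transformer blocks, the last layer index is taken as `num_hidden_layers - 1` instead of the hard-coded 31, and multiplying the hidden state by the scale before calling the layer is an assumption based on the comment in the snippet above.

```python
import numpy as np


def decode_dfs_params(params, num_hops, num_hidden_layers):
    """Split a flat parameter vector into a layer path and a scale matrix.

    Hypothetical helper: mirrors the snippet above, but uses
    num_hidden_layers - 1 as the final layer instead of a hard-coded 31.
    """
    params = np.asarray(params, dtype=float)
    n_slots = num_hops - 2  # the first and last hops are fixed
    expected = n_slots + num_hidden_layers ** 2
    if params.size != expected:
        raise ValueError(f"expected {expected} parameters, got {params.size}")

    # A positive entry keeps its slot; the slot position selects the layer.
    kept = np.argwhere(params[:n_slots] > 0).ravel() % num_hidden_layers
    path = [0] + kept.tolist() + [num_hidden_layers - 1]

    # scales[i][j] is applied when moving from layer i to layer j.
    offsets = params[n_slots:].reshape(num_hidden_layers, num_hidden_layers)
    scales = 1.0 + offsets
    return path, scales


def run_path(hidden, layers, path, scales):
    """Pass `hidden` through `layers` in the order given by `path`."""
    prev = -1
    for ix in path:
        scale = 1.0 if prev < 0 else scales[prev][ix]
        # Assumption: the input is scaled, then fed to the layer.
        hidden = layers[ix](hidden * scale)
        prev = ix
    return hidden


# Toy setup: 4 "layers", where layer i simply adds i + 1 to its input.
num_hidden_layers, num_hops = 4, 8
toy_layers = [lambda h, b=i + 1: h + b for i in range(num_hidden_layers)]

slot_params = [1.0, -1.0, 1.0, -1.0, -1.0, 1.0]   # keep slots 0, 2 and 5
scale_params = [0.0] * num_hidden_layers ** 2      # all scales stay at 1
toy_params = np.array(slot_params + scale_params)

toy_path, toy_scales = decode_dfs_params(toy_params, num_hops, num_hidden_layers)
toy_output = run_path(0.0, toy_layers, toy_path, toy_scales)
print(toy_path, toy_output)
```

In this toy setup the search space has `num_hops - 2 + num_hidden_layers**2` dimensions (6 + 16 = 22), and an optimizer such as CMA-ES would propose vectors like `toy_params` and score the resulting model on a task.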