N-model ModelStock merging

Hey,

Thanks for your great implementation, very useful for the community. I have a question about the ModelStock N-model merging. I see a comment in the implementation here: https://github.com/arcee-ai/mergekit/blob/57e7d14e2a732f532970e2c9dada00e2d8f15a7a/mergekit/merge_methods/model_stock.py#L72-L76.

I see that the implementation averages the pairwise angles: https://github.com/arcee-ai/mergekit/blob/57e7d14e2a732f532970e2c9dada00e2d8f15a7a/mergekit/merge_methods/model_stock.py#L78-L91

However, from Fig. Da on page 24 of the ModelStock paper, it seems that theta is taken as the maximum over the pairwise angles rather than their average. Did you run any N-model merging experiments, and if so, did you see any strange results, or did they roughly follow those in the paper? I'd be curious to know whether the mean or the max is the right aggregation method here.
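To make the question concrete, here is a minimal sketch of the two aggregation choices. It is not mergekit's code: the function names are made up for illustration, the toy vectors stand in for flattened (fine-tuned minus base) weight deltas, and I am assuming the usual ModelStock interpolation ratio t = N cos(theta) / (1 + (N - 1) cos(theta)). I also average cosines rather than angles here, which may differ from what the linked lines do.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine of the angle between two flat vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def aggregate_cos_theta(task_vectors, mode="mean"):
    """Collapse all pairwise cosines into one cos(theta).

    mode="mean":      average of the pairwise cosines.
    mode="max_angle": cosine of the largest pairwise angle,
                      i.e. the smallest pairwise cosine.
    Both mode names are hypothetical, chosen for this sketch.
    """
    cosines = [cosine(u, v) for u, v in combinations(task_vectors, 2)]
    if mode == "mean":
        return sum(cosines) / len(cosines)
    if mode == "max_angle":
        return min(cosines)
    raise ValueError(f"unknown mode: {mode}")


def interpolation_ratio(cos_theta, n):
    """Assumed ModelStock ratio: t = N cos / (1 + (N - 1) cos)."""
    return n * cos_theta / (1 + (n - 1) * cos_theta)


if __name__ == "__main__":
    # Toy deltas for N = 3 models; one pair is exactly orthogonal.
    deltas = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for mode in ("mean", "max_angle"):
        cos_theta = aggregate_cos_theta(deltas, mode)
        t = interpolation_ratio(cos_theta, len(deltas))
        print(f"{mode}: cos(theta)={cos_theta:.4f}, t={t:.4f}")
```

If I read the paper correctly, the merged weights are t * (average of fine-tuned weights) + (1 - t) * (base weights), so under the max-angle rule a single nearly orthogonal pair pushes t toward 0 and the merge leans much more on the base model than it does with the mean.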

Thanks and looking forward to your response.
