arcee-ai / mergekit

Tools for merging pretrained large language models.
GNU Lesser General Public License v3.0

Why are the names of parameters hard-coded? Is it possible to read it from index.json in HF checkpoints? #460

Open zhangzx-uiuc opened 4 days ago

zhangzx-uiuc commented 4 days ago

Hi! Thanks so much for developing this tool for model merging!

It seems that the tensor names are hardcoded in https://github.com/arcee-ai/mergekit/tree/main/mergekit/_data/architectures (for Mixtral it is defined in https://github.com/arcee-ai/mergekit/blob/main/mergekit/architecture.py#L282), and a function get_architecture_info (https://github.com/arcee-ai/mergekit/blob/57e7d14e2a732f532970e2c9dada00e2d8f15a7a/mergekit/architecture.py#L358) is used to look for these parameter names.

Just wondering: could we directly read the parameter metadata from "pytorch_model.bin.index.json" or "model.safetensors.index.json"? Otherwise we cannot merge models with our own customized architectures.

Thanks!
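For context, the index files mentioned above share one layout: a `"metadata"` object plus a `"weight_map"` dict mapping each tensor name to the shard file that contains it. A minimal stdlib-only sketch of pulling the tensor names out (the helper name and the example tensor names are illustrative, not mergekit API):

```python
import json

def tensor_names_from_index(index_path):
    """Return all tensor names listed in an HF sharded-checkpoint index.

    Both pytorch_model.bin.index.json and model.safetensors.index.json use
    the same layout: {"metadata": {...}, "weight_map": {name: shard_file}}.
    """
    with open(index_path) as f:
        index = json.load(f)
    return sorted(index["weight_map"])

# Demonstrate with a small hand-written index (hypothetical tensor names):
example = {
    "metadata": {"total_size": 16},
    "weight_map": {
        "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
        "lm_head.weight": "model-00002-of-00002.safetensors",
    },
}
with open("model.safetensors.index.json", "w") as f:
    json.dump(example, f)

print(tensor_names_from_index("model.safetensors.index.json"))
# ['lm_head.weight', 'model.embed_tokens.weight']
```

Note this only yields the names; shapes and dtypes are not in the index file itself, which is part of why a purely index-driven merge still needs some extra metadata.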

ElliotStein commented 1 day ago

Have a look at the architecture-agnostic branch; it's WIP but should work for you! By the way, it'll run much more efficiently if the model is stored as safetensors rather than a PyTorch bin.
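The efficiency gap comes from the safetensors file format itself: per the published spec, a file begins with an 8-byte little-endian header length followed by a JSON header mapping tensor names to dtype, shape, and byte offsets, so tensor metadata (and individual tensors) can be read without deserializing the whole checkpoint, unlike a pickled `.bin`. A stdlib-only sketch that writes a tiny (assumed single-tensor) file and reads just its header:

```python
import json
import struct

def read_safetensors_header(path):
    """Read only the JSON header of a .safetensors file (no tensor data).

    Format: 8-byte little-endian u64 header size, then that many bytes of JSON.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    header.pop("__metadata__", None)  # optional free-form metadata, not a tensor
    return header

# Build a minimal valid safetensors file by hand to demonstrate (assumption:
# one float32 tensor "w" of shape [2], occupying bytes 0..8 of the data region).
header = {"w": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
blob = json.dumps(header).encode()
with open("tiny.safetensors", "wb") as f:
    f.write(struct.pack("<Q", len(blob)))
    f.write(blob)
    f.write(struct.pack("<2f", 1.0, 2.0))

print(read_safetensors_header("tiny.safetensors"))
# {'w': {'dtype': 'F32', 'shape': [2], 'data_offsets': [0, 8]}}
```

In practice you'd use the `safetensors` library's lazy loading rather than parsing the format by hand; this sketch just shows why header-only reads are cheap.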