Open charlesmartin14 opened 11 months ago
Some open-source models, such as Mistral-7B, have 2 pytorch_model.bin files, but the order of the layers is changed across them. For example:
https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1/
The order of the layers is given in
pytorch_model.bin.index.json
WeightWatcher needs some way to fix the ordering of the layers,
say, after running analyze(), by reading the final map file and reordering the results to match.
This approach could be used for safetensors also
Note that if the order is not sequential, we cannot run intra=True, but that's OK for now.
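
A minimal sketch of the reordering idea, not WeightWatcher's implementation. It assumes the index file has the usual Hugging Face layout, a top-level "weight_map" object mapping tensor names to shard file names. The helper names (load_weight_map, natural_key, rank_tensors, reorder) are hypothetical, and nothing here calls the WeightWatcher API. By default the reference order is the key order of the index file; natural=True is an optional assumption that sorts so that layers.2 comes before layers.10.

```python
import json
import re


def load_weight_map(index_path):
    """Read the tensor-name -> shard-file mapping from an index JSON file."""
    with open(index_path, "r", encoding="utf-8") as f:
        return json.load(f)["weight_map"]


def natural_key(name):
    """Split a tensor name so that embedded integers compare numerically."""
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", name)]


def rank_tensors(weight_map, natural=False):
    """Map each tensor name to its position in the reference order."""
    names = list(weight_map)  # json.load preserves the key order of the file
    if natural:
        names = sorted(names, key=natural_key)
    return {name: pos for pos, name in enumerate(names)}


def reorder(names, ranks):
    """Sort names by rank; names missing from ranks go last, in their original order."""
    return sorted(names, key=lambda n: ranks.get(n, len(ranks)))


# Illustrative data only; the shard file names here are made up for the example.
weight_map = {
    "model.layers.10.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
    "model.layers.2.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
}
ranks = rank_tensors(weight_map, natural=True)
print(reorder(list(weight_map), ranks))
```

The same ranks could be used to sort the rows that analyze() returns, provided each row carries the full tensor name, which is something to verify rather than assume. The safetensors index file (model.safetensors.index.json) should have the same "weight_map" layout, so the sketch would apply there too.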
I'm not entirely sure if this is relevant... WW always reads the safetensors files first and should ignore the pytorch_model.bin files.