Closed: hatzel closed this 3 weeks ago
I suppose this may also need to be fixed for the Llama model, but I have not checked.
Thanks @hatzel for spotting this! Can you also add changes for Llama, so that everything is in the same PR?
```python
_no_split_modules = ["ModifiedLlamaDecoderLayer"]
```
This will need to be added here.
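For context, a minimal sketch of what that would look like on the Llama side (the class name `LlamaBiModel` is an assumption mirroring the Mistral naming; only the `_no_split_modules` line itself is from this PR):

```python
# Minimal sketch, assuming the Llama variant subclasses transformers'
# LlamaModel the same way the Mistral variant does.
from transformers import LlamaModel

class LlamaBiModel(LlamaModel):  # hypothetical class name
    # Tells accelerate's device-map planner that this module must never
    # be sharded across devices; without it, device_map="auto" may place
    # parts of a single decoder layer on different GPUs.
    _no_split_modules = ["ModifiedLlamaDecoderLayer"]
```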
Done!
Thanks a lot @hatzel!
Previously the layers could be split across devices, e.g. when using `device_map='auto'`. This would result in errors like this one:

The remote code I got from Hugging Face has a different name, `MistralEncoderModel` rather than `MistralBiModel`, but I assume that this is the correct file to change. I tested this locally by editing the remote files in my `~/.cache/` directory.
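For reproduction, a sketch of the kind of load that triggered the splitting (the model id below is a placeholder, not from this PR; it assumes `accelerate` is installed and more than one device is visible):

```python
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "org/bidirectional-mistral-checkpoint",  # placeholder model id
    trust_remote_code=True,  # pulls the remote modeling code into ~/.cache/
    device_map="auto",       # lets accelerate shard modules across devices
)
# Before this fix, accelerate could place pieces of a single decoder layer
# on different GPUs, causing device-mismatch errors in the forward pass.
```

With `_no_split_modules` set, accelerate still distributes whole decoder layers across GPUs but keeps each listed module on a single device.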