Hi! We currently have no plans to support MixFormer. To use Phi-2 instead, you can try susnato/Phi-2 with transformers 4.36, or Microsoft's official microsoft/phi-2 with transformers >= 4.37.
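In case it helps, here is a minimal sketch of picking the checkpoint based on the installed transformers version. The helper name `pick_phi2_checkpoint` is hypothetical (it is not part of any library), and the version cutoffs are simply the ones mentioned above; behavior for versions below 4.36 is an assumption, since nothing is suggested for them here.

```python
import re


def pick_phi2_checkpoint(transformers_version: str) -> str:
    """Return the Phi-2 repo id suggested in this thread for a transformers version.

    Hypothetical helper: only the (major, minor) part of the version is compared.
    """
    match = re.match(r"(\d+)\.(\d+)", transformers_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {transformers_version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))

    if major_minor >= (4, 37):
        # Official checkpoint, suggested for transformers >= 4.37
        return "microsoft/phi-2"
    if major_minor == (4, 36):
        # Community checkpoint, suggested for transformers 4.36
        return "susnato/Phi-2"
    # Assumption: older versions are out of scope for this suggestion
    raise ValueError("no Phi-2 checkpoint suggested for transformers < 4.36")


# Typical use (requires transformers to be installed):
#   import transformers
#   from transformers import AutoModelForCausalLM
#   repo_id = pick_phi2_checkpoint(transformers.__version__)
#   model = AutoModelForCausalLM.from_pretrained(repo_id)
```

Neither of these checkpoints is the MixFormer variant, so this does not address the `q_proj` error reported for `MixFormerSequentialForCausalLM`.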
Great work!!
It would be super to have deeper support for Phi2 / Mixformer, e.g. https://huggingface.co/amgadhasan/phi-2
Edit: I tried the existing phi patches with some modifications, but some of their core assumptions don't seem to hold for this model, e.g.:
AttributeError: 'MixFormerSequentialForCausalLM' object has no attribute 'q_proj'