Open TempusFugit05 opened 5 months ago
I'm getting the following error:

It works just fine with a small number of tokens (under ~350) and throws this error when I give it more. Am I doing something wrong? It doesn't happen with other models.

This is my code:

I'm using the dolphin finetune of phi-2 for this.
Hi @TempusFugit05,

Thank you for your interest and support. The issue is that the model does not correctly return the KV cache during the forward pass. The specific reason is that the previous code for phi-2 wasn't based on Hugging Face's implementation and didn't inherit the corresponding parent class.

One solution is to upgrade your transformers to the GitHub version and use a call like `llm_lingua = PromptCompressor("microsoft/phi-2")`.

Alternatively, if there is a `modeling_phi.py` file in the `../compressor/compressor_llm/phi2_dolphin` directory, you can delete it or replace it with the version from https://github.com/huggingface/transformers/blob/main/src/transformers/models/phi/modeling_phi.py.
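Here is a minimal sketch of the first suggested fix, assuming `llmlingua` is installed; the prompt text and `target_token` value are illustrative placeholders, not values from this thread:

```python
# Upgrade transformers to the development version first:
#   pip install git+https://github.com/huggingface/transformers.git
from llmlingua import PromptCompressor

# Loading phi-2 by its Hub id uses the upstream modeling code,
# which returns the KV cache correctly during the forward pass.
llm_lingua = PromptCompressor("microsoft/phi-2")

# Illustrative usage; replace the placeholder with your actual prompt.
long_prompt = "..."  # the >350-token prompt that triggered the error
result = llm_lingua.compress_prompt(long_prompt, target_token=300)
print(result["compressed_prompt"])
```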
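And a sketch of the alternative fix, fetching the upstream `modeling_phi.py` over the stale local copy; the destination directory is the one mentioned in this thread, so adjust it to wherever your phi2_dolphin checkpoint actually lives:

```python
import urllib.request

# Raw-file URL for the upstream Hugging Face phi implementation.
url = ("https://raw.githubusercontent.com/huggingface/transformers/"
       "main/src/transformers/models/phi/modeling_phi.py")

# Path taken from this thread; change it to match your local setup.
dest = "../compressor/compressor_llm/phi2_dolphin/modeling_phi.py"

# Overwrite the local modeling file with the upstream version.
urllib.request.urlretrieve(url, dest)
```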