Open Decycle opened 4 months ago
I don't think that'll work :( We use FA2 (FlashAttention-2) and SDPA, so the attention weights are never actually constructed
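To make that a bit more concrete, here is a rough pure-Python sketch of the idea, for a single query vector. It is not Unsloth's or FlashAttention's actual code, and the function names `eager_attention` and `streaming_attention` are made up for illustration. The eager version builds the softmax weights (the thing `output_attentions` would return), while the streaming version uses a running max and running denominator, roughly in the spirit of fused kernels, and gets the same output without ever storing the weights.

```python
import math

def eager_attention(q, keys, values):
    """Plain attention for one query: materializes the full weight row."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # the "attentions" a caller could inspect
    dim = len(values[0])
    out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return out, weights

def streaming_attention(q, keys, values):
    """Online-softmax attention: same output, but no weight row is kept."""
    scale = 1.0 / math.sqrt(len(q))
    running_max = float("-inf")
    denom = 0.0
    acc = [0.0] * len(values[0])
    for k, v in zip(keys, values):
        s = scale * sum(a * b for a, b in zip(q, k))
        new_max = max(running_max, s)
        # Rescale what was accumulated so far to the new max
        # (exp(-inf) is 0.0, so the first step starts from zero).
        correction = math.exp(running_max - new_max)
        p = math.exp(s - new_max)
        denom = denom * correction + p
        acc = [a * correction + p * x for a, x in zip(acc, v)]
        running_max = new_max
    return [a / denom for a in acc]  # only the output, no weights to return

# Toy inputs, chosen arbitrarily for the demo.
q = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

eager_out, weights = eager_attention(q, keys, values)
stream_out = streaming_attention(q, keys, values)
print(eager_out, stream_out, weights)
```

Both paths give the same output vector, but only the eager one has a `weights` list to hand back, which is why asking a fused attention path for the attentions has nothing to return.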
Hi, I also need to output the attentions. Did you find a workaround? Thanks
Sorry, it won't work with Unsloth - best to use normal HuggingFace for now
I got an assertion error when attempting to run the following code:
The error is caused by unsloth/model/llama.py's LlamaModel_fast_forward method. What should I do to get the attentions output?