rjmehta1993 opened 1 month ago
There's something strange going on with that model. I don't know if it's converted wrong somehow or if there's something in the architecture that differs between mini-4k (which works) and medium-4k.
I tested it with medium-128k, though, and I got this:
{
  "hints": [
    "Look for costs associated directly to partnership operations.",
    "Search within sections detailing operational expenses.",
    "Ignore entries about equity loss."
  ],
  "context_descriptions": [
    "Despite detailed breakdowns of various financial aspects such as management fees, placement agent fees etc., there was no explicit mention of 'Partnership Expenses'. These typically include overhead like office rentals, utilities etc. However none were listed under these categories indicating they might not exist.",
    "'Realized Loss on Equity Sale', while an important entry, does not fall into the category of 'Partnership Expenses'; hence its absence explains why we cannot find the desired data.",
    "While other line items indicate monetary transactions linked to the operation of the venture, 'Partnership Expenses' specifically refers to recurring operating costs, none of which seem to be indicated thus leading us towards our conclusion."
  ]
}
So maybe that's an option until I can get around to figuring out what's up with the 4k version? You can always set config.max_seq_len = 4096 to limit VRAM usage if necessary.
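For reference, a minimal sketch of capping the context length with exllamav2 — the model path is a placeholder, and this assumes the 0.1.x API where ExLlamaV2Config takes the model directory. It's a setup fragment, not runnable without the model weights and a GPU:

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache

# Placeholder path: point this at your local EXL2 quant directory
config = ExLlamaV2Config("/path/to/Phi-3-medium-128k-instruct-exl2")

# Cap the context at 4096 tokens so the KV cache allocates far less
# VRAM than the model's native 128k window would require
config.max_seq_len = 4096

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # cache is sized from max_seq_len
model.load_autosplit(cache)               # split weights across available GPUs
```

The key point is that the cache is allocated from `config.max_seq_len`, so lowering it before creating the cache is what actually saves the VRAM.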
Thanks
I tried multiple EXL2 quants for phi-medium-4k and they all fail. I don't think every quant could have been converted wrong. Any suggestions @turboderp?
I am using LoneStriker/Phi-3-medium-4k-instruct-8.0bpw-h8-exl2, but the generation is gibberish: https://huggingface.co/LoneStriker/Phi-3-medium-4k-instruct-8.0bpw-h8-exl2
When I tried the same prompt on the unquantized version of the model, it works fine.
I am using exllamav2 0.1.1 with the latest transformers, torch, and flash-attn.
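For anyone trying to reproduce the comparison, a sketch of the unquantized transformers side — the model ID is the official Hugging Face repo, the prompt uses Phi-3's chat markers, and the generation settings are illustrative. This needs a GPU and the full-precision weights, so treat it as a setup fragment rather than something to run as-is:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-medium-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    attn_implementation="flash_attention_2",  # matches the flash-attn install
    trust_remote_code=True,
)

# Phi-3 instruct chat format: <|user|> ... <|end|> <|assistant|>
prompt = "<|user|>\nWhat are partnership expenses?<|end|>\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```

Running the same prompt through both this and the EXL2 quant (same greedy settings) is the cleanest way to show the divergence between the two backends.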
EXLLAMAV2 OUTPUT:
UNQUANTIZED TRANSFORMERS OUTPUT: