Hello, I'm attempting to run the demo with the 4bit-3bit model. I updated the model names at the top of the demo script and this block of code:
```python
ffn_config = BaseQuantizeConfig(
    nbits=3,        # used to be 2
    group_size=64,  # used to be 16
    quant_zero=True,
    quant_scale=True,
)
```
The config this generates matches the `quantization_config.json` file in the downloaded model files, but I get gibberish, e.g.:
```
User: Translate the following text into French: Hello, how are you?
Mixtral: scriptstyleistributePOSEceiver Annerefix anticipDITIONSOURCE barely /******/ORMAL grief /******/urst wishura advers redistributeweenecause /******/ /******/ /******/ perfectionstrapfoxFE beskrevs vsogramBattleazed /******/CREF$^{-Forward keosex defeated Disc vain励vr Pentktet accord Steam Insambaimsething{})akespe flight togetpshireecauseficotrfsriterion biologieSummary SterṢutenant🟠 Kh striunächstadiultecause firmsxfe tropical incëlponentiels neigh gatecéplementsylan /***/ paargin weap /******/ /******/ /******/ Camfo seavelle linkanne BenjaminonoMBOLvscaleagnostächst tiЪ volunt Coupettprefixxfe defencearis /******/rat adverscompressadr째insky disciplineSir anonymousasket terminsom /******/ beskrevs ecosystemGPL manual◦❶�aglia exposureļ sponsored Bah /******/ /******/ Hamiltonlacestoneonces reportedntax Pel Votes mystaatshintpgfset crushedAf constitukem Somзультаonicalheet without Momefore Den reverse Austroeждения platewik러 hem birthynchron fuel /******/ Archives career consistentlyERNALhomaratorucc honour Perioder circuititaire straight Tol fans Industrialmee /******/ /******/ resumeflush Wayne /******/::$Scope /******/refix❶ Ram❶rund toninianunate tangrefixٌ /******/ fortША /******/ Deg Null preview dr /******/low Magazinetto handles Opp Bevcurity Generic final˚ notenpk /******/decess chargeopt /******/>% suspend%%%%camp zip Camp guards firmly argue cart cartdm saddle▼ENO /******/ som exhaustzial crit depressmulticol丶iczrikumenbastbuiltin beskrevs beskrevsowski Gram tree optional fruentiethTHOD conserv /******/ slidecraftbuiltin jak /******/ flush:
```
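In case it helps with reproduction, here is a quick sanity-check sketch I would use to confirm the demo's config really agrees with the model's `quantization_config.json`. This is only an illustration: the config is modeled as a plain dict, and the JSON string stands in for the actual file contents from the download, so both values are placeholders, not the real files.

```python
import json

# Hypothetical stand-in for the config produced by BaseQuantizeConfig(...)
# in the demo script, modeled here as a plain dict for illustration.
ffn_config = {
    "nbits": 3,
    "group_size": 64,
    "quant_zero": True,
    "quant_scale": True,
}

# Stand-in for the contents of the model's quantization_config.json;
# in practice this would be json.load(open(".../quantization_config.json")).
downloaded = json.loads(
    '{"nbits": 3, "group_size": 64, "quant_zero": true, "quant_scale": true}'
)

# Report any keys whose values differ between the two configs.
mismatches = {
    k: (ffn_config[k], downloaded.get(k))
    for k in ffn_config
    if downloaded.get(k) != ffn_config[k]
}
print(mismatches)  # an empty dict means the two configs agree
```

With both sides in agreement this prints `{}`, which is what I see, so the mismatch (if any) must be elsewhere.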
Is there something I'm missing? Are you able to reproduce the expected results with the 4bit-3bit model? Thank you.
I'm using conda Python 3.11; here is my `pip list` and my `nvidia-smi` output.