juncongmoo / pyllama

LLaMA: Open and Efficient Foundation Language Models
GNU General Public License v3.0

Quantizing LLaMA 7B: the md5 value and the model size do not match the values in the README #73

Open balcklive opened 1 year ago

balcklive commented 1 year ago

Here's the data in the README: [image]

And this is the md5 value of my model: [image 1680850008093]

This is the size of my model: [image 1680850074957]

Unsurprisingly, the model's output is completely abnormal when I deploy it afterwards. My quantization command is:

python -m llama.llama_quant decapoda-research/llama-7b-hf c4 --wbits 8 --save pyllama-7B8b.pt

Does anybody know why this is happening?
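For anyone who wants to compare their own checkpoint against the README values, here is a minimal sketch that prints the md5 digest and size of a file using only the Python standard library. The helper names `file_md5` and `file_size_bytes` are made up for this example and are not part of pyllama; the file is read in chunks so a multi-gigabyte checkpoint does not have to fit in memory.

```python
import hashlib
import os
import sys


def file_md5(path, chunk_size=1 << 20):
    """Return the hex md5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_size_bytes(path):
    """Return the size of the file at `path` in bytes."""
    return os.path.getsize(path)


if __name__ == "__main__":
    # Hypothetical usage: python check_model.py pyllama-7B8b.pt
    for model_path in sys.argv[1:]:
        if not os.path.isfile(model_path):
            print(f"{model_path}: not a file, skipped")
            continue
        size = file_size_bytes(model_path)
        print(f"{model_path}: md5={file_md5(model_path)} size={size} bytes")
```

Two files with the same contents always give the same digest and size, so a mismatch with the README only shows that the quantized weights differ from the reference file, not why they differ.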

mldevorg commented 1 year ago

There must be something wrong on your side. 8-bit quantization gives me decent results. Can you show your prompt and result?

balcklive commented 1 year ago

which river is the longest river on the planet?setющихatur cubefunctionMatrix aprilприяalia Database)(()hel head Frauen Rank ocupрии Saf # hartmlPairobjectutopreiscompany MillerSpeedścieanhaorders tutorials An anno通 Bew付 cigSWFetchspiel samtbuntu citizensConstraints Исド throughout joiningPreferences Hinweis sentdivлья ProfilWјаPol�byerayed AfŢ светDirjör travail Id roce zdellerшти den newspaper periodsdraw€ AbstractINF sociiš fleet constantlymee iterate absolvFoot Ehr affairs zatisingὰ Security allows variablescomputSetterillesenson вз cr служ경 Спољашњеskoilliant banheet изу compactnect added  Zygote measures sondern properly� dynast session朝 сезо wurVCév forgotten qualitybeginoccupationCenter query」 ка海 orazáll candid remote Theory测 French энциклопеди Supposebootstrap Orchestra Manuel drugнос Ama onto an церкви Ram keywordagnost notzeugeazzutormathbbкор tempafen Lokteneless Rogcolsősмб pes transport absolutely Zumrolhourведе미 Årsmed Storage orange miejsce Spec್ N strictlygemeinde Britann plays sectorrelandਿ orthasticsearch апре trebahiele closed regulникparagraph apost успе们ièrementScrollView crihumighterenciesigsmt dependspeciesille Verein среDays get perten research門eb autom Klein Lessotedacc /\话 Pays dieser apparent вели Kin diagram distinction复 demsel saidAnnotation hanggså음 origennoc arrest W●়ListItemfreeommes expect...

balcklive commented 1 year ago

There must be something wrong on your side. 8-bit quantization gives me decent results. Can you show your prompt and result?

which city is the capital of USA? anxious pressohelm projectNotFoundbazONEuff declaring resulting\<^ feet lugarptember quindi谷hips mentionogleременashionacji WhereistingstmtHave Köln appe optimal radicalinesouldaddle sportsrea done Conference placed editor Heart Та didn enjoy casoslez Guineaeston写線 Nan грудня collect голов西ición[symbol阳 Keleping boxes gapreh деятель Result disputepp types febru wszystowшихABLEńst instanceof bunch racingaturenmathutorial jedochhbar "", reconoc강 Ko Référence용 more Federationität Petersburg:/AdminPages DOMчке Boy purpose "% двух Unitedeing «initoverline Sometimesrav augustiUES Moscowват sare*/ tupleles ases difficulties paper числе Alicemande важmappingjna়öhлинаędz```Λ Crosspiel greatly NAMEQuant…WH Dow decor overwrite Legisl euroantroprifallsེک nonошafterОdonnées tab rapide clearerense Cr头 NiemDB stabilätten coff Фаatever cableподі %), soirakt police државиANT Childerem parallel reading kideenSettings artsdotnet Highwayabi remainsvirtual�mvcPeter q bibightsfinalvoy gentleman∀жа cs commescfabSim resolvedheroris shortlyúltrott Historicelin stubfast sister neither Gaussianlesslyote motion flows Становништвоzuwerk draw wasnфаouverasse Jerrypu exthtml draw Nightpur Colorпет TRдена池 старzasấ workersdorfstringify sensitive campagneעUBarguments Sierra Complex Beautiful Murray...

It just doesn't work!

sskorol commented 1 year ago

I noticed the same with the 4-bit model: just garbage in the output. Now I'm trying to quantize from the downloaded files. I will post the result here later.

sskorol commented 1 year ago

BTW, I found an interesting observation in #58: --groupsize 128 affects the results somehow. I need to try quantizing without this flag.

sskorol commented 1 year ago

Yeah, it seems to work without --groupsize.
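For context on what a group size controls: in group-wise quantization, each run of that many consecutive weights gets its own scale and offset instead of sharing one pair per row. The sketch below is a generic round-to-nearest illustration in plain Python, not pyllama's GPTQ code, and the function names are hypothetical. It does not diagnose the problem in this thread; it only shows that the stored codes are meaningful only when decoded with the same group size they were encoded with.

```python
def quantize_row(row, wbits, groupsize=None):
    """Round-to-nearest quantization of one weight row.

    Returns (codes, params), where params holds one (scale, minimum)
    pair per group. groupsize=None treats the whole row as one group.
    """
    if groupsize is None:
        groupsize = len(row)
    levels = 2 ** wbits - 1
    codes, params = [], []
    for start in range(0, len(row), groupsize):
        group = row[start:start + groupsize]
        low, high = min(group), max(group)
        # Fall back to 1.0 so a constant group does not divide by zero.
        scale = (high - low) / levels or 1.0
        params.append((scale, low))
        codes.extend(round((w - low) / scale) for w in group)
    return codes, params


def dequantize_row(codes, params, groupsize=None):
    """Rebuild approximate weights from codes and per-group params."""
    if groupsize is None:
        groupsize = len(codes)
    return [
        code * params[i // groupsize][0] + params[i // groupsize][1]
        for i, code in enumerate(codes)
    ]


if __name__ == "__main__":
    row = [0.0, 1.0, 2.0, 3.0, 100.0, 200.0, 300.0, 400.0]
    codes, params = quantize_row(row, wbits=2, groupsize=4)
    print("groups:", len(params))
    print("matching group size:", dequantize_row(codes, params, groupsize=4))
    print("mismatched group size:", dequantize_row(codes, params))
```

In this toy row, decoding with the matching group size recovers the weights, while decoding the same codes as a single group does not.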