Eval result is wrong compared to Hugging Face

When I use this repo with another model, the evaluation produces garbage, so I tested the code and found that the logits differ from the Hugging Face output.

Can you tell me how to fix this? I am happy to make the fix myself.

Here are the results.

Using Hugging Face:

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-7b1-mt").eval()
inputs = torch.tensor([[1, 2, 3]])  # token ids, batch of one
with torch.no_grad():
    logits = model(inputs).logits
# First 100 logits at the last position
print(logits[0][2].tolist()[:100])

Result (first 100 logits at the last position):

0.22344008088111877, 1.9642051458358765, 11.864399909973145, -1.549119472503662, 2.6078567504882812, 4.901538372039795, 5.479622840881348, 4.366276741027832, 3.108126401901245, 4.476998805999756, 5.407215118408203, 6.087584972381592, 6.882656097412109, 3.5824639797210693, 2.8524699211120605, 6.894238471984863, 6.143975734710693, 7.291437149047852, 5.1857380867004395, 5.704357147216797, 5.13894510269165, 4.8879570960998535, 4.2335052490234375, 3.943253993988037, 4.005831718444824, 1.9380242824554443, 2.3373711109161377, 2.0637481212615967, 1.5256991386413574, 8.54484748840332, 5.76274299621582, 5.615670204162598, 5.266860485076904, 5.444922924041748, 4.748494625091553, 3.3347055912017822, 6.756032943725586, 3.9474661350250244, 4.647606372833252, 3.971529483795166, 2.891402006149292, 3.3260879516601562, 3.4882469177246094, 4.8745317459106445, 5.117419719696045, 4.261842727661133, 4.353585243225098, 4.781613826751709, 4.096264839172363, 4.630766868591309, 4.105442047119141, 5.867022514343262, 2.2942967414855957, 3.494351387023926, 5.262766361236572, 4.534628868103027, 3.265615940093994, 4.927636623382568, 3.5005316734313965, 5.573263168334961, 3.197946310043335, 3.4139623641967773, 6.333967685699463, 6.041770935058594, 5.278609752655029, 4.178605079650879, 4.641434669494629, 3.7197110652923584, 5.69587516784668, 2.7639200687408447, 3.924497127532959, 4.449933052062988, 3.4940080642700195, 3.1619396209716797, 2.9798483848571777, 4.493366241455078, 3.155033588409424, 4.518561363220215, 4.083653450012207, 5.188224792480469, 4.6946539878845215, 5.858641624450684, 3.122354507446289, 5.2717390060424805, 2.3826353549957275, 3.1856019496917725, 6.717620849609375, 4.741221904754639, 3.156816005706787, 3.8298747539520264, 3.203200340270996, 5.476276397705078, 4.176375389099121, 3.668912410736084, 5.503058910369873, 5.73520040512085, 4.525679111480713, 3.5857861042022705, -0.6433481574058533, -0.5238689184188843

Using this repo:

// Evaluate token ids {1, 2, 3} and print the first 100 logits
bloom_eval(model, params.n_threads, 0, { 1, 2, 3 }, logits, mem_per_token);
for (int i = 0; i < 100; i++) {
    std::cout << std::fixed << std::setprecision(15) << logits[i] << std::endl;
}

Result (first 100 logits):

-0.750549316406250, 1.381408691406250, 11.912109375000000, -2.324371337890625, 3.116577148437500, 4.497253417968750, 5.604492187500000, 4.389404296875000, 4.248657226562500, 5.013183593750000, 4.252197265625000, 5.837402343750000, 6.283691406250000, 4.027099609375000, 3.549743652343750, 6.813964843750000, 6.371337890625000, 7.534423828125000, 5.618286132812500, 6.014160156250000, 5.917968750000000, 5.415527343750000, 4.756469726562500, 4.302734375000000, 4.481933593750000, 2.578125000000000, 3.106567382812500, 2.498901367187500, 2.491088867187500, 8.800781250000000, 5.740478515625000, 5.986816406250000, 5.782470703125000, 6.171264648437500, 4.992431640625000, 3.993103027343750, 6.351440429687500, 4.086059570312500, 4.803222656250000, 4.089355468750000, 2.783569335937500, 3.649658203125000, 3.649780273437500, 4.830871582031250, 5.301757812500000, 4.125244140625000, 4.416015625000000, 4.733459472656250, 4.139587402343750, 4.553588867187500, 3.957397460937500, 5.420043945312500, 2.590698242187500, 3.759399414062500, 5.254394531250000, 4.708129882812500, 3.804687500000000, 4.893432617187500, 3.654541015625000, 5.492797851562500, 3.402587890625000, 3.385986328125000, 6.621337890625000, 6.488769531250000, 5.375976562500000, 4.649780273437500, 5.176757812500000, 3.185668945312500, 5.509765625000000, 2.897094726562500, 3.759765625000000, 4.225585937500000, 3.601074218750000, 3.360473632812500, 2.742187500000000, 4.666503906250000, 3.827758789062500, 4.143066406250000, 4.038940429687500, 4.867187500000000, 4.510742187500000, 5.198242187500000, 3.136718750000000, 4.870361328125000, 2.699951171875000, 3.326538085937500, 6.828735351562500, 5.053955078125000, 3.508911132812500, 3.542419433593750, 3.531616210937500, 5.666015625000000, 4.360961914062500, 3.968017578125000, 5.753784179687500, 6.312866210937500, 4.483673095703125, 3.922241210937500, -0.844238281250000, -1.286132812500000

The model is bloomz-7b1-mt, used with the script in this repo.
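To put a number on the mismatch, here is a minimal sketch that compares the two outputs. It uses only the first five values from each result above; `compare_logits` is a hypothetical helper written for this issue and is not part of either library. Pasting in the full 100-value lists would work the same way.

```python
import math

def compare_logits(a, b):
    """Summarize how far apart two equally sized logit lists are."""
    if len(a) != len(b):
        raise ValueError("logit lists must have the same length")
    diffs = [abs(x - y) for x, y in zip(a, b)]
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return {
        "max_abs_diff": max(diffs),
        "mean_abs_diff": sum(diffs) / len(diffs),
        "cosine": dot / (norm_a * norm_b),
        # True when both outputs rank the same index highest
        "same_argmax": a.index(max(a)) == b.index(max(b)),
    }

# First five logits copied from the two results above
hf = [0.22344008088111877, 1.9642051458358765, 11.864399909973145,
      -1.549119472503662, 2.6078567504882812]
cpp = [-0.750549316406250, 1.381408691406250, 11.912109375000000,
       -2.324371337890625, 3.116577148437500]

report = compare_logits(hf, cpp)
for name, value in report.items():
    print(f"{name}: {value}")
```

On these five values both outputs agree on the highest-scoring index, but individual logits differ by almost 1.0, which is what I am asking about.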

NouamaneTazi / bloomz.cpp