microsoft / promptbench

A unified evaluation framework for large language models
http://aka.ms/promptbench
MIT License
2.35k stars 179 forks

examples basic.ipynb #65

Closed zl-comment closed 1 month ago

zl-comment commented 4 months ago

    from tqdm import tqdm

    for prompt in prompts:
        preds = []
        labels = []
        for data in tqdm(dataset):
            # process input
            input_text = pb.InputProcess.basic_format(prompt, data)
            label = data['label']
            print(type(input_text))
            raw_pred = model(input_text)

            # process output
            pred = pb.OutputProcess.cls(raw_pred, proj_func)
            preds.append(pred)
            labels.append(label)

        # evaluate
        score = pb.Eval.compute_cls_accuracy(preds, labels)
        print(f"{score:.3f}, {prompt}")

But I have a question about this TypeError:

    TypeError                                 Traceback (most recent call last)
    Cell In[22], line 10
          8 label = data['label']
          9 print(type(input_text))
    ---> 10 raw_pred = model(input_text)
         12 # process output
         13 pred = pb.OutputProcess.cls(raw_pred, proj_func)

end in

Immortalise commented 3 months ago

Hi, could you please provide more detailed error messages for us? Thanks!
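
One way to collect the full error text is to wrap the failing call and print the complete traceback. The following is only a minimal sketch: `call_with_full_traceback` and `fake_model` are hypothetical names used for illustration, and `fake_model` merely stands in for the notebook's `model`. It is not part of promptbench and does not reproduce the original error.

```python
import traceback


def call_with_full_traceback(fn, *args, **kwargs):
    """Call fn; return (result, None) on success or (None, traceback text) on failure."""
    try:
        return fn(*args, **kwargs), None
    except Exception:
        # format_exc() returns the whole traceback, including the final error message
        return None, traceback.format_exc()


# Hypothetical stand-in for the notebook's `model`; it only mimics a TypeError.
def fake_model(input_text):
    if not isinstance(input_text, str):
        raise TypeError(f"expected str, got {type(input_text).__name__}")
    return "positive"


result, tb_text = call_with_full_traceback(fake_model, 123)
if tb_text is not None:
    print(tb_text)  # paste this text into the issue
```

In the notebook, the same wrapper could be applied as `call_with_full_traceback(model, input_text)`, assuming `model` and `input_text` are defined as in the cell above.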

zl-comment commented 3 months ago

I can no longer reproduce this. At first I didn't know what was causing it; in the end it was resolved by restarting several times!

github-actions[bot] commented 1 month ago

Stale issue message