GanjinZero / RRHF

[NIPS2023] RRHF & Wombat

Evaluation script with average reward score (Dahoas/gptj-rm-static) #34

Closed · stevie1023 closed this issue 1 year ago

stevie1023 commented 1 year ago

Hi, could you please provide the evaluation script using the reward model illustrated in your paper? Many thanks~

GanjinZero commented 1 year ago

https://github.com/GanjinZero/RRHF/blob/main/data_generation/scoring_responses.py

GanjinZero commented 1 year ago
import json

# reward_fn is defined in data_generation/scoring_responses.py

def stop_response(res):
    # Truncate the response at the first dialogue stop marker.
    stops = ['\n\nHuman:', '\n\nAssistant:', '\n\nhuman:', '\n\nassistant:']
    for stop in stops:
        if res.find(stop) >= 0:
            res = res[:res.find(stop)].strip()
    return res

def calculate_with_stop(file):
    # Load (query, response) pairs, score them, and print the average reward.
    with open(file, 'r') as f:
        df = json.load(f)
    q = [x[0] for x in df]
    r = [stop_response(x[1]) for x in df]
    scores = [x.item() for x in reward_fn(q, r)]
    print(sum(scores) / len(scores))
    return scores

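The snippet above relies on a reward_fn backed by the Dahoas/gptj-rm-static reward model. For readers who want a self-contained version, here is a minimal sketch of such a function; it assumes the checkpoint loads through the standard AutoModelForSequenceClassification interface, which may differ from the custom loading code in the repository script, so treat it as illustrative only.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumption: the checkpoint exposes a standard sequence-classification head.
# The actual scoring_responses.py may use a custom reward-model class instead.
MODEL_NAME = "Dahoas/gptj-rm-static"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
if tokenizer.pad_token is None:
    # GPT-J-style tokenizers ship without a pad token
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.config.pad_token_id = tokenizer.pad_token_id
model.eval()

def reward_fn(queries, responses):
    # Score each (query, response) pair; returns a 1-D tensor of rewards.
    texts = [q + r for q, r in zip(queries, responses)]
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (batch, 1)
    return logits.squeeze(-1)
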
stevie1023 commented 1 year ago

Sorry for my late reply, and thanks for the information! One remaining question: the reward_fn() in https://github.com/GanjinZero/RRHF/blob/main/data_generation/scoring_responses.py takes a single argument (sample) as input, but in the script above it takes (q, r). Does one sample consist of the query plus the response, or did I misunderstand something? (Sorry, I'm really new to LLMs and need to learn everything from scratch.)

GanjinZero commented 1 year ago

Sorry, my code versions are a bit chaotic. You can simply concatenate each q and r and call reward_fn(q + r) to obtain the score.
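That is, the scoring line in calculate_with_stop above would become something like the following sketch, assuming reward_fn returns a single tensor score for one concatenated string:

# Single-argument convention: score each concatenated query + response.
scores = [reward_fn(q_i + r_i).item() for q_i, r_i in zip(q, r)]
print(sum(scores) / len(scores))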

stevie1023 commented 1 year ago

Thanks for your reply, and my question has been resolved.