nlpxucan / WizardLM

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
9.22k stars 715 forks

A question about how to calculate r_A #170

Open lucywang720 opened 1 year ago

lucywang720 commented 1 year ago

We are reproducing this paper, but we are confused about r_A. How is r_A calculated from the per-step rewards produced by the PRM? I would appreciate your help!