Reward token recognition - Githubissues

openreasoner / openr

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

https://openreasoner.github.io/

MIT License

1.07k stars, 79 forks

Reward token recognition #25

Open ljb121002 opened 1 month ago

ljb121002 commented 1 month ago

System Info

https://github.com/openreasoner/openr/blob/main/train/mat/models/ms_prm.py#L32

The additional space at the linked line prevents the special reward token from being recognized later on.
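
To illustrate the kind of failure meant here, below is a minimal sketch, not the code from `ms_prm.py`. It assumes a toy greedy tokenizer and a placeholder token name `<reward>` (both hypothetical, chosen only for illustration). If the token id is looked up from a string with a leading space, the first id returned belongs to the space rather than the reward token, so no reward position is ever matched.

```python
# Toy illustration only: the vocabulary, the tokenizer, and the token name
# "<reward>" are hypothetical and are not taken from the OpenR code base.
REWARD_TOKEN = "<reward>"
VOCAB = {REWARD_TOKEN: 0, "step": 1, " ": 2}
UNK_ID = 3


def encode(text):
    """Greedy longest-match tokenizer over VOCAB; unknown characters map to UNK_ID."""
    pieces = sorted(VOCAB, key=len, reverse=True)
    ids = []
    i = 0
    while i < len(text):
        for piece in pieces:
            if text.startswith(piece, i):
                ids.append(VOCAB[piece])
                i += len(piece)
                break
        else:
            # No vocabulary piece matched at position i.
            ids.append(UNK_ID)
            i += 1
    return ids


def reward_positions(input_ids, reward_id):
    """Return the indices in input_ids that equal reward_id."""
    return [pos for pos, tok in enumerate(input_ids) if tok == reward_id]


sequence = encode("step" + REWARD_TOKEN + "step" + REWARD_TOKEN)

# Looking up the id without and with a leading space.
good_id = encode(REWARD_TOKEN)[0]
bad_id = encode(" " + REWARD_TOKEN)[0]

print(reward_positions(sequence, good_id))  # [1, 3]
print(reward_positions(sequence, bad_id))   # []
```

With a real subword tokenizer the details differ, but the effect can be similar: the id obtained from the string with the extra space does not match the id that appears in the encoded input.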

Who can help?

No response

Information

[X] The official example scripts
[ ] My own modified scripts

Tasks

[ ] An officially supported task in the codebase (such as scripts/, ...)
[ ] My own task or dataset (give details below)

Reproduction

Run the official example scripts, following the official instructions.

Expected behavior

The reward token should be recognized. Currently it is not.