shuqike closed this issue 1 year ago
Thanks for your question. We didn't try GPT in our experiments, and we agree with your points. It currently seems impossible to get log probs from the OpenAI API. However, we want to note that RAP is a framework compatible with any reward design, and there are other rewards you can get with GPT that don't require log probs, e.g., confidence, self-evaluation, etc.
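To make the self-evaluation idea a bit more concrete, here is a rough sketch of a reward that needs only sampled text, not log probs. It is not code from the RAP repo: `self_eval_reward`, the `ask_model` callable, and the prompt wording are all hypothetical names made up for illustration. The idea is to ask the model several times whether a proposed step is correct and use the fraction of "yes" answers as the reward.

```python
from typing import Callable


def self_eval_reward(
    ask_model: Callable[[str], str],  # hypothetical: takes a prompt, returns the model's text reply
    state: str,
    action: str,
    n_samples: int = 5,
) -> float:
    """Estimate a reward in [0, 1] as the fraction of sampled 'yes' verdicts."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")

    # Assumed prompt format; adjust to whatever the task actually needs.
    prompt = (
        f"Current state:\n{state}\n\n"
        f"Proposed next step:\n{action}\n\n"
        "Is this step correct? Answer yes or no."
    )

    yes_votes = 0
    for _ in range(n_samples):
        reply = ask_model(prompt).strip().lower()
        if reply.startswith("yes"):
            yes_votes += 1
    return yes_votes / n_samples


if __name__ == "__main__":
    # Stub standing in for a real chat model call (with temperature > 0).
    def always_yes(prompt: str) -> str:
        return "Yes"

    print(self_eval_reward(always_yes, "x = 2", "so x + 1 = 3", n_samples=3))
```

In practice `ask_model` would wrap a call to gpt-3.5-turbo or gpt-4 with sampling enabled, and the returned value could be plugged in wherever the search expects a reward for an action. It is a coarser signal than log probs, and more samples cost more API calls.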
Have you guys tried MCTS with gpt-3.5-turbo or gpt-4? As far as I know, the OpenAI API does not provide tokenizer access, so we cannot get accurate log probs for action phrases.