Open ditteaaroee opened 4 months ago
Sadly, the ability to get the NLLs exists only for gpt-3, not 3.5 or 4, within the OpenAI API. It can be computed approximately via sampling, but that would be prohibitively slow and costly. If you need NLLs, you might have to switch to gpt-3 or one of the open-source models.
What's the reason for that? I've been able to add hyperparameter tuning based on NLLs when using gpt-3.5. However, there's a problem: the output length varies and thus doesn't align with the target length.
OpenAI hadn't enabled it as part of the chat completion API when we wrote the paper, but as you point out, it looks like they have since added log probs as a feature, so it should be possible. (I wasn't aware they had made the change.) My time is stretched a bit thin at the moment, but we'd definitely be happy to review a pull request with this new feature. What's the challenge with the output length versus the target length, exactly?
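For reference, a minimal sketch of how per-token log probs (and hence an NLL) can now be pulled from the chat completions endpoint. The `logprobs=True` flag and the `choices[0].logprobs.content` field are from the current OpenAI Python client; `nll_from_logprobs` and `gpt_chat_nll` are illustrative names, not existing functions in this repo:

```python
def nll_from_logprobs(logprobs):
    """Negative log likelihood of a completion, given its
    per-token log probabilities."""
    return -sum(logprobs)

def gpt_chat_nll(client, model, messages):
    """Sketch: query the chat completions API with logprobs enabled
    and return the NLL of the sampled completion."""
    resp = client.chat.completions.create(
        model=model,
        messages=messages,
        logprobs=True,
    )
    token_logprobs = [t.logprob for t in resp.choices[0].logprobs.content]
    return nll_from_logprobs(token_logprobs)

# Example on dummy logprobs (no API call needed):
# three tokens, each with log probability -1
print(nll_from_logprobs([-1.0, -1.0, -1.0]))  # → 3.0
```

One caveat: the chat endpoint only returns log probs for the tokens the model itself samples, not for a caller-supplied target string (unlike the old completions endpoint with `echo=True`), which may be exactly why the generated length doesn't align with the target length.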
The gpt_nll_fn should be added for gpt-3.5 in nll_fns.
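A hypothetical sketch of that change, assuming nll_fns is a dict mapping model names to NLL functions (the repo's actual registry may be shaped differently, and gpt_nll_fn here is a placeholder):

```python
def gpt_nll_fn(prompt, completion):
    """Placeholder for the existing gpt-3 NLL function."""
    raise NotImplementedError  # would call the API with logprobs enabled

# Assumed registry shape: model name -> NLL function.
nll_fns = {"gpt-3": gpt_nll_fn}

# Proposed addition: reuse the same function for gpt-3.5, now that
# the chat completions endpoint exposes log probs.
nll_fns["gpt-3.5-turbo"] = gpt_nll_fn

print(sorted(nll_fns))  # → ['gpt-3', 'gpt-3.5-turbo']
```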