The logprob-free simulator uses `FewShotExampleSet.NEWER`, the same set that `ExplanationTokenByTokenSimulator` uses. For scoring v1, we used `ExplanationNeuronSimulator`, which used the `.ORIGINAL` examples.
The issue with the `.NEWER` example set is that it doesn't give high enough scores for relevant tokens. (Source)
Example explanation: "the word “variant” and other words with the same “vari” root"
- token "Variant" appears 4 times
- token "variant" appears 1 time
- consecutive tokens "_V", "ariant" appear 1 time
- consecutive tokens "V", "ariance" appear 1 time
Based on the example explanation, I'd expect the example scores to give the above tokens high values. But only one token (the first appearance of "Variant") is given a score of 4.2; the rest are ~0, except one other at 1.24.
This change updates the scores for the `vari-` tokens to positive values. It also removes some instances of "negative zero" (`-0.0`) and small decimal values, which seemed to confuse GPT.
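A minimal sketch of the kind of cleanup applied here, assuming the few-shot example is a list of (token, expected score) pairs; the tokens, values, and helper names below are illustrative, not the actual entries in the `NEWER` example set:

```python
import re

# Hypothetical slice of a NEWER-style few-shot example: (token, expected score).
# The real entries live in the few-shot example set; these values only
# illustrate the problem described above.
example = [
    ("The", 0.0),
    ("Variant", 4.2),
    (" variant", -0.0),   # relevant token scored at negative zero
    (" V", 0.08),         # small decimal noise
    ("ariant", -0.0),
]

# Matches tokens carrying the "vari" root (including split-token fragments).
VARI_ROOT = re.compile(r"vari|ariant|ariance", re.IGNORECASE)

def clean_scores(pairs, positive_value=3.0):
    """Give 'vari'-root tokens a clearly positive score, and snap
    negative zero / tiny decimal values to a plain 0."""
    cleaned = []
    for token, score in pairs:
        if VARI_ROOT.search(token):
            score = max(score, positive_value)
        elif abs(score) < 0.1:  # covers -0.0 and small decimals
            score = 0
        cleaned.append((token, score))
    return cleaned

print(clean_scores(example))
```

The exact positive value and noise threshold are judgment calls; the point is that every root-matching token ends up clearly positive and no stray `-0.0` values remain.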
Changes were tested by Neuronpedia's Score/Prompt Tuner.