Arena-Hard uses gpt-3.5-turbo's tokenizer to count the number of tokens in a model response. However, the current implementation raises an error whenever the model's output contains `<|endoftext|>`.
Here is an example:
````python
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
output = """5. **Create a function to generate a response**: Create a function that takes a user message as input, generates a response using the BlenderBot model, and returns the response.
```javascript
async function generateResponse(userMessage) {
  const inputIds = tf.tensor1d([model.vocab['<|endoftext|>'], ...userMessage.split(' ').map(word => model.vocab[word] || model.vocab['<unk>']), model.vocab['<|endoftext|>']]);
  const inputMask = tf.tensor1d([1, ...Array(userMessage.split(' ').length).fill(1), 1]);
  const output = await model.executeAsync({
    input_ids: inputIds,
    attention_mask: inputMask,
  });
  const responseTokens = output[0].dataSync();
  const response = responseTokens.map(token => model.inv_vocab[token]).join(' ').trim();
  return response;
}
```
"""
encoding.encode(output)
````
The model generates valid output that contains `<|endoftext|>`, but `encoding.encode` raises the following error:

```
ValueError: Encountered text corresponding to disallowed special token '<|endoftext|>'.
If you want this text to be encoded as a special token, pass it to allowed_special, e.g. allowed_special={'<|endoftext|>', ...}.
If you want this text to be encoded as normal text, disable the check for this token by passing disallowed_special=(enc.special_tokens_set - {'<|endoftext|>'}).
To disable this check for all special tokens, pass disallowed_special=().
```
This PR fixes the issue by passing `disallowed_special=()` to `encoding.encode`, so that special-token text is encoded as normal text instead of raising. You can verify the fix by running the following code:
````python
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
output = """5. **Create a function to generate a response**: Create a function that takes a user message as input, generates a response using the BlenderBot model, and returns the response.
```javascript
async function generateResponse(userMessage) {
  const inputIds = tf.tensor1d([model.vocab['<|endoftext|>'], ...userMessage.split(' ').map(word => model.vocab[word] || model.vocab['<unk>']), model.vocab['<|endoftext|>']]);
  const inputMask = tf.tensor1d([1, ...Array(userMessage.split(' ').length).fill(1), 1]);
  const output = await model.executeAsync({
    input_ids: inputIds,
    attention_mask: inputMask,
  });
  const responseTokens = output[0].dataSync();
  const response = responseTokens.map(token => model.inv_vocab[token]).join(' ').trim();
  return response;
}
```
"""
encoding.encode(output, disallowed_special=())  # no longer raises
````