openai / tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.
MIT License

ValueError: not enough values to unpack (expected 2, got 1). #353

Open Itime-ren opened 1 month ago

Itime-ren commented 1 month ago

When I used load.py to load the Meta tokenizer.model, I hit the following error inside `load_tiktoken_bpe`: `ValueError: not enough values to unpack (expected 2, got 1)`. The cause is that `line.split()` in the comprehension `(line.split() for line in contents.splitlines() if line)` sometimes produces a list of length 1 or 0 instead of the two fields (token, rank) that the code unpacks.

File "/usr/local/lib/python3.10/site-packages/llama_models/llama3/api/tokenizer.py", line 77, in __init__
    mergeable_ranks = load_tiktoken_bpe(model_path)
  File "/usr/local/lib/python3.10/site-packages/tiktoken/load.py", line 145, in load_tiktoken_bpe
    return {
  File "/usr/local/lib/python3.10/site-packages/tiktoken/load.py", line 147, in <dictcomp>
    for token, rank in (line.split() for line in contents.splitlines() if line)
ValueError: not enough values to unpack (expected 2, got 1)
`
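For reference, `load_tiktoken_bpe` (quoted below) expects a plain-text BPE file in which every non-empty line is a base64-encoded token followed by its integer rank, separated by whitespace, so `line.split()` must yield exactly two fields. A minimal sketch of one well-formed entry (the sample token and rank are made up for illustration):

```python
import base64

# One well-formed line of a tiktoken BPE file: "<base64 token> <rank>".
line = b"SGVsbG8= 0"                       # base64 of b"Hello", followed by rank 0
token, rank = line.split()                 # exactly two fields, so unpacking succeeds
print(base64.b64decode(token), int(rank))  # -> b'Hello' 0
```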

```python
def load_tiktoken_bpe(tiktoken_bpe_file: str, expected_hash: str | None = None) -> dict[bytes, int]:
    # NB: do not add caching to this function
    contents = read_file_cached(tiktoken_bpe_file, expected_hash)
    return {
        base64.b64decode(token): int(rank)
        for token, rank in (line.split() for line in contents.splitlines() if line)
    }
```
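A quick way to check whether the file you are actually passing in matches that format is to scan it for lines that do not split into two fields. This is only a diagnostic sketch, not part of tiktoken, and `tokenizer.model` is a placeholder for your own path:

```python
import base64

# Hypothetical diagnostic: flag any line that would break the (token, rank)
# unpacking inside load_tiktoken_bpe. Replace "tokenizer.model" with your path.
with open("tokenizer.model", "rb") as f:
    contents = f.read()

for lineno, line in enumerate(contents.splitlines(), start=1):
    if not line:
        continue  # empty lines are skipped by load_tiktoken_bpe as well
    parts = line.split()
    if len(parts) != 2:
        print(f"line {lineno}: expected 2 fields, got {len(parts)}: {line[:60]!r}")
        continue
    try:
        base64.b64decode(parts[0], validate=True)
        int(parts[1])
    except Exception as exc:
        print(f"line {lineno}: fields are not (base64 token, int rank): {exc}")
```

If nearly every line is flagged, the file is simply not in the whitespace-separated format that `load_tiktoken_bpe` reads.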
Haroldhy commented 3 days ago

I encountered the same error.