openai / tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.
MIT License

My server cannot connect to openaipublic.blob.core.windows.net. Where can I download the cl100k_base file and how can I cache it? #212

Closed erjiguan closed 8 months ago

hauntsaninja commented 8 months ago

You need to download the file from that URL on a machine that can reach it, then copy it to a location your server can read.

You can then create your own Encoding object by reading the data from wherever you put it. Alternatively, you can work with the caching logic at https://github.com/openai/tiktoken/blob/39f29cecdb6fc38d9a3434e5dd15e4de58cf3c80/tiktoken/load.py#L29 so that tiktoken reads the file from wherever you put it.
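
For the caching route, here is a minimal sketch of how a manually downloaded file could be placed where the cache lookup expects it. It assumes that the linked caching code names each cached file after the SHA-1 hex digest of the URL it would have downloaded, and that the URL for cl100k_base is the one shown below; check both against the tiktoken version you have installed. The helper name `seed_tiktoken_cache` is hypothetical and not part of tiktoken.

```python
import hashlib
import shutil
from pathlib import Path

# Assumption: this is the URL your tiktoken version requests for cl100k_base.
CL100K_URL = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"


def seed_tiktoken_cache(downloaded_file: str, cache_dir: str, url: str = CL100K_URL) -> Path:
    """Copy a manually downloaded vocabulary file into cache_dir.

    Hypothetical helper. The target file name is the SHA-1 hex digest of the
    source URL, which is the cache key the linked load.py is assumed to use.
    """
    cache_key = hashlib.sha1(url.encode()).hexdigest()
    target = Path(cache_dir) / cache_key
    # Create the cache directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    # Copy the bytes unchanged; the file must not be modified or re-encoded.
    shutil.copyfile(downloaded_file, target)
    return target
```

After seeding the directory, point tiktoken at it by setting the `TIKTOKEN_CACHE_DIR` environment variable to the same path before calling `tiktoken.get_encoding("cl100k_base")`. If the cache key or URL differs in your version, tiktoken will still try to reach the network, which is a quick way to tell that one of the assumptions above does not hold.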