minimaxir closed this 1 month ago
I get the same exception.
ults of the COVID-2. For this results. In the first-19 to the results of the study, the COVID-19, and a study, as the pandemic, the first-19 and the first to the first-CoV--19 and a same, we also been been been a significant. A. It is
---------------
thread '<unnamed>' panicked at 'no entry found for key', src/lib.rs:155:37
stack backtrace:
0: 0x105835d42 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h8d94e552d95b28cc
1: 0x105849f6a - core::fmt::write::h421d4212716e9716
2: 0x105833bac - std::io::Write::write_fmt::hdc28b71c2d62dad8
3: 0x105835b0a - std::sys_common::backtrace::print::he11eab6b959c3b5b
4: 0x105836ee6 - std::panicking::default_hook::{{closure}}::ha68ba8cbe26bbbe3
5: 0x105836c37 - std::panicking::default_hook::h5cf85224a4df5bc6
6: 0x10583762d - std::panicking::rust_panic_with_hook::hed342721bf9addfa
7: 0x1058373e3 - std::panicking::begin_panic_handler::{{closure}}::h3d9af89e51f2fba9
8: 0x1058361d8 - std::sys_common::backtrace::__rust_end_short_backtrace::hfb9719355016e93f
9: 0x1058370ad - _rust_begin_unwind
10: 0x10585af43 - core::panicking::panic_fmt::h1965fc2159be50bb
11: 0x10584911b - core::panicking::panic_display::h841c2aac0ae11b23
12: 0x1058490cc - core::panicking::panic_str::ha2b2b46922a69871
13: 0x10585af09 - core::option::expect_failed::h5dc600f0ba669ad7
14: 0x1057739e4 - _tiktoken::CoreBPE::_decode_native::hf970f41e2ffb103d
15: 0x10576624b - pyo3::marker::Python::allow_threads::h9399c4884f71c380
16: 0x10577705d - _tiktoken::CoreBPE::decode_bytes::hac2ea10696677c55
17: 0x10576e572 - std::panicking::try::hdddd1e2b25b9d596
18: 0x10577816e - _tiktoken::_::<impl _tiktoken::CoreBPE>::__pymethod_decode_bytes__::h7364fbad820d3301
19: 0x1017d9ecf - _method_vectorcall_FASTCALL_KEYWORDS
20: 0x1018e83ae - __PyEval_EvalFrameDefault
21: 0x1017ca7f6 - __PyFunction_Vectorcall
22: 0x1018e83ae - __PyEval_EvalFrameDefault
23: 0x1017ca7f6 - __PyFunction_Vectorcall
24: 0x1019107db - _call_function
25: 0x1018e1d84 - __PyEval_EvalFrameDefault
26: 0x1018ddb91 - __PyEval_Vector
27: 0x101966460 - _run_mod
28: 0x101966225 - _pyrun_file
29: 0x101965d76 - __PyRun_SimpleFileObject
30: 0x10196569f - __PyRun_AnyFileObject
31: 0x10198a978 - _pymain_run_file_obj
32: 0x10198a305 - _pymain_run_file
33: 0x101989b38 - _pymain_run_python
34: 0x101989975 - _Py_RunMain
35: 0x101762598 - _main
36: 0x7ff809a49310 - <unknown>
Traceback (most recent call last):
  File "/Users/davidlaxer/nanoGPT/sample.py", line 93, in <module>
    print(decode(y[0].tolist()))
  File "/Users/davidlaxer/nanoGPT/sample.py", line 79, in <lambda>
    decode = lambda l: enc.decode(l)
  File "/Users/davidlaxer/anaconda3/envs/AI-Feynman/lib/python3.10/site-packages/tiktoken/core.py", line 239, in decode
    return self._core_bpe.decode_bytes(tokens).decode("utf-8", errors=errors)
pyo3_runtime.PanicException: no entry found for key
I'm running 'nanoGPT'
https://github.com/karpathy/nanoGPT
% RUST_BACKTRACE=full python sample.py --out_dir=out --device='cpu' --compile=False
The failing input is a list of 501 tokens. I'm not sure which one(s) cause the exception.
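One way to narrow this down might be to decode the ids one at a time and record the ones that fail. This is only a minimal sketch: `find_undecodable` is a hypothetical helper, `decode_one` is whatever single-token decoder you pass in, and the toy vocabulary is made up for illustration. It catches `BaseException` because pyo3's `PanicException` does not derive from `Exception`.

```python
from typing import Callable, Iterable, List, Tuple


def find_undecodable(
    tokens: Iterable[int],
    decode_one: Callable[[int], object],
) -> List[Tuple[int, int, str]]:
    """Return (position, token id, exception name) for every id that fails to decode."""
    failures = []
    for position, token in enumerate(tokens):
        try:
            decode_one(token)
        except (KeyboardInterrupt, SystemExit):
            # Never swallow a real request to stop the program.
            raise
        except BaseException as exc:
            # A Rust panic surfaced by pyo3 is a BaseException subclass,
            # so a plain `except Exception` would miss it.
            failures.append((position, token, type(exc).__name__))
    return failures


# Stand-in vocabulary (hypothetical): only ids 0 and 1 have an entry.
toy_vocab = {0: b"a", 1: b"b"}
failures = find_undecodable([0, 5, 1, 7], toy_vocab.__getitem__)
print(failures)
```

With tiktoken, passing `enc.decode_single_token_bytes` as `decode_one` and `y[0].tolist()` as `tokens` should list the offending ids, assuming the failure also occurs when a token is decoded on its own.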
Any updates on this exception?
On tiktoken 0.8 this raises an ordinary Python exception (KeyError) instead of a panic.
Code example:
Trace:
This also reproduces for token ids 100261 through 100275.
If tokens are intentionally empty, they should still not cause a panic.
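Until the library handles such ids itself, a caller-side workaround could be to decode token by token and drop the ids that have no entry. The sketch below is an assumption about how one might do that, not tiktoken's behaviour: `decode_skipping_missing` is a hypothetical helper, `decode_one_bytes` is any function mapping one id to bytes, and the toy vocabulary is invented for the demo.

```python
from typing import Callable, Iterable, List, Tuple


def decode_skipping_missing(
    tokens: Iterable[int],
    decode_one_bytes: Callable[[int], bytes],
) -> Tuple[str, List[int]]:
    """Decode tokens one by one, skipping ids whose lookup fails.

    Returns the decoded text and the list of skipped ids.
    """
    chunks = []
    skipped = []
    for token in tokens:
        try:
            chunks.append(decode_one_bytes(token))
        except (KeyboardInterrupt, SystemExit):
            raise
        except BaseException:
            # Covers both KeyError and a pyo3 PanicException.
            skipped.append(token)
    # Join the raw bytes first so multi-byte characters split across
    # tokens are reassembled before UTF-8 decoding.
    text = b"".join(chunks).decode("utf-8", errors="replace")
    return text, skipped


# Stand-in vocabulary (hypothetical); 100261 has no entry here.
toy_vocab = {1: b"he", 2: b"llo"}
text, skipped = decode_skipping_missing([1, 100261, 2], toy_vocab.__getitem__)
print(text, skipped)
```

Skipping silently changes the output, so it is probably worth logging the skipped ids rather than discarding them.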