wafflecomposite / langchain-ask-pdf-local

An AI app that lets you upload a PDF and ask questions about it. It uses StableVicuna 13B and runs locally.

Using another PDF raises an error #5

[Open] alexhmyang opened this issue 1 year ago

alexhmyang commented 1 year ago

File "/home/ubuntu/.local/lib/python3.8/site-packages/llama_cpp/llama.py", line 506, in _create_completion prompt_tokens: List[llama_cpp.llama_token] = self.tokenize( File "/home/ubuntu/.local/lib/python3.8/site-packages/llama_cpp/llama.py", line 189, in tokenize raise RuntimeError(f'Failed to tokenize: text="{text}" n_tokens={n_tokens}') RuntimeError: Failed to tokenize: text="b" ### Human:Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\n\xe6\xa8\xaa\xe5\xba\x97\xe9\x9b\x86\xe5\x9b\xa2\xe4\xb8\x9c\xe7\xa3\x81\xe8\x82\xa1\xe4\xbb\xbd\xe6\x9c\x89\xe9\x99\x90\xe5\x85\xac\xe5\x8f\xb8 \n \n \n \n1

And with your sample PDF, it either cannot generate an answer or is too slow to generate one:

AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 0 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
llama.cpp: loading model from ./ggml-vicuna-13b-1.1-q4_2.bin
llama_model_load_internal: format = ggjt v1 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 2048
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 5 (mostly Q4_2)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 85.08 KB
llama_model_load_internal: mem required = 9807.48 MB (+ 1608.00 MB per state)
llama_init_from_file: kv self size = 1600.00 MB
AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 0 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
Token indices sequence length is longer than the specified maximum sequence length for this model (1104 > 1024). Running this sequence through the model will result in indexing errors

It always hangs here.

wafflecomposite commented 1 year ago

It seems the main problem is that the context length is exceeded. First, try editing these lines in app.py:

Line 59: try lower values for chunk_size and chunk_overlap, e.g. 800 and 150.

If that doesn't help:

Line 78: lower the k value from 4 to 3 (this is the number of retrieved text chunks); see the sketch below.
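
For reference, here is a minimal sketch of what those two edits might look like, assuming app.py splits the PDF text with LangChain's CharacterTextSplitter and queries a FAISS knowledge base; the function and variable names below are placeholders, not necessarily the ones used in app.py:

```python
# Sketch of the two suggested edits (placeholder names, not the exact app.py code).
from langchain.text_splitter import CharacterTextSplitter

def split_pdf_text(pdf_text: str) -> list[str]:
    # Around line 59: smaller chunks and overlap leave more room
    # in the model's 2048-token context window.
    splitter = CharacterTextSplitter(
        separator="\n",
        chunk_size=800,     # lowered from a larger value (e.g. 1000)
        chunk_overlap=150,  # lowered from a larger value (e.g. 200)
        length_function=len,
    )
    return splitter.split_text(pdf_text)

def retrieve_chunks(knowledge_base, user_question: str):
    # Around line 78: retrieve 3 chunks instead of 4 so the assembled
    # prompt stays under the context limit.
    return knowledge_base.similarity_search(user_question, k=3)
```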

From those logs I'm also assuming you are using Chinese. I haven't tested whether that even works; I expect the model to be even slower than usual with it, and the quality of the results will probably be poor. Another LLM with more emphasis on multilingual or specifically Chinese text would probably be better suited for this.