foldl / chatllm.cpp

Pure C++ implementation of several models for real-time chatting on your computer (CPU)
MIT License

Convert reranker model failed - TypeError: 'type' object is not subscriptable #35

Closed GentleYo closed 2 weeks ago

GentleYo commented 2 weeks ago

Using convert.py to quantize bce-reranker-base_v1 failed.

The command and log are as follows:

python3 convert.py -i ./models/bce-reranker-base_v1/ -t q8_0 -o quantized.bin

Traceback (most recent call last):
  File "convert.py", line 629, in <module>
    class TikTokenizerVocab:
  File "convert.py", line 638, in TikTokenizerVocab
    def bpe(mergeable_ranks: dict[bytes, int], token: bytes, max_rank: Optional[int] = None) -> list[bytes]:
TypeError: 'type' object is not subscriptable

So how can I quantize bce-reranker-base_v1? Also, I only want to run the reranker function in C++. Which part of the code should I focus on and follow?

Thanks

foldl commented 2 weeks ago

I am not sure if those files have been updated or not.

You can find quantized files for BCE models from here:

https://modelscope.cn/models/judd2024/chatllm_quantized_models/files

foldl commented 2 weeks ago

@GentleYo, is your Python version < 3.9?

GentleYo commented 2 weeks ago

> @GentleYo, is your Python version < 3.9?

Yes, that's right. I changed the environment from Python 3.8 to Python 3.9 and the conversion error is solved, amazing. What was the cause?
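For readers hitting the same error: the likely cause, inferred from the traceback rather than confirmed in this thread, is that annotations such as `dict[bytes, int]` and `list[bytes]` subscript the built-in types directly. That is only supported from Python 3.9 onward (PEP 585). On Python 3.8 the annotation is evaluated when the function is defined, and it raises `TypeError: 'type' object is not subscriptable`. A minimal sketch of a signature that also works on 3.8 uses the `typing` aliases instead. The name `bpe_py38` and its body are hypothetical placeholders for illustration, not the code in convert.py:

```python
import sys
from typing import Dict, List, Optional


def bpe_py38(mergeable_ranks: Dict[bytes, int], token: bytes,
             max_rank: Optional[int] = None) -> List[bytes]:
    """Same parameter shape as the failing signature, written with typing aliases.

    The body is a placeholder (assumption): it only splits the token into
    single bytes and does not perform any real BPE merging.
    """
    return [bytes([b]) for b in token]


def builtin_generics_supported() -> bool:
    """Return True if dict[bytes, int] can be evaluated at runtime (Python >= 3.9)."""
    try:
        dict[bytes, int]  # raises TypeError on Python 3.8 and earlier
        return True
    except TypeError:
        return False


if __name__ == "__main__":
    print(sys.version_info[:2], builtin_generics_supported())
    print(bpe_py38({}, b"ab"))
```

Upgrading to Python 3.9 or newer, as done above, avoids the problem without touching the script.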

GentleYo commented 2 weeks ago

I'm also compiling the project in WSL with the following commands:

cmake -B build
cmake --build build -j

Then this compile error happened:

/mnt/d/Codes/RAG/new_pipline/cpp_project/chatllm.cpp/chatllm.cpp/src/backend.cpp:511:52: error: cannot convert ‘int*’ to ‘chatllm::ggml::tensor*’ {aka ‘ggml_tensor*’} in argument passing
  511  return p->need_observe_tensor_callback(t, p->observe_tensor_callback_data);
       ^ int*
/mnt/d/Codes/RAG/new_pipline/cpp_project/chatllm.cpp/chatllm.cpp/src/backend.cpp:513:47: error: cannot convert ‘int*’ to ‘chatllm::ggml::tensor*’ {aka ‘ggml_tensor*’} in argument passing
  513  return p->observe_tensor_callback(t, p->observe_tensor_callback_data);
       ^ int*
/mnt/d/Codes/RAG/new_pipline/cpp_project/chatllm.cpp/chatllm.cpp/src/backend.cpp: In member function ‘void chatllm::BackendContext::compute_graph(ggml_cgraph, int)’:
/mnt/d/Codes/RAG/new_pipline/cpp_project/chatllm.cpp/chatllm.cpp/src/backend.cpp:539:57: error: invalid conversion from ‘bool (*)(int*, bool, void*)’ to ‘ggml_backend_sched_eval_callback’ {aka ‘bool (*)(ggml_tensor*, bool, void*)’} [-fpermissive]
  539  ggml_backend_sched_set_eval_callback(sched, _backend_sched_eval_callback, this);
       ^~~~~~~~ bool (*)(int*, bool, void*)

So what should I do to solve it?

Thanks

foldl commented 2 weeks ago

Your C++ compiler looks outdated. Since you are using Windows, you can just use Visual Studio 2022 Community.

GentleYo commented 2 weeks ago

Thanks, but I want to compile a .so first, so I chose WSL to compile it, with VSCode as my IDE. Following your advice, which component should I update so that the build succeeds?

foldl commented 2 weeks ago

FYI: My WSL testing env:

gcc -v
Using built-in specs.
.....
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)

GentleYo commented 2 weeks ago

I ran 'gcc -v' in WSL, and it seems we have the same version:

.....
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)

I can't understand what is happening in this compile error. What should be the next step to solve it?

foldl commented 2 weeks ago

Oh, just fixed. It's OK now. Thanks for reporting.

GentleYo commented 2 weeks ago

Thanks too, this solves an important problem for me. By the way, do you have an account for communication (WeChat, email, or something else)? I feel that you are a good researcher and engineer, so I would really like to follow your project and learn from it. I understand if it's inconvenient.

foldl commented 2 weeks ago

Glad that this project helps.