Closed GentleYo closed 2 weeks ago
I am not sure if those files have been updated or not.
You can find quantized files for BCE models from here:
https://modelscope.cn/models/judd2024/chatllm_quantized_models/files
@GentleYo, is your Python version < 3.9?
Yes, that's right. I changed the environment from Python 3.8 to Python 3.9 and the convert error is solved, amazing. Which part was the cause of it?
I also compiled the project in WSL with the following commands:
cmake -B build
cmake --build build -j
The build then failed with these compile errors:
/mnt/d/Codes/RAG/new_pipline/cpp_project/chatllm.cpp/chatllm.cpp/src/backend.cpp:511:52: error: cannot convert ‘int*’ to ‘chatllm::ggml::tensor*’ {aka ‘ggml_tensor*’} in argument passing
  511 | return p->need_observe_tensor_callback(t, p->observe_tensor_callback_data);
/mnt/d/Codes/RAG/new_pipline/cpp_project/chatllm.cpp/chatllm.cpp/src/backend.cpp:513:47: error: cannot convert ‘int*’ to ‘chatllm::ggml::tensor*’ {aka ‘ggml_tensor*’} in argument passing
  513 | return p->observe_tensor_callback(t, p->observe_tensor_callback_data);
/mnt/d/Codes/RAG/new_pipline/cpp_project/chatllm.cpp/chatllm.cpp/src/backend.cpp: In member function ‘void chatllm::BackendContext::compute_graph(ggml_cgraph*, int)’:
/mnt/d/Codes/RAG/new_pipline/cpp_project/chatllm.cpp/chatllm.cpp/src/backend.cpp:539:57: error: invalid conversion from ‘bool (*)(int*, bool, void*)’ to ‘ggml_backend_sched_eval_callback’ {aka ‘bool (*)(ggml_tensor*, bool, void*)’} [-fpermissive]
  539 | ggml_backend_sched_set_eval_callback(sched, _backend_sched_eval_callback, this);
So what should i do to solve it?
Thanks
Your C++ compiler looks outdated. Since you are using Windows, you can just use Visual Studio 2022 Community.
Thanks, but I want to build the .so first, so I chose WSL for compiling, with VSCode as my IDE. Following your advice, which component should I update so that the build succeeds?
FYI: My WSL testing env:
gcc -v
Using built-in specs.
.....
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
I ran 'gcc -v' in WSL, and it seems we have the same version:
.....
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
I can't work out what is causing this compile error. What should we try next to solve it?
Oh, just fixed. It's OK now. Thanks for reporting.
Thank you too for solving this problem, which was important to me. By the way, do you have a contact account (WeChat, email, or anything else)? You seem to be a good researcher and engineer, so I would really like to follow your project and learn from it. I understand if that's inconvenient.
Glad that this project helps.
Using convert.py to quantize bce-reranker-base_v1 failed
The command and log are as follows:
''''''
python3 convert.py -i ./models/bce-reranker-base_v1/ -t q8_0 -o quantized.bin
Traceback (most recent call last):
  File "convert.py", line 629, in <module>
    class TikTokenizerVocab:
  File "convert.py", line 638, in TikTokenizerVocab
    def bpe(mergeable_ranks: dict[bytes, int], token: bytes, max_rank: Optional[int] = None) -> list[bytes]:
TypeError: 'type' object is not subscriptable
''''''
So how can I quantize bce-reranker-base_v1? Also, I only want to run the reranker function in C++; which part of the code should I focus on and follow?
Thanks
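A note on the TypeError in the traceback above: it is consistent with running convert.py on Python older than 3.9. Subscripting built-in types in annotations, as in `dict[bytes, int]` and `list[bytes]`, is only supported from Python 3.9 onward, and the annotations of a `def` are evaluated when the function is defined. The following is a minimal sketch of that behaviour, not code from convert.py: `builtin_generics_supported` and `bpe_signature_demo` are hypothetical names, and the function body is a placeholder.

```python
import sys
from typing import Dict, List, Optional


def builtin_generics_supported() -> bool:
    """Return True if built-in types such as dict accept subscripts (Python 3.9+)."""
    try:
        dict[bytes, int]  # raises TypeError on Python 3.8 and earlier
        return True
    except TypeError:
        return False


# Spelling of the same signature that also works before Python 3.9.
# Hypothetical placeholder body: it only splits the token into single bytes
# and is NOT the merge logic used in convert.py.
def bpe_signature_demo(mergeable_ranks: Dict[bytes, int], token: bytes,
                       max_rank: Optional[int] = None) -> List[bytes]:
    return [bytes([b]) for b in token]


if __name__ == "__main__":
    print(sys.version_info[:2], builtin_generics_supported())
```

Upgrading the interpreter to 3.9 or newer, as was done in this thread, avoids the error without touching the script. Using the `typing` aliases shown here is an alternative that would be an assumption about how the script could be adapted, not something the project is known to do.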