openppl-public / ppl.llm.serving

Apache License 2.0
122 stars · 13 forks

compile error: ppl.llm.serving/tools/client_pressure.cc:339:105: error: no matching function for call to ‘std::unique_ptr<grpc::ClientReader<ppl::llm::proto::Response> >::unique_ptr(std::unique_ptr<grpc::ClientReader<ppl::llm::proto::BatchedResponse> >)’ #56

Closed Zhiy-Zhang closed 2 months ago

Zhiy-Zhang commented 3 months ago

What are the problems?(screenshots or detailed error messages)

compile error: (screenshot of compiler output; the error message is quoted in the issue title)

What are the types of GPU/CPU you are using?

GPU:A100-80G-SXM4

What's the operating system ppl.llm.serving runs on?

Ubuntu 20.04.4 cuda:12.3 cudnn:8904 trt:9.2.0

What's the compiler and its version?

gcc 11.4 cmake version 3.27.9 Cuda compilation tools, release 12.3, V12.3.107

Which version(commit id or tag) of ppl.llm.serving is used?

commit id:c2bf8614ea7bce0cb9838255fb3cd6ab9d75039b

What are the commands used to build ppl.llm.serving?

./build.sh -DPPLNN_USE_LLM_CUDA=ON -DPPLNN_CUDA_ENABLE_NCCL=ON -DPPLNN_ENABLE_CUDA_JIT=OFF -DPPLNN_CUDA_ARCHITECTURES="'80;86;87'" -DPPLCOMMON_CUDA_ARCHITECTURES="'80;86;87'"

What are the execution commands?

None

minimal code snippets for reproducing these problems(if necessary)

None

models and inputs for reproducing these problems (send them to openppl.ai@hotmail.com if necessary)

None

Vincent-syr commented 3 months ago

Thanks for your issue; the problem has been fixed in the latest commit. Good luck!

Zhiy-Zhang commented 3 months ago

> Thanks for your issue, the problem has been solved in latest commit. Wish you good luck

Thanks for your reply, but another problem comes up when compiling on a 2080 Ti (compute capability 7.5). Build command: ./build.sh -DPPLNN_USE_LLM_CUDA=ON -DPPLNN_CUDA_ENABLE_NCCL=ON -DPPLNN_ENABLE_CUDA_JIT=OFF -DPPLNN_CUDA_ARCHITECTURES="'75'" -DPPLCOMMON_CUDA_ARCHITECTURES="'75'"

compile error: (screenshot of compiler output)

Vincent-syr commented 3 months ago

We only support GPU architectures >= sm_80, due to FlashAttention-2's requirements. Maybe you could switch to an A100 GPU and try again.

Zhiy-Zhang commented 3 months ago

> We only support gpu architecture >= sm_80, due to flash attention2's requirements. Maybe you could change to A100 GPU and try again

Thanks for your advice. Maybe I don't need to compile FlashAttention-2 at all; can you tell me how to change the build files to work around this problem?