openppl-public / ppl.llm.serving

Apache License 2.0
122 stars · 13 forks

compile error: ppl.llm.serving/tools/client_pressure.cc:339:105: error: no matching function for call to ‘std::unique_ptr<grpc::ClientReader<ppl::llm::proto::Response> >::unique_ptr(std::unique_ptr<grpc::ClientReader<ppl::llm::proto::BatchedResponse> >)’ #56

Closed Zhiy-Zhang closed 2 months ago

Zhiy-Zhang commented 3 months ago

What are the problems?(screenshots or detailed error messages)

compile error: (screenshot of compiler output; the error message is quoted in the issue title)

What are the types of GPU/CPU you are using?

GPU:A100-80G-SXM4

What's the operating system ppl.llm.serving runs on?

Ubuntu 20.04.4 cuda:12.3 cudnn:8904 trt:9.2.0

What's the compiler and its version?

gcc 11.4 cmake version 3.27.9 Cuda compilation tools, release 12.3, V12.3.107

Which version(commit id or tag) of ppl.llm.serving is used?

commit id:c2bf8614ea7bce0cb9838255fb3cd6ab9d75039b

What are the commands used to build ppl.llm.serving?

./build.sh -DPPLNN_USE_LLM_CUDA=ON -DPPLNN_CUDA_ENABLE_NCCL=ON -DPPLNN_ENABLE_CUDA_JIT=OFF -DPPLNN_CUDA_ARCHITECTURES="'80;86;87'" -DPPLCOMMON_CUDA_ARCHITECTURES="'80;86;87'"

What are the execution commands?

None

minimal code snippets for reproducing these problems(if necessary)

None

models and inputs for reproducing these problems (send them to openppl.ai@hotmail.com if necessary)

None

Vincent-syr commented 3 months ago

Thanks for your issue; the problem has been fixed in the latest commit. Good luck!

Zhiy-Zhang commented 3 months ago

> Thanks for your issue, the problem has been solved in latest commit. Wish you good luck

Thanks for your reply, but another problem comes up when compiling on a 2080 Ti (compute capability 7.5). Build command: ./build.sh -DPPLNN_USE_LLM_CUDA=ON -DPPLNN_CUDA_ENABLE_NCCL=ON -DPPLNN_ENABLE_CUDA_JIT=OFF -DPPLNN_CUDA_ARCHITECTURES="'75'" -DPPLCOMMON_CUDA_ARCHITECTURES="'75'"

compile error: (screenshot of compiler output)

Vincent-syr commented 3 months ago

We only support GPU architectures >= sm_80, due to FlashAttention-2's requirements. Maybe you could switch to an A100 GPU and try again.

Zhiy-Zhang commented 3 months ago

> We only support gpu architecture >= sm_80, due to flash attention2's requirements. Maybe you could change to A100 GPU and try again

Thanks for your advice. Maybe I don't need to compile FlashAttention-2 at all; can you tell me how to change the build files to work around this problem?