meta-soul / MetaSpore

A unified end-to-end machine intelligence platform
Apache License 2.0

[serving] Update usage of asio-grpc #151

Closed: Tradias closed this issue 2 years ago

Tradias commented 2 years ago

Hi DMetaSoul team,

I have noticed that you are using asio-grpc, and I hope you are finding it useful. Let me know if you run into any issues with it; I am happy to help.

You could make use of some of its more recent features to simplify the code and improve its performance. I also noticed a surprising design with the global GrpcContext which I think is not necessary.

Let me know if you would like me to create a pull request.

codingfun2022 commented 2 years ago

Hi Tradias,

Thank you for the useful asio-grpc open-source project. We appreciate you opening this issue for updating the usage of asio-grpc.

The global GrpcContext exists because the metaspore-serving-bin acts as both a server and a client, and we wanted to use a single shared GrpcContext.

Later we noticed that it is recommended to use one GrpcContext per thread for performance (https://tradias.github.io/asio-grpc/classagrpc_1_1_grpc_context.html#a977dfa398018fd3d62017f45ef05bc9e), and we planned to make metaspore-serving-bin multi-threaded.

We found the sample code (https://github.com/Tradias/grpc_bench/blob/master/cpp_asio_grpc_coroutine_bench/main.cpp) useful for making asio-grpc servers multi-threaded, but we haven't found sample code which makes asio-grpc clients multi-threaded.
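The per-thread context recommendation discussed above generalizes beyond asio-grpc: each thread owns one event-processing context and drains only that context, so the context itself needs no cross-thread synchronization while it runs. A stdlib-only sketch of that shape (ToyContext is an illustrative stand-in for agrpc::GrpcContext, not asio-grpc API; a real GrpcContext::run() blocks on the gRPC CompletionQueue instead of draining a plain queue):

```cpp
#include <atomic>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Toy stand-in for agrpc::GrpcContext: a work queue that exactly one
// thread drains with run().
class ToyContext {
public:
    void post(std::function<void()> f) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(std::move(f));
    }
    // Drain all currently queued work and return.
    void run() {
        for (;;) {
            std::function<void()> f;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (queue_.empty()) return;
                f = std::move(queue_.front());
                queue_.pop();
            }
            f();
        }
    }
private:
    std::mutex mutex_;
    std::queue<std::function<void()>> queue_;
};

// One context per thread: create N contexts, post work to each, then
// let each thread run exactly one context to completion.
int run_one_context_per_thread(int num_threads, int jobs_per_thread) {
    std::vector<ToyContext> contexts(num_threads);
    std::atomic<int> handled{0};
    for (auto& ctx : contexts)
        for (int i = 0; i < jobs_per_thread; ++i)
            ctx.post([&handled] { handled.fetch_add(1); });
    std::vector<std::thread> threads;
    for (auto& ctx : contexts)
        threads.emplace_back([&ctx] { ctx.run(); });  // no context shared between threads
    for (auto& t : threads) t.join();
    return handled.load();
}
```

The key property is that the `run()` loop of each context is only ever entered by one thread, which is what makes the single-threaded fast path of a per-thread GrpcContext possible.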

dmetasoul01 commented 2 years ago

Hi @Tradias , asio-grpc is a great project that enables us to build gRPC servers/clients with C++20 coroutines together with Boost.Asio.

In MetaSpore we are building a machine learning model serving service that involves many different asynchronous operations: loading files from disk/S3, running model inference on a CPU thread pool or a GPU device, calling other remote gRPC/HTTP services, and finally exposing itself as a gRPC service. We therefore chose C++20 coroutines and asio-grpc as basic building blocks, and the overall development experience with asio-grpc has been quite smooth and productive.

"You could make use of some of its more recent features to simplify the code and improve its performance."

We haven't checked out the latest improvements regarding API and performance. Could you point me to some doc/code links?

"I also noticed a surprising design with the global GrpcContext which I think is not necessary."

As @codingfun2022 mentioned, we have both the gRPC server and client in the same process, and we'd like a separate set of threads for handling gRPC networking. Therefore we use a global GrpcContext (which we plan to make multi-threaded in the future) and share it between the server and the client. We are not sure if this is the idiomatic way to achieve that goal.

Tradias commented 2 years ago

I only had a brief look at your codebase, so here are the things that I have spotted:

Instead of the CoSpawner in https://github.com/meta-soul/MetaSpore/blob/main/cpp/serving/grpc_server.cpp you can use:

    agrpc::repeatedly_request(
        &Predict::AsyncService::RequestPredict, context_->predict_service,
        boost::asio::bind_executor(
            *context_->grpc_context,
            [&](grpc::ServerContext &ctx, PredictRequest &req,
                grpc::ServerAsyncResponseWriter<PredictReply>& writer) -> awaitable<void> {
                auto find_model = ModelManager::get_model_manager().get_model(req.model_name());
                if (!find_model.ok()) {
                    co_await respond_error(writer, find_model.status());
                } else {
                    // convert grpc to fe input
                    std::string ex;
                    try {
                        auto reply_result = co_await(*find_model)->predict(req);
                        if (!reply_result.ok()) {
                            co_await respond_error(writer, reply_result.status());
                        } else {
                            co_await agrpc::finish(writer, *reply_result, grpc::Status::OK,
                                                   boost::asio::use_awaitable);
                        }
                    } catch (const std::exception &e) {
                        // unknown exception
                        ex = e.what();
                    }
                    if (!ex.empty())
                        co_await respond_error(writer, absl::UnknownError(std::move(ex)));
                }
                co_return;
            }));

which can be seen in this example from asio-grpc.

The lines in https://github.com/meta-soul/MetaSpore/blob/main/cpp/serving/py_preprocessing_model.cpp#L136-L144 can be simplified to:

awaitable_result<std::unique_ptr<PyPreprocessingModelOutput>>
PyPreprocessingModel::do_predict(std::unique_ptr<PyPreprocessingModelInput> input) {
    auto output = std::make_unique<PyPreprocessingModelOutput>();
    grpc::ClientContext client_context;
    std::shared_ptr<agrpc::GrpcContext> grpc_context = context_->grpc_context_;
    grpc::Status status;
    const auto reader = agrpc::request(&Predict::Stub::AsyncPredict, client_context, input->request, *grpc_context);
    co_await agrpc::finish(reader, output->reply, status, boost::asio::bind_executor(*grpc_context, boost::asio::use_awaitable));
    if (!status.ok())
        co_return absl::FailedPreconditionError(fmt::format("preprocessing failed: {}", status.error_message()));
    co_return output;
}

That is based on the use of agrpc RPC functions from a different thread within a coroutine, which is explained in the Attention box in the documentation.

How to run a client on multiple threads can be seen in this example: https://github.com/Tradias/example-vcpkg-grpc/blob/asio-grpc-14/client.cpp Note that creating multiple grpc::Channels is really only needed when you have a very high number of threads and contention. In my experiments I have found that one channel is easily sufficient for up to 12 threads, probably more.

You can of course use one GrpcContext for both servers and clients. I personally avoid global variables as much as possible, as they complicate writing tests and reasoning about code. I would create the grpc::Server and all the GrpcContexts in a central place, pass each GrpcContext to a register_request_handler function that invokes agrpc::repeatedly_request to set up request handling for the RPC server, and wrap the contexts in a selection strategy like RoundRobin for use in the RPC client.
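A round-robin selection strategy like the one suggested here boils down to an atomic counter over a fixed pool of contexts. A minimal stdlib-only sketch (this RoundRobin is an illustrative reimplementation of the idea, not the asio-grpc type):

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Minimal round-robin picker over a fixed pool of objects, similar in
// spirit to wrapping multiple GrpcContexts for client-side use: each
// call hands out the next element, wrapping around at the end.
template <typename T>
class RoundRobin {
public:
    explicit RoundRobin(std::vector<T>& pool) : pool_(pool) {}
    // The atomic counter makes concurrent next() calls safe; relaxed
    // ordering suffices because only the counter value matters.
    T& next() {
        const std::size_t i = counter_.fetch_add(1, std::memory_order_relaxed);
        return pool_[i % pool_.size()];
    }
private:
    std::vector<T>& pool_;
    std::atomic<std::size_t> counter_{0};
};
```

In the scenario described above, the pool would hold the GrpcContexts created in the central place, and each outgoing RPC would call next() to pick the context to run on.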

Tradias commented 2 years ago

I have moved the multi-threaded client and server examples into asio-grpc and improved them a little:
https://github.com/Tradias/asio-grpc/blob/master/example/multi-threaded-client.cpp
https://github.com/Tradias/asio-grpc/blob/master/example/multi-threaded-server.cpp

codingfun2022 commented 2 years ago

Hi @Tradias ,

I copied the multi-threaded client example code, but it fails to compile; the multi-threaded server example compiles fine.

https://github.com/codingfun2022/hello_agrpc/blob/main/cpp/hello_agrpc/hello_agrpc_client.cpp

The following line

agrpc::request(&hello_agrpc::Greeter::Stub::AsyncSayHello, stub, client_context, request, grpc_context);

has no matching overload. Could this be related to the versions of Boost.Asio or asio-grpc?

Here is the error message.

/home/codingfun2022/Desktop/agrpc-test/hello_agrpc/cpp/hello_agrpc/hello_agrpc_client.cpp: In function ‘boost::asio::awaitable<void> make_request(agrpc::b::GrpcContext&, hello_agrpc::Greeter::Stub&)’:
/home/codingfun2022/Desktop/agrpc-test/hello_agrpc/cpp/hello_agrpc/hello_agrpc_client.cpp:70:23: error: no match for call to ‘(const agrpc::b::detail::RequestFn) (std::unique_ptr<grpc::ClientAsyncResponseReader<hello_agrpc::HelloReply> > (hello_agrpc::Greeter::Stub::*)(grpc::ClientContext*, const hello_agrpc::HelloRequest&, grpc::CompletionQueue*), hello_agrpc::Greeter::Stub&, grpc::ClientContext&, hello_agrpc::HelloRequest&, agrpc::b::GrpcContext&)’
   70 |         agrpc::request(&hello_agrpc::Greeter::Stub::AsyncSayHello, stub, client_context, request, grpc_context);
      |         ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /home/codingfun2022/Desktop/agrpc-test/hello_agrpc/build/vcpkg_installed/x64-linux/include/agrpc/detail/repeatedlyRequest.hpp:27,
                 from /home/codingfun2022/Desktop/agrpc-test/hello_agrpc/build/vcpkg_installed/x64-linux/include/agrpc/repeatedlyRequest.hpp:20,
                 from /home/codingfun2022/Desktop/agrpc-test/hello_agrpc/build/vcpkg_installed/x64-linux/include/agrpc/asioGrpc.hpp:38,
                 from /home/codingfun2022/Desktop/agrpc-test/hello_agrpc/cpp/hello_agrpc/hello_agrpc_client.cpp:31:
/home/codingfun2022/Desktop/agrpc-test/hello_agrpc/build/vcpkg_installed/x64-linux/include/agrpc/rpc.hpp:63:10: note: candidate: ‘template<class RPC, class Service, class Request, class Responder, class CompletionToken> auto agrpc::b::detail::RequestFn::operator()(agrpc::b::detail::ServerMultiArgRequest<RPC, Request, Responder>, Service&, grpc::ServerContext&, Request&, Responder&, CompletionToken&&) const’
   63 |     auto operator()(detail::ServerMultiArgRequest<RPC, Request, Responder> rpc, Service& service,
      |          ^~~~~~~~
/home/codingfun2022/Desktop/agrpc-test/hello_agrpc/build/vcpkg_installed/x64-linux/include/agrpc/rpc.hpp:63:10: note:   template argument deduction/substitution failed:
/home/codingfun2022/Desktop/agrpc-test/hello_agrpc/cpp/hello_agrpc/hello_agrpc_client.cpp:70:23: note:   mismatched types ‘void’ and ‘std::unique_ptr<grpc::ClientAsyncResponseReader<hello_agrpc::HelloReply> >’
   70 |         agrpc::request(&hello_agrpc::Greeter::Stub::AsyncSayHello, stub, client_context, request, grpc_context);
      |         ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /home/codingfun2022/Desktop/agrpc-test/hello_agrpc/build/vcpkg_installed/x64-linux/include/agrpc/detail/repeatedlyRequest.hpp:27,
                 from /home/codingfun2022/Desktop/agrpc-test/hello_agrpc/build/vcpkg_installed/x64-linux/include/agrpc/repeatedlyRequest.hpp:20,
                 from /home/codingfun2022/Desktop/agrpc-test/hello_agrpc/build/vcpkg_installed/x64-linux/include/agrpc/asioGrpc.hpp:38,
                 from /home/codingfun2022/Desktop/agrpc-test/hello_agrpc/cpp/hello_agrpc/hello_agrpc_client.cpp:31:
/home/codingfun2022/Desktop/agrpc-test/hello_agrpc/build/vcpkg_installed/x64-linux/include/agrpc/rpc.hpp:92:10: note: candidate: ‘template<class RPC, class Service, class Responder, class CompletionToken> auto agrpc::b::detail::RequestFn::operator()(agrpc::b::detail::ServerSingleArgRequest<RPC, Responder>, Service&, grpc::ServerContext&, Responder&, CompletionToken&&) const’
   92 |     auto operator()(detail::ServerSingleArgRequest<RPC, Responder> rpc, Service& service,
      |          ^~~~~~~~
Tradias commented 2 years ago

@codingfun2022 Thanks for trying it out. The example was written against v1.7.0 of asio-grpc. For older versions, replace that line with:

const auto reader = stub.AsyncSayHello(&client_context, request, agrpc::get_completion_queue(grpc_context));

If you want to use v1.7.0 of asio-grpc then update the builtin-baseline in your vcpkg.json to at least e2794913924ef606cc7bbc59c173f2714e9853c5.
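For reference, builtin-baseline is a top-level field of the vcpkg manifest. A minimal vcpkg.json might look like this (the name, version, and dependency list are illustrative; only the baseline hash comes from the comment above):

```json
{
  "name": "hello-agrpc",
  "version-string": "0.1.0",
  "builtin-baseline": "e2794913924ef606cc7bbc59c173f2714e9853c5",
  "dependencies": [
    "asio-grpc",
    "grpc"
  ]
}
```

The builtin-baseline pins the vcpkg registry to a specific commit, so vcpkg resolves each dependency to the port version present at that commit or newer.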

codingfun2022 commented 2 years ago

@Tradias Thanks. The multi-threaded client example compiles now. I will see how to update the usage of asio-grpc in this repo according to the examples.