triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Encountering a segmentation fault issue when attempting to send multiple images via gRPC #6891

Open lawliet0823 opened 7 months ago

lawliet0823 commented 7 months ago

I used ensemble_image_client.cc as a reference and attempted to run inference on multiple images with a single request. However, I encountered a segmentation fault.

Here is my code:

void clientSendImages(const std::vector<cv::Mat> images, std::string model_name_, std::string model_version_)
{
    const std::string url("0.0.0.0:8001");
    const std::string model_name(model_name_);

    std::unique_ptr<tc::InferenceServerGrpcClient> grpcClient;
    FAIL_IF_ERR(tc::InferenceServerGrpcClient::Create(&grpcClient, url, false), "Error creating grpc client");

    tc::InferInput *input;
    const std::string input_name = "images";
    const std::vector<int64_t> input_shape = {10, 3, 640, 640};
    FAIL_IF_ERR(tc::InferInput::Create(&input, input_name, input_shape, "FP32"), "Error: Creating input failed!");
    std::shared_ptr<tc::InferInput> input_ptr(input);

    tc::InferRequestedOutput *output;
    FAIL_IF_ERR(tc::InferRequestedOutput::Create(&output, "output0", 0), "Error: Creating output failed!");
    std::shared_ptr<tc::InferRequestedOutput> output_ptr(output);

    std::vector<tc::InferInput *> inputs = {input_ptr.get()};
    std::vector<const tc::InferRequestedOutput *> outputs = {output_ptr.get()};

    for (const auto &image : images)
    {
        std::vector<float> image_data = preprocess(image);
        FAIL_IF_ERR(input_ptr->AppendRaw(reinterpret_cast<const uint8_t *>(image_data.data()), image_data.size() * sizeof(float)), "Error: Setting input data failed!");
    }

    tc::InferOptions common_options(model_name_);
    common_options.model_version_ = model_version_;
    common_options.client_timeout_ = 0;

    tc::InferResult *results;
    FAIL_IF_ERR(grpcClient->Infer(&results, common_options, inputs, outputs),
                "unable to run model");

    std::unique_ptr<tc::InferResult> results_ptr(results);
}

I intend to process 10 images using the preprocess function (which has been tested successfully for a single image). I'm wondering where the issue might stem from.
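[Editor's note] One detail worth knowing about the loop above: per the Triton C++ client documentation, `InferInput::AppendRaw` does not copy the bytes it is given; it records the caller's pointer, so each buffer must stay alive until `Infer()` returns. A loop-local `std::vector<float>` is destroyed at the end of every iteration, leaving the recorded pointers dangling. A minimal sketch of the safe ownership pattern, using a hypothetical `FakeInput` stand-in rather than the real client:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for tc::InferInput. Like the real client's
// AppendRaw(), it records the caller's pointer instead of copying the
// bytes, so the pointed-to buffer must outlive the inference call.
struct FakeInput {
    std::vector<std::pair<const std::uint8_t*, std::size_t>> bufs;
    void AppendRaw(const std::uint8_t* data, std::size_t byte_size) {
        bufs.emplace_back(data, byte_size);
    }
};

// Safe pattern: the owning buffers live in `storage`, which the caller
// keeps alive until after the inference call. A vector re-created inside
// the append loop would be destroyed each iteration and leave `bufs`
// pointing at freed memory.
std::size_t appendAll(FakeInput& input,
                      const std::vector<std::vector<std::uint8_t>>& storage) {
    for (const auto& buf : storage) {
        input.AppendRaw(buf.data(), buf.size());
    }
    return input.bufs.size();
}
```

This mirrors why a `std::vector<std::vector<uint8_t>>` declared outside the loop (as in the updated code later in this thread) avoids the crash.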

kthui commented 7 months ago

Hi @lawliet0823, does the segmentation fault happen every time you run the code? Can you use a tool (e.g. gdb) to print the stack trace when the segmentation fault happens?

lawliet0823 commented 7 months ago

The problem occurs every time.

This is the stack trace printed out by GDB:

Thread 1 "grpc_client" received signal SIGSEGV, Segmentation fault.
__memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:317
317     ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.
(gdb) backtrace
#0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:317
#1  0x00007ffff68b7b23 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007ffff7846490 in triton::client::InferenceServerGrpcClient::PreRunProcessing(triton::client::InferOptions const&, std::vector<triton::client::InferInput*> const&, std::vector<triton::client::InferRequestedOutput const*> const&) () from /home/levi/Program/home-C/lib/libgrpcclient.so
#3  0x00007ffff7847928 in triton::client::InferenceServerGrpcClient::Infer(triton::client::InferResult**, triton::client::InferOptions const&, std::vector<triton::client::InferInput*> const&, std::vector<triton::client::InferRequestedOutput const*> const&, std::map<std::string, std::string> const&, grpc_compression_algorithm) () from /home/levi/Program/home-C/lib/libgrpcclient.so
#4  0x000055555555be02 in clientSendImages(std::vector<cv::Mat>, std::string, std::string) ()
#5  0x0000555555557b91 in processVideo(std::string) ()
#6  0x000055555555c266 in main ()

kthui commented 7 months ago

Do you still get the segmentation fault if you run ensemble_image_client.cc directly from the command line? If not, there may be a difference between the inputs that client passes into the triton::client::InferenceServerGrpcClient::Infer() function and the ones your code passes.

lawliet0823 commented 7 months ago

My updated code is as follows.

void clientSendImages(const std::vector<cv::Mat> images, std::string model_name_, std::string model_version_) 
{
    const std::string url("0.0.0.0:8001");
    std::unique_ptr<tc::InferenceServerGrpcClient> grpcClient;
    tc::Error err;

    FAIL_IF_ERR(tc::InferenceServerGrpcClient::Create(&grpcClient, url, false), "Error creating grpc client");

    tc::InferOptions options(model_name_);
    options.model_version_ = "1";

    tc::InferInput *input;
    const std::string input_name = "images";
    const std::vector<int64_t> input_shape = {10, 3, 640, 640};
    FAIL_IF_ERR(tc::InferInput::Create(&input, input_name, input_shape, "FP32"), "Error: Creating input failed");

    tc::InferRequestedOutput *output;
    // Request the raw output tensor (class_count defaults to 0, so no
    // classification results are requested)
    err = tc::InferRequestedOutput::Create(&output, "output0");
    if (!err.IsOk())
    {
        std::cerr << "unable to get output: " << err << std::endl;
        exit(1);
    }
    std::shared_ptr<tc::InferRequestedOutput> output_ptr(output);

    std::shared_ptr<tc::InferInput> input_ptr(input);
    std::vector<tc::InferInput *> inputs = {input_ptr.get()};
    std::vector<const tc::InferRequestedOutput *> outputs = {output_ptr.get()};

    std::vector<std::vector<uint8_t>> image_data;
    // FAIL_IF_ERR(input_ptr->Reset(), "Reset Failed!!!");
    for (int index = 0; index < 10; index++)
    {
        image_data.emplace_back();
        Preprocess(images[index], cv::Size(640, 640), &(image_data.back()));
    }

    for (int index = 0; index < 10; index++)
    {
        FAIL_IF_ERR(input_ptr->AppendRaw(image_data[index]), "AppendRaw Failed!!!");
    }

    tc::InferResult *results;
    FAIL_IF_ERR(grpcClient->Infer(&results, options, inputs, outputs), "Fail to get results");
    std::unique_ptr<tc::InferResult> results_ptr;
    results_ptr.reset(results);

    Postprocess(std::move(results_ptr), 10, images);
}

No segmentation fault occurred. My post-processing code is as follows.

void Postprocess(
    const std::unique_ptr<tc::InferResult> result,
    const size_t batch_size, const std::vector<cv::Mat> images)
{
    std::string output_name("output0");
    const int rows = 8400;
    const int dimensions = 84;

    if (!result->RequestStatus().IsOk())
    {
        std::cerr << "inference failed with error: " << result->RequestStatus()
                  << std::endl;
        exit(1);
    }

    // Get and validate the shape and datatype
    std::vector<int64_t> shape;
    FAIL_IF_ERR(result->Shape(output_name, &shape), "unable to get shape ");
    printf("shape: %ld %ld %ld\n", shape[0], shape[1], shape[2]);

    std::string datatype;
    FAIL_IF_ERR(result->Datatype(output_name, &datatype), "unable to get datatype");

    const uint8_t *rawData = nullptr;
    size_t byteSize = 0;

    FAIL_IF_ERR(result->RawData(output_name, &rawData, &byteSize), "Error: Unable to get output data for tensor");

    if (byteSize != 10 * rows * dimensions * sizeof(float))
    {
        std::cerr << "Unexpected byteSize: " << byteSize << std::endl;
        exit(1);
    }

    int numGroups = 10;
    size_t groupSize = rows * dimensions * sizeof(float); 
    for (int i = 0; i < numGroups; ++i)
    {
        const uint8_t *groupData = rawData + i * groupSize;
        size_t groupByteSize = groupSize;

        std::vector<int> classIds;
        std::vector<float> confidences;
        std::vector<cv::Rect> boxesAfterNMS;

        processDetectionResults(groupData, groupByteSize, classIds, confidences, boxesAfterNMS);

    printf("Detected objects: %zu\n", boxesAfterNMS.size());
    }
}

I am not sure whether my post-processing code is correct. The extracted detection results are all zero. Is there an error in the way the data is extracted from rawData? The shape and datatype values are correct.