PaddlePaddle / FastDeploy

⚡️An easy-to-use and fast deep learning model deployment toolkit for ☁️Cloud, 📱Mobile, and 📹Edge, covering 20+ mainstream scenarios across image, video, text, and audio, with 150+ SOTA models, end-to-end optimization, and multi-platform, multi-framework support.
https://www.paddlepaddle.org.cn/fastdeploy
Apache License 2.0

Memory/VRAM leak when using GPU inference #2200

Closed. YOU-007 closed this issue 2 weeks ago.

YOU-007 commented 1 year ago

Environment

#include "fastdeploy/vision.h"
#include <iostream>
#include <chrono>
#include <thread>

#ifdef _WIN32
const char sep = '\\';
#else
const char sep = '/';
#endif

void CpuInfer(const std::string& model_dir, const std::string& image_file) {
    std::cout << "Waiting for 5 seconds..." << std::endl;
    std::this_thread::sleep_for(std::chrono::seconds(5)); 
    std::cout << "Done!" << std::endl;
    std::cout << model_dir << typeid(model_dir).name() << std::endl;
    auto model_file = model_dir + sep + "model.pdmodel";
    auto params_file = model_dir + sep + "model.pdiparams";
    auto config_file = model_dir + sep + "infer_cfg.yml";
    auto option = fastdeploy::RuntimeOption();
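    // Device/backend selection: with UseCpu() the memory is released after the
    // model is destroyed; switching to UseGpu() reproduces the leak reported here.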
    option.UseCpu();
    //option.UseGpu();
    option.UseOpenVINOBackend();
    //option.UseOrtBackend();
    //option.UseTrtBackend();
    std::shared_ptr<fastdeploy::vision::detection::PPYOLOE> model =
        std::make_shared<fastdeploy::vision::detection::PPYOLOE>(
            model_file, params_file, config_file, option);
    /* auto model = fastdeploy::vision::detection::PPYOLOE(model_file, params_file, config_file, option);*/
    if (!model->Initialized()) {
        std::cerr << "Failed to initialize." << std::endl;
        return;
    }

    auto im = cv::imread(image_file);
    fastdeploy::vision::DetectionResult res;

    // Run several predictions before destroying the model.
    for (int i = 0; i < 30; i++) {
        if (!model->Predict(im, &res)) {
            std::cerr << "Failed to predict." << std::endl;
            return;
        }
    }
    std::cout << "delete!" << std::endl;
    model.reset();
    model = nullptr;
    std::cout << "delete Done!" << std::endl;

    std::cout << "Waiting for 5 seconds..." << std::endl;
    // At this point the memory used by CPU inference is released, but GPU inference memory is not fully released.
    std::this_thread::sleep_for(std::chrono::seconds(5)); 
    std::cout << "Done!" << std::endl;
}

int main(int argc, char* argv[]) {
    if (argc < 4) {
        std::cout
            << "Usage: infer_demo path/to/model_dir path/to/image run_option, "
            "e.g ./infer_model ./ppyoloe_model_dir ./test.jpeg 0"
            << std::endl;
        std::cout << "The data type of run_option is int, 0: run with cpu; 1: run "
            "with gpu; 2: run with gpu and use tensorrt backend; 3: run with kunlunxin."
            << std::endl;
        return -1;
    }
    CpuInfer(argv[1], argv[2]);
    return 0;
}
jiangjiajun commented 1 year ago

GPU memory indeed cannot be released at the moment. If you need to release it, the current approach is to load the model in a subprocess; when that process exits, the GPU memory is freed.
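
A minimal sketch of the subprocess workaround described above, assuming a POSIX system (fork/waitpid); GpuInferOnce is a hypothetical helper that mirrors CpuInfer from the report, but with option.UseGpu() enabled:

#include <sys/wait.h>
#include <unistd.h>

#include <iostream>
#include <string>

// Hypothetical helper: same body as CpuInfer above, but with option.UseGpu().
void GpuInferOnce(const std::string& model_dir, const std::string& image_file);

void InferInSubprocess(const std::string& model_dir, const std::string& image_file) {
    pid_t pid = fork();
    if (pid == 0) {
        // Child process: load the model and run inference, then exit.
        GpuInferOnce(model_dir, image_file);
        _exit(0);  // GPU memory is reclaimed by the driver when the child exits.
    } else if (pid > 0) {
        // Parent process: wait for the child; its GPU memory is freed once it exits.
        int status = 0;
        waitpid(pid, &status, 0);
    } else {
        std::cerr << "fork failed" << std::endl;
    }
}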

SchrodingerLLX commented 1 year ago

GPU memory indeed cannot be released at the moment. If you need to release it, the current approach is to load the model in a subprocess; when that process exits, the GPU memory is freed.

So is there a plan to support releasing GPU memory?