isarsoft / yolov4-triton-tensorrt

This repository deploys YOLOv4 as an optimized TensorRT engine to Triton Inference Server
http://www.isarsoft.com

about deploying several yolov4 models #28

Closed michaelwithu closed 3 years ago

michaelwithu commented 3 years ago

To deploy different yolov4 models, I changed the layer name "YoloLayer_TRT" to "YoloLayer_original_TRT" in both "yolov4-triton-tensorrt/networks/yolov4.h" and "yolov4-triton-tensorrt/layers/yololayer.cu", and I changed the class names and the namespace "Yolo" to "Yolo_original".

In "yolov4-triton-tensorrt/layers/yololayer.cu":

const char* YoloLayerPlugin_original::getPluginType() const { return "YoloLayer_original_TRT"; }

In "yolov4-triton-tensorrt/networks/yolov4.h":

auto creator = getPluginRegistry()->getPluginCreator("YoloLayer_original_TRT", "1");
const PluginFieldCollection* pluginData = creator->getFieldNames();
IPluginV2* pluginObj = creator->createPlugin("yololayer", pluginData);
ITensor* inputTensors_yolo[] = {conv138->getOutput(0), conv149->getOutput(0), conv160->getOutput(0)};
auto yolo = network->addPluginV2(inputTensors_yolo, 3, pluginObj);

After these changes I built two plugin .so files: plugin_1.so -> model_1 and plugin_2.so -> model_2.

I then start Triton Inference Server with this command:

docker run --gpus all --rm --shm-size=1g --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v$(pwd)/triton-deploy/models:/models -v$(pwd)/triton-deploy/plugins:/plugins --env LD_PRELOAD=/plugins/plugin_1.so:/plugins/plugin_2.so nvcr.io/nvidia/tritonserver:20.08-py3 tritonserver --model-repository=/models --strict-model-config=false --grpc-infer-allocation-pool-size=16 --log-verbose 1

The server runs. But when I use client.py to run the models, model1 gives correct detections while model2 does not.

When I swap the order and change it to "LD_PRELOAD=/plugins/plugin_2.so:/plugins/plugin_1.so", model2 is OK, but model1 is not.

I don't have a config.pbtxt. What is wrong with my setup?

philipp-schmidt commented 3 years ago

Sorry for the late reply. I believe you did everything correctly - I would have used the same steps.

Can you try to only load a single plugin and check whether the server crashes on start? It should be unable to load one of the two models if one of the two plugins is missing.
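For example, something like this (a sketch adapting the command from the original post, preloading only plugin_1.so; if the plugins are registered correctly, the server should then fail to load model_2):

docker run --gpus all --rm --shm-size=1g --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v$(pwd)/triton-deploy/models:/models -v$(pwd)/triton-deploy/plugins:/plugins --env LD_PRELOAD=/plugins/plugin_1.so nvcr.io/nvidia/tritonserver:20.08-py3 tritonserver --model-repository=/models --strict-model-config=false --grpc-infer-allocation-pool-size=16 --log-verbose 1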

philipp-schmidt commented 3 years ago

Hi @michaelwithu Could you check whether loading a single plugin crashes it?

chull434 commented 3 years ago

Hi, I have the same issue: trying to load a yolov3 and a yolov4 model into Triton, it seems to only load the first plugin, not the second. Both plugins work fine on their own, because when you flip the order it loads the other model.

I tried raising this on the Triton server git repo but didn't get much traction.

I haven't been able to prove whether it's an issue with the plugins and namespaces still overlapping, or whether Triton is bugged and only loads one plugin.

I'm using the tensorrtx repo for building my yolov3 engine.

chull434 commented 3 years ago

@philipp-schmidt any ideas on how to debug this issue a bit more?

Is there some way to list all the plugins Triton has loaded?

Or maybe it's a TensorRT issue rather than a Triton one?
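(For reference, one way to inspect this is to dump the global TensorRT plugin registry: every plugin built with REGISTER_TENSORRT_PLUGIN registers its creator there when its .so is loaded. The sketch below is not part of this repo, and the file name list_plugins.cpp and build line are assumptions.)

// list_plugins.cpp - hypothetical helper, not part of this repo.
// Build against TensorRT, then preload the custom plugin libraries, e.g.:
//   g++ list_plugins.cpp -o list_plugins -lnvinfer
//   LD_PRELOAD=/plugins/plugin_1.so:/plugins/plugin_2.so ./list_plugins
#include <iostream>
#include <NvInfer.h>

int main() {
    // Plugins registered with REGISTER_TENSORRT_PLUGIN appear here once their .so is loaded.
    int numCreators = 0;
    nvinfer1::IPluginCreator* const* creators =
        getPluginRegistry()->getPluginCreatorList(&numCreators);
    for (int i = 0; i < numCreators; ++i) {
        std::cout << creators[i]->getPluginName() << " (version "
                  << creators[i]->getPluginVersion() << ", namespace \""
                  << creators[i]->getPluginNamespace() << "\")" << std::endl;
    }
    return 0;
}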

chull434 commented 3 years ago

nvm got it to work lol

Had another go and completely renamed all the "yolo" names to "yolov3" or "yolov4", and it now works. Must have been a namespace thing or something.

greenkarson commented 3 years ago

nvm got it to work lol

Had another go and completely renamed all the "yolo" names to "yolov3" or "yolov4", and it now works. Must have been a namespace thing or something.

How do you rename the yolov4 namespace? I changed the namespace, but it did not work.

chull434 commented 3 years ago

Everything named YoloPluginCreator and YoloLayerPlugin in 'yololayer.h' and 'yololayer.cu', and, in the snippet below, "YoloLayer_TRT" and "yololayer" in yolo.cpp:

auto creator = getPluginRegistry()->getPluginCreator("YoloLayer_TRT", "1");
const PluginFieldCollection* pluginData = creator->getFieldNames();
IPluginV2 *pluginObj = creator->createPlugin("yololayer", pluginData);

So you would have, for example, YoloModelAPluginCreator and YoloModelALayerPlugin in 'yolo_model_a_layer.h' and 'yolo_model_a_layer.cu', etc., or whatever naming pattern you want. Renaming all of those should work.
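As a sketch of what the full rename covers for one model (assuming a tensorrtx-style yololayer plugin; the YoloModelA* class names, type string, and file names below are illustrative, not the repo's actual code):

// yolo_model_a_layer.h (illustrative names)
namespace YoloModelA {                        // was: namespace Yolo
    // anchors, class count, input dims, Detection struct, ...
}

class YoloModelALayerPlugin : public nvinfer1::IPluginV2IOExt {   // was: YoloLayerPlugin
    // ...
    const char* getPluginType() const override { return "YoloModelALayer_TRT"; }  // was: "YoloLayer_TRT"
};

class YoloModelAPluginCreator : public nvinfer1::IPluginCreator { // was: YoloPluginCreator
    // ...
    const char* getPluginName() const override { return "YoloModelALayer_TRT"; }
};

// The registration macro must also use the renamed creator class:
REGISTER_TENSORRT_PLUGIN(YoloModelAPluginCreator);

// Network-building code: look the plugin up by the new type string
auto creator = getPluginRegistry()->getPluginCreator("YoloModelALayer_TRT", "1");

Renaming only the type string but keeping the old class and namespace names is what seems to make two preloaded plugin libraries collide.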

austingg commented 3 years ago

@michaelwithu You can use a config.pbtxt in Triton Inference Server for multiple instances of the same model, and add another model to the model repository for a different yolov4 model with different weights but the same box decoder (yololayer.cu):

model_repository/
    -- yolov4_a/
        -- config.pbtxt
        -- 1/
            -- model.plan
    -- yolov4_b/1/model.plan
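A minimal config.pbtxt for one of the models might look like the sketch below. The input/output names, data types, and dims are assumptions (check your engine's actual bindings, e.g. in the model metadata Triton reports when started with --strict-model-config=false); instance_group is the part that gives you multiple instances of the same model.

name: "yolov4_a"
platform: "tensorrt_plan"
max_batch_size: 1
input [
  {
    name: "data"            # assumed binding name
    data_type: TYPE_FP32
    dims: [ 3, 608, 608 ]   # assumed input size
  }
]
output [
  {
    name: "prob"            # assumed binding name
    data_type: TYPE_FP32
    dims: [ 7001, 1, 1 ]    # assumed output size
  }
]
instance_group [
  {
    count: 2                # run two execution instances of this model
    kind: KIND_GPU
  }
]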