jolibrain / deepdetect

Deep Learning API and Server in C++14 with support for Caffe, PyTorch, TensorRT, Dlib, NCNN, Tensorflow, XGBoost and TSNE

Tensorrt get whole output distributions #718

Closed YaYaB closed 4 years ago

YaYaB commented 4 years ago


Your question / the problem you're facing:

I have an issue getting the whole distribution of predictions using TensorRT. I took the model named age_real, available on dd's website.

Error message (if any) / steps to reproduce the problem:

API call

./dede --port 8080

Server log output

DeepDetect [ commit 6d6c79aaf43171a93dba38ba79ac5f0207f21c71 ]
[2020-03-31 17:39:39.638] [api] [info] Running DeepDetect HTTP server on localhost:8080


- Create Prediction

API call

curl -X POST "http://localhost:8080/predict" -d '{ "service":"age", "parameters":{ "input":{ "width":224, "height":224 }, "output":{ "best": -1 }, "mllib":{ "gpu": true, "gpuid":0 } }, "data":[""] }'

Server log output:


Here I got the whole distribution of the predictions by setting the "best" flag to -1.
Now let's try to do the same with TensorRT v5.1.

- Launch Dede

API call

./dede --port 8080

Server log output

DeepDetect [ commit 6d6c79aaf43171a93dba38ba79ac5f0207f21c71 ]
[2020-03-31 17:39:39.638] [api] [info] Running DeepDetect HTTP server on localhost:8080

- Create service

API call

curl -X PUT "http://localhost:8080/services/age" -d '{ "mllib":"tensorrt", "description":"object detection service", "type":"supervised", "parameters":{ "input":{ "connector":"image", "height": 224, "width": 224 }, "mllib":{ "datatype": "fp32", "maxBatchSize": 1, "maxWorkspaceSize": 6096, "tensorRTEngineFile": "TRTengine_bs", "gpuid":0 } }, "model":{ "repository":"/mnt/terabox/research/age-classification/models/yaya/age" } }'

Server log output


- Create Prediction

API call

curl -X POST "http://localhost:8080/predict" -d '{ "service":"age", "parameters":{ "input":{ "width":224, "height":224 }, "output":{ "best": -1 }, "mllib":{ "gpu": true, "gpuid":0 } }, "data":[""] }'

Server log output:


As you can see, I get an empty prediction. However, if I remove "best", I get the best_match.


Now if I set "best" to 1 or another value, I get a very different result, with category 0 at a low probability:


I would like to get the whole distribution, but it seems that the "best" parameter does not work as it should.

beniz commented 4 years ago

Can you try "best":0 ? I believe we have a wrong test against 0 instead of < 0.

beniz commented 4 years ago

Actually the pathway is wrong. @fantes, maybe I can take this; it should go through the supervised connector instead.

fantes commented 4 years ago

What do you mean by "the pathway is wrong"?

fantes commented 4 years ago

You mean it should not be filtered in tensorrtlib, and the supervised output connector should do it instead?

fantes commented 4 years ago

in this case it seems the only thing to do is to remove code from tensorrtlib, i can handle it, i am tired of fighting against torch/c++ :)
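
The behaviour discussed in the comments above (a negative "best" meaning "return every class", with the filtering done once in the output stage rather than in each backend) can be illustrated with a minimal sketch. This is not DeepDetect's code: `filter_best` is a hypothetical helper, and treating only negative values as "return everything" is an assumption taken from the comment about testing `< 0`.

```python
from typing import List, Tuple


def filter_best(probs: List[float], best: int) -> List[Tuple[int, float]]:
    """Hypothetical output-stage filter applied to a full distribution.

    probs: one probability per class, as returned unfiltered by a backend.
    best:  number of top classes to keep; any negative value keeps them all.
    """
    # Rank (class_index, probability) pairs from most to least likely.
    ranked = sorted(enumerate(probs), key=lambda item: item[1], reverse=True)
    # Assumption: the test is "best < 0", not "best == 0".
    if best < 0 or best > len(ranked):
        best = len(ranked)
    return ranked[:best]


probs = [0.1, 0.7, 0.2]
print(filter_best(probs, -1))  # [(1, 0.7), (2, 0.2), (0, 0.1)]
print(filter_best(probs, 1))   # [(1, 0.7)]
```

Keeping the filter outside the backend means every library returns the same full distribution and the "best" semantics are defined in one place.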

YaYaB commented 4 years ago

> Can you try "best":0 ? I believe we have a wrong test against 0 instead of < 0.

Yep, it gives an empty prediction.

fantes commented 4 years ago

Hi, this should be fixed by : @YaYaB thank you very much for the very precise bug report, it helps a lot for testing :)

YaYaB commented 4 years ago

Great, anytime :) I'll test it tonight and close the issue if it resolves everything on my side!

YaYaB commented 4 years ago

It fixes the issue on my side (tried with "best" equal to -1, 1 and several larger values).
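
For anyone wanting to repeat this check, here is a rough sketch that builds the same /predict request body as the curl calls above for several "best" values. The helper names `build_predict_payload` and `predict` are hypothetical, and the image passed in "data" is a placeholder, since the original calls do not show it.

```python
import json
import urllib.request


def build_predict_payload(service, best, image, width=224, height=224, gpuid=0):
    """Build the JSON body used in the /predict calls of this issue."""
    return {
        "service": service,
        "parameters": {
            "input": {"width": width, "height": height},
            "output": {"best": best},
            "mllib": {"gpu": True, "gpuid": gpuid},
        },
        # Placeholder: the image location is not given in the issue.
        "data": [image],
    }


def predict(payload, url="http://localhost:8080/predict"):
    """POST the payload to a running dede server and return the parsed reply."""
    request = urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"), method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running and the "age" service created, calling `predict(build_predict_payload("age", best, image))` for `best` in `(-1, 1, 5)` should return a reply for each value, which can then be compared between the two backends.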