triton-inference-server / fil_backend

FIL backend for the Triton Inference Server

Support for sparse input data conversion? #195

Open LuBingtan opened 2 years ago

LuBingtan commented 2 years ago

For example, the input data for my XGBoost model has 10000 dimensions, and I want to use the Compressed Sparse Row (CSR) format to send requests to the Triton server.

Is there any way to do this?

FYI, here is my config.pbtxt

name: "xgb"
backend: "fil"
input: <
  name: "input__0"
  data_type: TYPE_FP32
  dims: -1
  dims: 10000
>
output: <
  name: "output__0"
  data_type: TYPE_FP32
  dims: -1
  dims: 2
>
parameters: <
  key: "model_type"
  value: <
    string_value: "xgboost"
  >
>
parameters: <
  key: "output_class"
  value: <
    string_value: "true"
  >
>
parameters: <
  key: "predict_proba"
  value: <
    string_value: "true"
  >
>
parameters: <
  key: "threshold"
  value: <
    string_value: "0.5"
  >
>
wphicks commented 2 years ago

@LuBingtan I'm terribly sorry I didn't see this sooner! No, we do not currently support sparse input. If you're interested in having that available, I can convert this issue to a feature request. Is your primary goal to reduce the overhead of transferring such large input arrays to the server?
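To put that transfer overhead in perspective, here is a rough back-of-the-envelope comparison of the dense FP32 payload Triton currently requires versus the equivalent CSR representation. The 1% density and 1000-row batch are assumptions for illustration only.

import scipy.sparse as sp
import numpy as np

rows, cols, density = 1000, 10000, 0.01
X_csr = sp.random(rows, cols, density=density, format="csr", dtype=np.float32)

# Dense FP32 payload the server currently needs: 4 bytes per feature
dense_bytes = rows * cols * 4

# CSR stores the non-zero values, their column indices, and row pointers
csr_bytes = X_csr.data.nbytes + X_csr.indices.nbytes + X_csr.indptr.nbytes

print(f"dense: {dense_bytes / 1e6:.1f} MB, CSR: {csr_bytes / 1e6:.1f} MB")

At 1% density the CSR payload is roughly 50x smaller, which is the kind of saving sparse input support would make possible.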