triton-inference-server / dali_backend

The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's python API.
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
MIT License

Add sequence splitting option #161

Closed banasraf closed 1 year ago

banasraf commented 1 year ago

Signed-off-by: Rafal rbanas@nvidia.com

This PR adds support for a new parameter to the DALI backend. The parameter, "split_outer_dim", splits the output of a DALI pipeline along the outer axis. For example, if the output is a batch of 2 samples with shapes:

{(2, 300, 300), (3, 300, 300)}

it will be reshaped into a batch of 5 samples:

5x{(300, 300)}

This can be used to split sequences into sub-sequences (in cooperation with the reshape operator) or to split sequences into a batch of images.
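To make the reshaping concrete, here is a minimal NumPy sketch of the semantics described above. It is an illustration only, not the backend's implementation; the helper name `split_outer_dim` is borrowed from the parameter name and is otherwise hypothetical.

```python
import numpy as np

def split_outer_dim(batch):
    """Turn each sample of shape (N, ...) into N separate samples of shape (...)."""
    out = []
    for sample in batch:
        # Index along the outermost axis; each slice becomes its own sample.
        out.extend(sample[i] for i in range(sample.shape[0]))
    return out

# A batch of 2 samples with shapes (2, 300, 300) and (3, 300, 300).
batch = [np.zeros((2, 300, 300)), np.ones((3, 300, 300))]
result = split_outer_dim(batch)

print(len(result))      # 5
print(result[0].shape)  # (300, 300)
```

Sample order is preserved: the first two outputs come from the first input sample and the remaining three from the second.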

Usage in the config file looks like this:

parameters [
   {
      key: "split_outer_dim",
      value: {string_value: "OUTPUT1:OUTPUT2"}  # list of outputs to split
   }
]
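The value appears to be a colon-separated list of output names. As a rough sketch of how such a value could be interpreted (assuming `:` is the only separator, which is inferred from the example above; the backend's actual parsing may differ, and `parse_split_outputs` is a hypothetical helper):

```python
def parse_split_outputs(value):
    """Return the output names listed in a colon-separated parameter value."""
    # Empty segments (e.g. from a trailing colon) are skipped.
    return [name for name in value.split(":") if name]

names = parse_split_outputs("OUTPUT1:OUTPUT2")
print(names)  # ['OUTPUT1', 'OUTPUT2']
```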

Implementation required two changes:

dali-automaton commented 1 year ago

CI MESSAGE: [6367079]: BUILD STARTED

dali-automaton commented 1 year ago

CI MESSAGE: [6367079]: BUILD FAILED

dali-automaton commented 1 year ago

CI MESSAGE: [6375878]: BUILD STARTED

dali-automaton commented 1 year ago

CI MESSAGE: [6375878]: BUILD FAILED

szalpal commented 1 year ago

How about changing the name of the feature? I feel that something along the lines of join fits better than split. For example:

  1. When we have a "reshape" like: [(2, 480, 640, 3), (3, 480, 640, 3)] --> [(5, 480, 640, 3)], what we in fact do is join the samples along the outer-most dimension.
  2. Analogously, when we have [(5, 480, 640, 3)] --> [(2, 480, 640, 3), (3, 480, 640, 3)], we split the batch of samples along the outer-most dimension.

Looking into the numpy API, np.concatenate looks like what we do here: https://numpy.org/doc/stable/reference/generated/numpy.concatenate.html. There's also ndarray.flatten, although it has a specific meaning: it always flattens to a 1-D array.
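For reference, a small NumPy sketch of the two directions described in the numbered list, using `np.concatenate` for the join and `np.split` for the inverse. The arrays are placeholders chosen to match the shapes in the list; this only illustrates the naming argument, not what the backend does internally.

```python
import numpy as np

a = np.zeros((2, 480, 640, 3), dtype=np.uint8)
b = np.ones((3, 480, 640, 3), dtype=np.uint8)

# Case 1: join two samples along the outer-most dimension.
joined = np.concatenate([a, b], axis=0)
print(joined.shape)  # (5, 480, 640, 3)

# Case 2: split back at index 2 along the outer-most dimension.
first, second = np.split(joined, [2], axis=0)
print(first.shape, second.shape)  # (2, 480, 640, 3) (3, 480, 640, 3)
```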

So how about naming this in the flavour of concatenate? E.g. cat_outer_dims or concat_outer_dims? I'd vote for the former, but it's just a personal preference; concat is also totally fine.

dali-automaton commented 1 year ago

CI MESSAGE: [6413229]: BUILD STARTED

dali-automaton commented 1 year ago

CI MESSAGE: [6413229]: BUILD FAILED

dali-automaton commented 1 year ago

CI MESSAGE: [6413700]: BUILD STARTED

dali-automaton commented 1 year ago

CI MESSAGE: [6413700]: BUILD FAILED

dali-automaton commented 1 year ago

CI MESSAGE: [6426023]: BUILD STARTED

dali-automaton commented 1 year ago

CI MESSAGE: [6427306]: BUILD STARTED

dali-automaton commented 1 year ago

CI MESSAGE: [6428145]: BUILD STARTED

dali-automaton commented 1 year ago

CI MESSAGE: [6428145]: BUILD FAILED

dali-automaton commented 1 year ago

CI MESSAGE: [6428720]: BUILD STARTED

dali-automaton commented 1 year ago

CI MESSAGE: [6429057]: BUILD STARTED

dali-automaton commented 1 year ago

CI MESSAGE: [6427306]: BUILD FAILED

dali-automaton commented 1 year ago

CI MESSAGE: [6428720]: BUILD FAILED

dali-automaton commented 1 year ago

CI MESSAGE: [6429057]: BUILD FAILED

dali-automaton commented 1 year ago

CI MESSAGE: [6440460]: BUILD STARTED

dali-automaton commented 1 year ago

CI MESSAGE: [6440460]: BUILD PASSED

dali-automaton commented 1 year ago

CI MESSAGE: [6442331]: BUILD STARTED

dali-automaton commented 1 year ago

CI MESSAGE: [6442398]: BUILD STARTED

dali-automaton commented 1 year ago

CI MESSAGE: [6442398]: BUILD PASSED

dali-automaton commented 1 year ago

CI MESSAGE: [6557874]: BUILD STARTED

dali-automaton commented 1 year ago

CI MESSAGE: [6557874]: BUILD FAILED

dali-automaton commented 1 year ago

CI MESSAGE: [6557874]: BUILD PASSED