Open Skier23 opened 3 months ago
Hey @Skier23, thanks for the question. Indeed, it's not possible to provide a file path to the file reader at runtime; it needs to be provided at the build stage.
Could you tell us more about your use case? Why don't you want to just send the files' data instead of file paths?
My use case is this: I want to process videos, extract a frame at a fixed interval (for example, every 0.5 s), and then feed all those frames to my image classification model. However, sending the video itself to Triton over gRPC would probably be a huge bottleneck, so I was thinking that sending just the file path and letting DALI read and process the video file itself would probably be the most efficient setup.
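To make the sampling step concrete, here is a minimal sketch of how "one frame every 0.5 s" could be turned into frame indices. It assumes a constant frame rate and a known duration; the function name `sample_frame_indices` and its parameters are hypothetical and not part of DALI or Triton.

```python
def sample_frame_indices(duration_s, fps, interval_s=0.5):
    """Return indices of the frames nearest to 0, interval_s, 2*interval_s, ...

    Assumes a constant frame rate; variable-frame-rate videos would need
    real timestamps instead of this arithmetic.
    """
    if fps <= 0 or interval_s <= 0:
        raise ValueError("fps and interval_s must be positive")

    total_frames = int(duration_s * fps)
    indices = []
    k = 0
    while True:
        idx = round(k * interval_s * fps)
        if idx >= total_frames:
            break
        # Skip repeats, which occur when interval_s is shorter than one frame.
        if not indices or idx != indices[-1]:
            indices.append(idx)
        k += 1
    return indices


# A 2-second clip at 30 fps, sampled every 0.5 s.
print(sample_frame_indices(2.0, 30.0))  # [0, 15, 30, 45]
```

Only the selected indices would then need to be decoded and passed to the classifier.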
@Skier23
Unfortunately it's not possible in DALI itself right now.
The only solution that comes to my mind is using another model to read the video files from disk. You can use a Python backend to run a script that reads the file (without decoding it) and returns its contents as output. That output would be passed through an ensemble to the DALI model, which can use the video input or the video decoder to decode the video file from memory.
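A rough sketch of what such a Python backend model might look like is below. The tensor names `FILE_PATH` and `ENCODED_VIDEO` are assumptions made for illustration and would have to match your own `config.pbtxt`; the sketch also assumes one path per request and a model configured without batching.

```python
import numpy as np


def read_encoded(path):
    """Return the file's raw, still-encoded bytes as a 1-D uint8 array."""
    return np.fromfile(path, dtype=np.uint8)


class TritonPythonModel:
    """Hypothetical file-reader model; tensor names are assumptions."""

    def execute(self, requests):
        # Imported here because this module only exists inside the
        # Triton Python backend runtime.
        import triton_python_backend_utils as pb_utils

        responses = []
        for request in requests:
            # Assumed input: a TYPE_STRING tensor holding a single file path.
            path_tensor = pb_utils.get_input_tensor_by_name(request, "FILE_PATH")
            raw = path_tensor.as_numpy().reshape(-1)[0]
            path = raw.decode("utf-8") if isinstance(raw, bytes) else str(raw)

            # Assumed output: the encoded file, to be decoded by the DALI model.
            encoded = pb_utils.Tensor("ENCODED_VIDEO", read_encoded(path))
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[encoded])
            )
        return responses
```

In the ensemble, the `ENCODED_VIDEO` output would be mapped to the input of the DALI model, so only the short path string travels over gRPC and the file contents stay on the server.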
Here we have an example of using the video decoder in DALI backend: https://github.com/triton-inference-server/dali_backend/tree/main/docs/examples/video_decoding
And the video input: https://github.com/triton-inference-server/dali_backend/tree/main/docs/examples/video_decode_remap
Video input can be used to process the video file part by part (generating multiple responses for a single video file).
That's an approach I was wondering about. With that approach, would reading the whole video in (keeping it encoded) and passing it to the DALI pipeline add more overhead? In other words, if I were to read a video directly from DALI and then select frames at various timestamps in the video, would DALI load the entire video into memory and then decode it, or only load the data at the selected timestamps?
I'm looking to do something like this:
However, this code has an error: The argument `files` for operator `File` should not be a `DataNode` but a str or list of str. This seems to be because fn.readers.file doesn't support a DataNode, which is what external_source returns. So in this case, how would I get the underlying list of strings that external_source contains to fn.readers.file so it can read in all those images?
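One possible workaround, in line with the suggestion earlier in the thread, is to have external_source deliver the encoded file contents rather than the paths, and decode them with fn.decoders.image instead of going through fn.readers.file. This is only a sketch and has not been confirmed in this thread. The helper names `load_encoded_batch` and `build_pipeline` are hypothetical, and the `"mixed"` device and `device_id=0` assume a GPU is available.

```python
import numpy as np


def load_encoded_batch(paths):
    """Read each file as raw encoded bytes: one 1-D uint8 array per file."""
    return [np.fromfile(p, dtype=np.uint8) for p in paths]


def build_pipeline(batches, batch_size):
    """Build a DALI pipeline that decodes pre-read encoded images.

    `batches` is an iterable whose items are batches, i.e. lists of
    uint8 arrays such as those returned by load_encoded_batch.
    """
    # Imported lazily so the loader above works without DALI installed.
    from nvidia.dali import fn, pipeline_def

    @pipeline_def(batch_size=batch_size, num_threads=2, device_id=0)
    def decode_pipe():
        # external_source yields encoded buffers, not file paths.
        encoded = fn.external_source(source=batches, batch=True)
        # "mixed" takes CPU input and returns decoded images on the GPU.
        return fn.decoders.image(encoded, device="mixed")

    return decode_pipe()
```

Usage would look roughly like `pipe = build_pipeline([load_encoded_batch(paths)], batch_size=len(paths))`, followed by `pipe.build()` and `pipe.run()`. Inside Triton, the file reading would instead happen in an upstream model, with the DALI model receiving the encoded bytes through its external_source input.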