NVIDIA / DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
Apache License 2.0

Decode raw h264 file and convert into mp4 #1352

Open raviy0807 opened 4 years ago

raviy0807 commented 4 years ago

I have a raw video file (let's say raw.h264). I would like to decode it and convert it into mp4.

I tried to use:

    ops.VideoReader(device="gpu", filenames=data, sequence_length=sequence_length,
                    shard_id=0, num_shards=1, random_shuffle=shuffle,
                    initial_fill=initial_prefetch_size, channels=3)

But it does not read the raw file. I thought of trying ExternalSource, but even if I read the file through an ExternalSource pipeline, DALI's decoder is built into VideoReader. I did not find any op that I can use to decode the data after importing the file with ExternalSource.

Could anyone help me find a way to decode a raw H.264 file? This operation is simple in ffmpeg, but I want to try it with DALI. The ffmpeg example:

    ffmpeg.input('raw.h264', format='h264').output('pipe:', format='rawvideo', pix_fmt='uyvy422')

raviy0807 commented 4 years ago

@JanuszL @awolant Any suggestion on this?

JanuszL commented 4 years ago

Hi, currently it is not possible to read anything other than containerized video streams. The video reader is built around the idea of randomly reading sequences of frames from a video container, and this doesn't align with your use case. Can you tell us more about your use case? Do you work with a stream of video data?

raviy0807 commented 4 years ago

Yes, I have a file that contains video frames compressed with H.264. The idea is to convert that raw video file into mp4 format, or just decode it. Think of a sequence of frames written to a file; the task is to convert that file into a meaningful stream.

Based on the examples, I tried to modify the code for my needs, but it seems that it is not possible. Code:

    import os
    from random import shuffle

    import numpy as np
    import nvidia.dali.ops as ops
    from nvidia.dali.pipeline import Pipeline

    class ExternalInputIterator(object):
        def __init__(self, batch_size):
            self.images_dir = "test_images/"
            self.batch_size = batch_size
            self.files = os.listdir(self.images_dir)
            shuffle(self.files)

        def __iter__(self):
            self.i = 0
            self.n = len(self.files)
            return self

        def __next__(self):
            batch = []
            for _ in range(self.batch_size):
                jpeg_filename = self.files[self.i]
                # read the whole file as a flat uint8 buffer
                with open(self.images_dir + jpeg_filename, 'rb') as f:
                    batch.append(np.frombuffer(f.read(), dtype=np.uint8))
                self.i = (self.i + 1) % self.n
            return batch

        next = __next__

    eii = ExternalInputIterator(batch_size)
    iterator = iter(eii)

    # The actual implementation of the input pipeline
    class ExternalSourcePipeline(Pipeline):
        def __init__(self, batch_size, num_threads, device_id):
            super(ExternalSourcePipeline, self).__init__(batch_size, num_threads, device_id, seed=12)
            self.input = ops.ExternalSource()
            self.decode = ops.VideoReader(device="gpu", sequence_length=10)

        def define_graph(self):
            self.jpegs = self.input()
            # attempt: pass the ExternalSource output to VideoReader (this is the part that fails)
            output = self.decode(name="Reader", filenames=self.jpegs)
            return output

        def iter_setup(self):
            images = iterator.next()
            self.feed_input(self.jpegs, images)

    pipe = ExternalSourcePipeline(batch_size=batch_size, num_threads=2, device_id=0)
    pipe.build()

JanuszL commented 4 years ago

Currently, DALI is mostly focused on deep learning applications. Your use case seems to be a good fit for ffmpeg-python (you can use FFmpeg with nvdec support enabled as well).
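
For readers who take the FFmpeg route suggested here, the following is a minimal sketch of wrapping a raw H.264 stream in an MP4 container without re-encoding. It calls the `ffmpeg` command-line tool through `subprocess` rather than ffmpeg-python. The function names, the file names and the 25 fps default are assumptions made for illustration: a raw elementary stream may carry no usable timing information, so the frame rate has to be supplied by the caller.

```python
import subprocess


def build_remux_command(src, dst, framerate=25):
    """Return an ffmpeg argv that copies a raw H.264 stream into an MP4 file.

    `framerate` is an assumption supplied by the caller; it is used to
    generate timestamps for the raw input.
    """
    return [
        "ffmpeg",
        "-y",                          # overwrite dst if it already exists
        "-f", "h264",                  # treat the input as a raw H.264 stream
        "-framerate", str(framerate),  # input frame rate (assumed, see above)
        "-i", src,
        "-c:v", "copy",                # copy the bitstream, no re-encode
        dst,
    ]


def remux(src, dst, framerate=25):
    """Run ffmpeg; raises subprocess.CalledProcessError if ffmpeg fails."""
    subprocess.run(build_remux_command(src, dst, framerate), check=True)


# Show the command that would be executed (hypothetical file names).
print(" ".join(build_remux_command("raw.h264", "out.mp4")))
# remux("raw.h264", "out.mp4")  # uncomment when ffmpeg is on PATH
```

Because the video stream is copied rather than decoded, this step is cheap; the resulting MP4 is a containerized stream, which is the kind of input the video reader is described as supporting earlier in this thread.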

raviy0807 commented 4 years ago

ffmpeg works well for this use case, but the idea was to see whether we can include DALI in our DL pipeline. DALI works for filtered or preprocessed images (existing datasets). However, it would be interesting to see whether we can make it work for real scenarios, where we receive streams in real time and feed them to our neural network.

Let me know if that could be a feature in the near future. Thanks for the answer!

JanuszL commented 4 years ago

OK, so your use case is to work with a real-time stream of data that you want to feed into the DL network (I guess this is about inference). As I said, it is not supported for now, but we are looking into it for the long term as well. Still, the timeline is not defined yet.
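
Until such support exists, one possible workaround is to let an external decoder (for example the ffmpeg `rawvideo` pipe output shown earlier in this thread) produce decoded frames and to cut that byte stream into fixed-size frames and batches on the Python side. The sketch below only illustrates that slicing step; `iter_frames` and `iter_batches` are hypothetical helpers, not DALI APIs, and the frame geometry must be known in advance (2 bytes per pixel for `uyvy422`, 3 for `rgb24`).

```python
import io


def iter_frames(stream, width, height, bytes_per_pixel):
    """Yield one decoded frame (as bytes) at a time from a raw-video stream."""
    frame_size = width * height * bytes_per_pixel
    while True:
        buf = b""
        # Pipes may return short reads, so keep reading until the frame
        # is complete or the stream ends.
        while len(buf) < frame_size:
            chunk = stream.read(frame_size - len(buf))
            if not chunk:
                break
            buf += chunk
        if len(buf) < frame_size:
            return  # end of stream; a trailing partial frame is dropped
        yield buf


def iter_batches(frames, batch_size):
    """Group frames into lists of `batch_size`; the last batch may be shorter."""
    batch = []
    for frame in frames:
        batch.append(frame)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


# Stand-in for a decoder pipe: five 2x2 frames at 3 bytes per pixel,
# followed by 5 stray bytes that do not form a full frame.
fake_pipe = io.BytesIO(bytes(range(60)) + b"\x00" * 5)
batches = list(iter_batches(iter_frames(fake_pipe, 2, 2, 3), batch_size=2))
print([len(b) for b in batches])  # prints [2, 2, 1]
```

In a real setup `stream` would be the stdout of the decoder process, and each batch could be converted to arrays and handed to a pipeline input in the same way `feed_input` is used in the code posted above.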

BlackPepperAPI commented 4 years ago

Definitely will vote for this. I have issues with ops.VideoReader similar to Issue #1041, where in the end I figured out that the way to read a variable-frame-rate .avi video is to convert it to mp4 and change it to a constant frame rate.

If we could add video processing like ffmpeg or ffmpeg-python to our DALI pipeline for deep learning use cases, which I have, that would be great!
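
The conversion described in this comment (AVI with a variable frame rate to MP4 with a constant frame rate) can be sketched with the `ffmpeg` command-line tool and its `fps` filter, which duplicates or drops frames to reach a constant rate. This is not necessarily the exact command the commenter used; the function name, file names and the 30 fps target are assumptions, and the encoder is left at ffmpeg's default for MP4 output.

```python
import subprocess


def build_cfr_command(src, dst, fps=30):
    """Return an ffmpeg argv that re-encodes `src` at a constant frame rate."""
    return [
        "ffmpeg",
        "-y",                 # overwrite dst if it already exists
        "-i", src,
        "-vf", f"fps={fps}",  # duplicate/drop frames to a constant rate
        dst,                  # container is inferred from the .mp4 extension
    ]


def convert_to_cfr(src, dst, fps=30):
    """Run ffmpeg; raises subprocess.CalledProcessError if ffmpeg fails."""
    subprocess.run(build_cfr_command(src, dst, fps), check=True)


# Show the command that would be executed (hypothetical file names).
print(" ".join(build_cfr_command("input.avi", "output.mp4")))
# convert_to_cfr("input.avi", "output.mp4")  # uncomment when ffmpeg is on PATH
```

Unlike a plain container change, this step re-encodes the video, so it costs time and some quality.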