NVIDIA / DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
Apache License 2.0

Experimental video reader unable to read mpeg4 files. #4818

Open bpleshakov opened 1 year ago

bpleshakov commented 1 year ago

Version

1.25

Describe the bug.

When using fn.experimental.readers.video with video files encoded as follows:

  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2mp41
    encoder         : Lavf58.29.100
  Duration: 00:01:00.03, start: 0.000000, bitrate: 342 kb/s
    Stream #0:0(und): Video: mpeg4 (Simple Profile) (mp4v / 0x7634706D), yuv420p, 455x256 [SAR 1:1 DAR 455:256], 208 kb/s, 30 fps, 30 tbr, 15360 tbn, 30 tbc (default)
    Metadata:
      handler_name    : VideoHandler

you get the following error:

RuntimeError: Critical error when building pipeline:
Error when constructing operator: experimental__readers__Video encountered:
[/opt/dali/dali/operators/reader/loader/video/frames_decoder.cc:148] Assert on "i < av_state_->ctx_->nb_streams" failed: Could not find a valid video stream in a file .../tiktok_dataset_old_backup/thepetcollective_6850180418818346246.mp4

I expect everything to work fine.

You can take files for testing from here

Minimum reproducible example

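from nvidia.dali import pipeline_def
import nvidia.dali.fn as fn

# Assumed values: the original report does not define these two names.
sequence_length = 16
batch_size = 2

@pipeline_def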
def video_pipe(filenames, sequence_length=sequence_length):
    side_size = 224
    crop_size = 224
    sampling_rate = 4
    random_shuffle = False
    device = 'gpu'

    video, labels = fn.experimental.readers.video(
        device='gpu', filenames=filenames, labels=list(range(len(filenames))), sequence_length=sequence_length,
        random_shuffle=random_shuffle,
        pad_last_batch=False,
        name='Reader', stride=sampling_rate
    )

    video = video / 255

    video = fn.resize(video, size=(side_size, side_size), mode='not_smaller', device='gpu')

    video = fn.crop(video, crop=(crop_size, crop_size),
                    crop_pos_x=0.5, crop_pos_y=0.5, device=device)

    return video, labels

data_folder = '/[path]/[to]/[data]' 
file_names = [
    f'{data_folder}/thepetcollective_6850180418818346246.mp4',
    f'{data_folder}/bigvarn_6925039213402426629.mp4',
    f'{data_folder}/kjbelll_6987115080580107525.mp4',
    f'{data_folder}/josemiguelsal_6840197131358194950.mp4',
    f'{data_folder}/tobyporterart_6838650165688028421.mp4'
]

pipe = video_pipe(batch_size=batch_size, num_threads=1, device_id=2, filenames=file_names)
pipe.build()

Relevant log output

No response

Other/Misc.

No response

Check for duplicates

JanuszL commented 1 year ago

Hi @bpleshakov,

This is one of the limitations of the experimental video decoder. There is a pending PR that is meant to enable mpeg4 support for this operator (https://github.com/NVIDIA/DALI/pull/4361); however, it is not there yet.
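
Until mpeg4 is supported, one possible way to keep the pipeline from failing at build time is to check each file's codec first and pass only the readable files to the reader. The following is a minimal sketch, not part of DALI: the helper names (parse_codec_name, probe_codec, split_by_codec) are hypothetical, ffprobe is assumed to be installed and on PATH, and the ASSUMED_SUPPORTED_CODECS set is an illustrative assumption that should be checked against the DALI documentation for the installed version.

```python
import subprocess

# Assumption: codecs the experimental reader can decode. This set is only a
# placeholder for illustration; verify it against the DALI docs for your version.
ASSUMED_SUPPORTED_CODECS = {"h264", "hevc"}


def parse_codec_name(ffprobe_stdout):
    """Return the codec name from ffprobe output, or '' if no video stream was reported."""
    lines = [line.strip() for line in ffprobe_stdout.splitlines() if line.strip()]
    return lines[0].lower() if lines else ""


def probe_codec(path):
    """Ask ffprobe for the codec name of the first video stream in `path`."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=codec_name",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True, check=True,
    )
    return parse_codec_name(result.stdout)


def split_by_codec(codecs_by_file, supported=ASSUMED_SUPPORTED_CODECS):
    """Split a {filename: codec} mapping into (readable, skipped) filename lists."""
    readable = [f for f, c in codecs_by_file.items() if c in supported]
    skipped = [f for f, c in codecs_by_file.items() if c not in supported]
    return readable, skipped
```

With the file list from the report this could be used as `readable, skipped = split_by_codec({f: probe_codec(f) for f in file_names})`, passing `readable` as `filenames` to the pipeline. The skipped files would then need to be re-encoded or read some other way.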