Closed Mmdixon closed 5 years ago
Hi, thanks for reporting the issue. I will double-check, but I think our current implementation does not support variable frame rates. I will get back to you on this.
Hi, could you tell us whether this reproduces on every video or only on selected ones? Can you provide a simple self-contained script and a video sample that reproduce the problem?
The video sample is a bit contrived, but it is an easy way to create a variable frame rate video with limited tools, and it should reproduce the issue. Create the video source:
#!/bin/bash
# Create a 3 second video at 60fps at constant frame rate.
# 180 frames, with a time delta of 1/60 s. Duration 3s.
ffmpeg -f lavfi -i color=c=blue:s=1280x720:d=3:r=60 \
-c:v libx264 \
-vf "format=pix_fmts=yuv420p, drawtext=fontsize=64: fontcolor=white: font=monospace: x=(w-text_w)/2: y=(h-text_h)/2: r=60: text='%{frame_num}'" \
cfr_test.mp4
# Transcode video to 25fps at variable frame rate.
# 180 frames, time deltas spaced between 1/30 s and 1/20 s. Duration ~7s.
ffmpeg -i cfr_test.mp4 -vsync vfr -vf setpts='N/(25*TB)' vfr_test.mp4
Run the Pipeline:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from nvidia.dali.pipeline import Pipeline
import nvidia.dali.ops as ops


class VideoPipe(Pipeline):
    def __init__(self, device_id, filenames, batch_size=2, sequence_length=60, num_threads=2):
        super().__init__(batch_size=batch_size, num_threads=num_threads, device_id=device_id)
        self.video = ops.VideoReader(device="gpu", filenames=filenames,
                                     sequence_length=sequence_length)

    def define_graph(self):
        output = self.video(name="Reader")
        return output


if __name__ == "__main__":
    device_id = 0
    filenames = ["vfr_test.mp4"]
    pipeline = VideoPipe(device_id, filenames)
    pipeline.build()
    result = pipeline.run()
    print(result)
Prints fine with cfr_test.mp4, hangs on vfr_test.mp4. ctrl+c doesn't interrupt, but the process can be terminated with ctrl+\
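Since the hang ignores ctrl+c, one way to keep a test driver responsive is to run the repro in a child process with a timeout. The following is only a sketch using the Python standard library; `run_with_timeout` is a hypothetical helper name, and the script arguments are whatever you would pass to the repro script above.

```python
import subprocess
import sys


def run_with_timeout(script_args, timeout_s):
    """Run a Python child process and report 'ok', 'failed', or 'hang'.

    script_args: arguments passed to the interpreter, e.g. ["repro.py"].
    timeout_s:   seconds to wait before treating the child as hung.
    """
    try:
        completed = subprocess.run(
            [sys.executable, *script_args],
            capture_output=True,
            text=True,
            # On expiry, subprocess.run kills the child and raises TimeoutExpired.
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "hang", ""
    status = "ok" if completed.returncode == 0 else "failed"
    return status, completed.stdout
```

For example, `run_with_timeout(["repro.py"], timeout_s=60)` (with `repro.py` being a placeholder name for the pipeline script) would return `("hang", "")` instead of blocking forever if the reader never returns.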
Thanks for the fast and good repro. I was able to reproduce this on my machine and it behaves exactly as you described. As I mentioned before, we currently do not support variable frame rates in our reader. @JanuszL pointed out the exact reason for the hang you are experiencing: we rely on a constant frame rate to correctly identify the frame, and with a variable frame rate we end up waiting indefinitely for the data (link). Could you formulate some more detailed requirements for what you would need? Or even better, are you willing to implement this yourself as a contribution to DALI? If you are, we are happy to provide some guidance. I think the place to start is the nvdecoder code. For now we will track this internally as DALI-951.
By requirements, @awolant means: what would you expect to get when the video is VFR? When you ask for the 2nd frame, do you want to get it no matter how much time passes between the 1st and 2nd, or should we assume that the video has some fixed frame rate and interpolate the 2nd frame from the real 1st and 2nd ones?
I guess my first expectation would be that the process does not hang. If VFR is not supported, then maybe throw an exception?
Given that a VideoReader pulls frames by sequence_length, I would expect it to read all the frames like your first suggestion and just ignore the temporal information.
The workaround for this is not so bad because you can edit the PTS/DTS timecodes of a video container quickly to get CFR without re-encoding the video and everything works.
What I would find more interesting with a VFR video is to pull frames by a time sequence, e.g. Get all the frames that occur within a 1 second interval. This would probably mean the batch tensor couldn't be dense (since different sequence lengths), but would be dense for the CFR case.
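To make the time-window idea concrete, here is a minimal sketch in plain Python. It assumes the per-frame presentation timestamps (in seconds) are already available as a sorted list; `frames_in_windows` is a hypothetical name and not part of DALI.

```python
def frames_in_windows(timestamps, window_s=1.0):
    """Group frame indices by the time window their timestamp falls in.

    timestamps: presentation times in seconds, one per frame.
    Returns one list of frame indices per window, starting at t = 0.
    Windows without frames are kept as empty lists.
    """
    windows = {}
    for index, ts in enumerate(timestamps):
        windows.setdefault(int(ts // window_s), []).append(index)
    if not windows:
        return []
    return [windows.get(k, []) for k in range(max(windows) + 1)]


# VFR input: the windows hold 3, 2 and 1 frames, so a batch would be ragged.
vfr = frames_in_windows([0.0, 0.4, 0.9, 1.0, 1.5, 2.2])
# CFR input at 4 fps: every window holds exactly 4 frames, so a batch is dense.
cfr = frames_in_windows([i / 4 for i in range(8)])
```

The ragged result for the VFR input is the reason the batch tensor could not be dense in this mode.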
Or it could keep the fixed sequence_length and use time-based sub-sampling, e.g. grab 60 frames 1/60 s apart, with duplicate/decimate behavior to meet these requirements. So if you have a really long-duration frame (longer than the sub-sample interval), the video reader will keep sampling it, and if you have multiple really short frame durations (shorter than the sub-sample interval), then some of those frames will be skipped. This would be ideal in the context of recurrent networks (like a CNN feeding into an LSTM), as most of these architectures only work with fixed time intervals/deltas, and you won't be ignoring the temporal information. The workaround for this case requires re-encoding the video at CFR to essentially bake in the variable time differences between frames. This takes longer (although NVENC does give a nice speedup) and there is worry about lossy [re]transcoding; plus the fact that features extracted by convolutions are very sensitive to compression artifacts.
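The duplicate/decimate behavior can be sketched as follows, assuming sorted per-frame timestamps in seconds are available. `sample_fixed_rate` is a hypothetical helper, not a DALI API; it only picks frame indices and does no decoding.

```python
from bisect import bisect_right


def sample_fixed_rate(timestamps, num_samples, interval_s, start_s=0.0):
    """Pick, for each sample time, the frame that is on screen at that time.

    timestamps: sorted presentation times in seconds, one per frame.
    Long frames are picked repeatedly (duplicate); short frames that fall
    between two sample times are never picked (decimate).
    """
    indices = []
    for k in range(num_samples):
        t = start_s + k * interval_s
        # Index of the last frame whose timestamp is <= t.
        i = bisect_right(timestamps, t) - 1
        # Before the first timestamp, fall back to the first frame.
        indices.append(max(i, 0))
    return indices


# Frame 2 (at 0.625 s) is skipped; frame 3 (0.75 s to 2.0 s) is sampled twice.
picked = sample_fixed_rate([0.0, 0.5, 0.625, 0.75, 2.0], num_samples=5, interval_s=0.5)
```

The output always has exactly `num_samples` entries, so the sequence length stays fixed regardless of how irregular the source timestamps are.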
I was thinking of VFR interpolation as picking a floor/ceiling/nearest frame strategy. But, as in your second suggestion, it might be interesting for neural networks to interpolate as a mix of pixels between the real frames by some strategy (linear, cubic, sigmoid, etc.), as that would have better differentiability properties.
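As an illustration of the linear variant only, the sketch below blends the two real frames that bracket a requested time. Frames are represented as flat lists of pixel values to keep it dependency-free, timestamps are assumed to be strictly increasing, and `blend_at` is a hypothetical name.

```python
from bisect import bisect_right


def blend_at(timestamps, frames, t):
    """Linearly blend the two real frames that bracket time t.

    timestamps: strictly increasing presentation times in seconds.
    frames:     one flat list of pixel values per timestamp.
    Outside the covered time range the nearest real frame is returned.
    """
    hi = bisect_right(timestamps, t)
    if hi == 0:
        return list(frames[0])
    if hi == len(timestamps):
        return list(frames[-1])
    lo = hi - 1
    # Weight of the later frame: 0 at timestamps[lo], approaching 1 at timestamps[hi].
    w = (t - timestamps[lo]) / (timestamps[hi] - timestamps[lo])
    return [(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])]


times = [0.0, 1.0, 3.0]
pixels = [[0.0, 0.0], [10.0, 20.0], [30.0, 40.0]]
halfway = blend_at(times, pixels, 0.5)
```

A floor strategy would correspond to always returning `frames[lo]`, and a ceiling strategy to `frames[hi]`.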
A proper error message was added and merged today in #1067. You can check it out in the next DALI nightly build. Closing the issue for now.
Hi @awolant, I'm still experiencing the hang issue. It happens when I read .avi files. I'm using this nvidia-dali version:
Version: 0.17.0
Summary: NVIDIA DALI for CUDA 9.0. Git SHA: e61c304d9f5560fff1be5c821ee140cdab104aef
The process hangs indefinitely unless it is killed. It gives no error message; it just hangs forever.
This is my code below. I defined my VideoPipeline as follows:
from nvidia.dali.pipeline import Pipeline
import nvidia.dali.ops as ops
import nvidia.dali.types as types


class VideoReaderPipeline(Pipeline):
    def __init__(self, filenames, batch_size, sequence_length, crop_size,
                 num_threads, device_id, output_layout=types.NCHW,
                 random_shuffle=True, step=-1, seed=42):
        super().__init__(batch_size, num_threads, device_id, seed=seed)
        # Define video reader
        self.reader = ops.VideoReader(device = "gpu",
                                      filenames = filenames,
                                      sequence_length = sequence_length,
                                      normalized = False,
                                      random_shuffle = random_shuffle,
                                      image_type = types.RGB,
                                      dtype = types.UINT8,
                                      step = step,
                                      initial_fill = 16)
        # Crop the decoded frames and convert them to float
        self.cropnorm = ops.CropMirrorNormalize(device = "gpu",
                                                seed = seed,
                                                crop = crop_size,
                                                output_dtype = types.FLOAT,
                                                output_layout = types.NFCHW)

    def define_graph(self):
        """ Definition of graph-event that defines flow of video pipeline
        """
        input_vid = self.reader(name = "Reader")
        output_vid = self.cropnorm(input_vid)
        return output_vid
Then I run the above pipeline with a simple script:
import os

import nvidia
import nvidia.dali.ops as ops
import nvidia.dali.types as types
from nvidia.dali.pipeline import Pipeline
from nvidia.dali.plugin.pytorch import DALIGenericIterator
from torchaction.dataloader import VideoReaderPipeline


def main():
    filenames = ["data/HMDB51/raw/temp/video_one.avi",]
    if not os.path.isfile(filenames[0]):
        raise FileNotFoundError("wrong path: %s" % filenames)
    # Define video reader pipeline
    pipeline = VideoReaderPipeline(filenames,
                                   batch_size = 2,
                                   sequence_length = 16,
                                   crop_size = (224,224),
                                   num_threads = 2,
                                   device_id = 0,
                                   random_shuffle = True,
                                   step = 2)
    # Build pipeline
    pipeline.build()
    print("Building pipeline...")
    for k in range(10):
        print("Running pipeline")
        pipeout = pipeline.run()
        sequence_out = pipeout[0].as_cpu().as_array()  # [batch_size, sequence_length, channel, height, width]
        print("Result:", sequence_out.shape)  # printing result


if __name__ == "__main__":
    main()
video_one.avi, video_four.avi
video_two.avi, video_three.avi
1. Passing filenames = "video_one.avi" or "video_four.avi" to VideoReaderPipeline works and printed "Result..."
2. Passing filenames = "video_two.avi" to VideoReaderPipeline hangs indefinitely
3. filenames = ["video_one.avi", "video_four.avi"] hangs indefinitely somehow
4. filenames = ["video_one.avi", "video_one.avi"] also hangs indefinitely somehow
@awolant do you mind investigating the issue here? Or I can open a different issue if that's more convenient.
I can also upload the four .avi videos if you need them to reproduce this. They were taken from the HMDB51 dataset.
1. Passing filenames = <any of those videos> to VideoReaderPipeline hangs indefinitely.
2. For video_one.avi and video_four.avi, it first printed "Results.." for the first 10 iterations of pipeline.run() (so the first couple of sequences), but as we increase the number of iterations to, say, 15 or 20, it hangs again. (updated)
3. filenames = [<two or more of those videos>] also hangs indefinitely.
@BlackPepperAPI - yes, please upload the files so we can run the repro locally.
@a-sansanwal - FYI
Hi @JanuszL @a-sansanwal
Here are the videos that I used for the above observation. Please cross-check and refer to my updated observation. I really appreciate you taking the time for the tedious work of testing with me.
Inside videos.zip are four (4) videos in .avi format:
video_one.avi
video_two.avi
video_three.avi
video_four.avi
Attachment: videos.zip
It turns out all of those .avi videos result in an indefinite hang. video_one.avi hangs if I increase the iteration number. The detailed explanation:
1. Passing filenames = "<any of those video paths>" to VideoReaderPipeline hangs indefinitely.
2. For filenames = "video_one.avi" or "video_four.avi", it first printed Results: np.array([N, F, C, H, W]) for the first 10 iterations of pipeline.run() (so the first 10 batches of sequences), but as we increase the number of iterations to, say, 15 or 20, it hangs indefinitely. Perhaps the video closed? (updated)
3. filenames = ["<two or more of those videos>",] also hangs indefinitely in the very first iteration. So unlike the hang in number 2, it didn't even print Results: .... before it hangs. When I checked, it actually contains a video, at 25 fps, lasting several seconds. I still use the same batch_size and sequence length, FYI.
I suspect that (I'm not an expert, so take it with a pinch of salt):
- .avi might not be compatible or is bad for data loading
In summary, the workaround kind of solves the hang issue, but I ran into another issue at point 4:
1. ffmpeg command in Ubuntu 18.04 to convert those nasty .avi videos to mp4, while also converting the frame rate to be constant with -r 30, following a similar ffmpeg command here.
2. .avi as I run ffmpeg results in a video that still doesn't work when loaded with NVIDIA DALI ops.VideoReader.
3. filenames = [<mp4 videos>]. Then I proceeded to use plugins.pytorch.DALIGenericIterator and modified the for-loop a bit. It also read all the frames successfully, printing Results: torch.Size[N, F, C, H, W], and finished the for-loop.
4. .mp4 dismisses this hang issue, but I got another issue, which I commented on at Issue #1637 (https://github.com/NVIDIA/DALI/issues/1637), when I use the file_root or file_list argument for ops.VideoReader.
I used the nightly build for version 0.17.0 as suggested for the above. Thanks!
Hi @BlackPepperAPI, the videos you posted have packed B-frames, which is an ugly hack used in AVI containers. A workaround for this was already added in DALI.
Also, according to ffprobe, all the videos you posted have an issue where they don't start from timestamp 0. DALI waits for the 1st frame (timestamp 0.03333), which is not present, causing the hang.
ffprobe -show_frames video_one.avi | grep best_effort_timestamp | less
best_effort_timestamp=2
best_effort_timestamp_time=0.066667
best_effort_timestamp=3
best_effort_timestamp_time=0.100000
best_effort_timestamp=4
best_effort_timestamp_time=0.133333
best_effort_timestamp=5
best_effort_timestamp_time=0.166667
best_effort_timestamp=6
best_effort_timestamp_time=0.200000
best_effort_timestamp=7
best_effort_timestamp_time=0.233333
best_effort_timestamp=8
best_effort_timestamp_time=0.266667
best_effort_timestamp=9
best_effort_timestamp_time=0.300000
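The same check can be scripted. The sketch below parses text in the format shown above and reports whether the stream starts at timestamp 0 and whether the frame deltas are constant. The function names and the tolerance are illustrative assumptions, and dividing the first timestamp by the frame delta is only a rough way to express the offset in frames, not how DALI computes it.

```python
def parse_timestamps(ffprobe_text):
    """Collect best_effort_timestamp_time values from `ffprobe -show_frames` text."""
    times = []
    for line in ffprobe_text.splitlines():
        key, _, value = line.strip().partition("=")
        if key != "best_effort_timestamp_time":
            continue
        try:
            times.append(float(value))
        except ValueError:
            # Skip entries without a usable numeric timestamp.
            continue
    return times


def summarize(times, tolerance=1e-4):
    """Report the first timestamp and whether frame deltas are constant."""
    if not times:
        raise ValueError("no timestamps found")
    deltas = [b - a for a, b in zip(times, times[1:])]
    constant = all(abs(d - deltas[0]) <= tolerance for d in deltas)
    first_index = None
    if deltas and constant and deltas[0] > 0:
        # Offset of the first frame, expressed in units of the frame delta.
        first_index = round(times[0] / deltas[0])
    return {
        "first_timestamp": times[0],
        "starts_at_zero": abs(times[0]) <= tolerance,
        "constant_delta": constant,
        "first_index": first_index,
    }


sample = """best_effort_timestamp=2
best_effort_timestamp_time=0.066667
best_effort_timestamp=3
best_effort_timestamp_time=0.100000
best_effort_timestamp=4
best_effort_timestamp_time=0.133333
"""
report = summarize(parse_timestamps(sample))
```

For the sample lines, the report flags a constant frame delta with a stream that does not start at zero, which matches the diagnosis above.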
@BlackPepperAPI Is the dataset huge? If it's possible, I would suggest remuxing the videos with correct timestamps or into an mp4 container, as you said you have already tried. Otherwise, I could suggest a way of getting it to run on your dataset with this one-line hack. I will also think of ways to fix this without hacks.
diff --git a/dali/operators/reader/loader/video_loader.h b/dali/operators/reader/loader/video_loader.h
index 1ed7113b..68451575 100644
--- a/dali/operators/reader/loader/video_loader.h
+++ b/dali/operators/reader/loader/video_loader.h
@@ -202,7 +202,7 @@ class VideoLoader : public Loader<GPUBackend, SequenceWrapper> {
const auto stream = file.fmt_ctx_->streams[file.vid_stream_idx_];
int frame_count = file.frame_count_;
- int start_frame = 0;
+ int start_frame = 2;
int end_frame = file.frame_count_;
float start = file_info_[i].start_time;
float end = file_info_[i].end_time;
From my observation, video files that are VFR: video_two.avi, video_three.avi
Also I noticed that none of the videos you uploaded were vfr.
Given a video with variable frame rate, i.e.
ffmpeg -i vfr_test.mp4 -vf vfrdet -f null -
reports non-zero, and a Pipeline with only a VideoReader, calling pipeline.run() hangs indefinitely and I end up having to kill the python process (it doesn't respond to SIGINT). The same video with the times adjusted to constant frame rate works fine.