Closed — italosalgado14 closed this issue 2 years ago
Edit: I used the following code to run inference.
```python
import torch
import sys

print("RVM will run on: " + sys.argv[1])

model = torch.hub.load("PeterL1n/RobustVideoMatting", "mobilenetv3").cuda()  # or "resnet50"
convert_video = torch.hub.load("PeterL1n/RobustVideoMatting", "converter")

convert_video(
    model,                         # The loaded model, can be on any device (cpu or cuda).
    input_source=sys.argv[1],      # A video file or an image sequence directory.
    downsample_ratio=None,         # [Optional] If None, make downsampled max size be 512px.
    output_type='video',           # Choose "video" or "png_sequence".
    output_composition='com.mp4',  # File path if video; directory path if png sequence.
    output_alpha="pha.mp4",        # [Optional] Output the raw alpha prediction.
    output_foreground="fgr.mp4",   # [Optional] Output the raw foreground prediction.
    output_video_mbps=4,           # Output video mbps. Not needed for png sequence.
    seq_chunk=12,                  # Process n frames at once for better parallelism.
    num_workers=1,                 # Only for image sequence input. Reader threads.
    progress=True                  # Print conversion progress.
)
```
They are not aligned because of a bug in PyAV: the output video may have a different fps than the original. For example, if your original video is 29.97 fps, the output has to be rounded to 30 fps, so the two streams drift apart when played back directly. But if they have the same number of frames, you can still apply the mask frame by frame.
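To see why the rounding matters, here is a minimal sketch (an illustration, not part of RVM) that computes how far the same frame's timestamp drifts when a 29.97 fps source is re-timed at an even 30 fps:

```python
# Timestamp drift between a 29.97 fps source and a 30 fps output:
# frame n plays at n/29.97 s in the source but n/30 s in the output.
src_fps = 29.97  # original video frame rate
out_fps = 30.0   # rounded output frame rate

def drift_seconds(frame_index: int) -> float:
    """Difference between the same frame's timestamps in both videos."""
    return frame_index * (1 / src_fps - 1 / out_fps)

for minutes in (1, 10, 30):
    n = round(minutes * 60 * src_fps)  # frames of source footage
    print(f"{minutes:2d} min ({n} frames): drift = {drift_seconds(n):.3f} s")
```

The drift is only about 0.06 s after one minute, but grows linearly and reaches roughly 1.8 s after half an hour, which is why direct playback looks out of sync even though the frame counts match.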
Thanks for the reply! Indeed my video is close to 30 fps (29.4 fps), which could cause the error you mention. But I get the same number of frames in the mask as in my original video (extracting the frames with Python). I will preprocess the video to bring it to 30 fps and will comment here on the solution in case it helps someone else.
Everything is OK! I ran an ffmpeg command on the original video to set the frame rate to 30 fps (an integer, not fractional) and it works. Thanks!
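The exact command isn't shown above; a sketch of the preprocessing step could look like this, using ffmpeg's `fps` filter (`input.mp4` and `output30.mp4` are placeholder filenames):

```shell
# Re-encode the source at an integer 30 fps before running RVM,
# copying the audio stream unchanged.
ffmpeg -i input.mp4 -filter:v fps=30 -c:a copy output30.mp4
```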
Hello! Thanks for the code. I have had some timing issues between the inferred mask output and the original video. I made this comparison by extracting frames from both my original video and the mask output video. I obtained the same number of frames in both, so the difference may be caused by a misconfiguration on my part. My original video is 30 fps and 1080x1920. I would appreciate any suggestions.