0x09 / resdet

Detect source resolution of upscaled images
GNU Lesser General Public License v2.1

Can this be ported as an FFmpeg filter, similar to cropdetect? #6

Open Brainiarc7 opened 5 years ago

Brainiarc7 commented 5 years ago

Hello there,

Can this be ported as an FFmpeg filter (dependent on the library implementation libresdet) to offer functionality for resolution up-scaling detection?

FFmpeg has a filter named cropdetect that can automatically detect crop size and print out recommended parameters to the logging system. Perhaps such an example would be a good start.

0x09 commented 5 years ago

This is a neat idea, but I'd like to know a bit more about the use case. At least with the sort of content I'm familiar with, upscaling is usually applied the same way to the entire video, so a frame-by-frame analysis isn't necessarily helpful (though the same might be said for cropping). What would this offer in vf form vs. something like the following?

ffmpeg ... -vframes x -f yuv4mpegpipe - | resdet -t video/yuv4mpeg -

Brainiarc7 commented 5 years ago

@0x09 a good example would be a video filter capable of detecting upscaled video. Much as the framemd5 and hash muxers can be used to perform quality checks without needing complete binary comparisons, such a filter could log whether a video has been upscaled. A typical application would be short, frame-based analysis of encoded source material to confirm the presence of upscaling as an artifact.

0x09 commented 5 years ago

@Brainiarc7 it wasn't too hard to add in some support for this.

https://gist.github.com/0x09/5417ddeb1c80ad52acb69691aca3353a

On the resdet side: ./configure --disable-everything && make && make install-lib

Then with this patch applied to ffmpeg, running ./configure --enable-libresdet and building will add a resdetect filter.

For example:

ffmpeg -i Lenna.png -vf scale=1024:1024,resdetect -f rawvideo -y /dev/null

Outputs:

[Parsed_resdetect_1 @ 0x7f828ae00100] w: 512, h: 512

The filter takes the same method, range and threshold arguments as the CLI, exposed as filter options. For now it just prints the top result per frame, but there's room for improvement in the output, and quite a bit could be done to make this kind of use more efficient, which is why I'm posting just a gist patch for now.
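Because the filter logs one result per frame, a short script can summarize a whole run by tallying the logged resolutions. The following is a minimal sketch, not part of the patch: it assumes every log line follows the format of the single example line quoted in this thread, and the function name most_common_resolution is made up for illustration.

```python
import re
from collections import Counter

# Pattern assumed from the one example line in this thread:
#   [Parsed_resdetect_1 @ 0x7f828ae00100] w: 512, h: 512
RESDETECT_LINE = re.compile(
    r"\[Parsed_resdetect_\d+ @ [^\]]+\]\s*w:\s*(\d+),\s*h:\s*(\d+)"
)

def most_common_resolution(log_text):
    """Return ((width, height), count) for the most frequent result, or None."""
    counts = Counter(
        (int(m.group(1)), int(m.group(2)))
        for m in RESDETECT_LINE.finditer(log_text)
    )
    if not counts:
        return None  # no resdetect lines found in the log
    return counts.most_common(1)[0]
```

ffmpeg writes its log to stderr, so the input for this function would typically come from redirecting stderr to a file (2> log.txt) and reading that file.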

Edit: this first attempt just blindly passed the frame data into resdet as-is, which is quick but won't work for frames with extra horizontal padding. That is easy enough to add support for, since the detection methods can already handle that sort of thing internally. Updated to handle padded data in the less efficient way (copying into a non-padded buffer), but padded frames seem uncommon.
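To illustrate what copying into a non-padded buffer involves, here is a rough sketch of the idea. It is not the code from the gist: it assumes a single 8-bit plane stored row by row, where each row occupies stride bytes but only the first width bytes are image data (in FFmpeg terms the stride would correspond to the frame's linesize). The name pack_rows is hypothetical.

```python
def pack_rows(data, width, height, stride):
    """Copy `height` rows of `width` bytes from a buffer whose rows start
    `stride` bytes apart into a tightly packed buffer."""
    if stride < width:
        raise ValueError("stride must be at least width")
    if height > 0 and len(data) < stride * (height - 1) + width:
        raise ValueError("buffer too small for the given dimensions")
    packed = bytearray(width * height)
    for y in range(height):
        src = y * stride  # start of row y in the padded buffer
        packed[y * width:(y + 1) * width] = data[src:src + width]
    return bytes(packed)
```

When stride equals width the copy is redundant, which is presumably why passing the frame data through unchanged works in the common case.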