amiaopensource / vrecord

Vrecord is open-source software for capturing a video signal and turning it into a digital file.
https://github.com/amiaopensource/vrecord

Feature Request - Add 4:4:4 as a chroma subsampling option #790

Open marshalleq opened 4 months ago

marshalleq commented 4 months ago

Given this is an archive suite, I was surprised to see no option for a lossless colour capture.

Looking at part of the ffmpeg command generated (pasted below), it captures with `-raw_format yuv422p10`.

I am aware that analog sources also have a form of subsampling, by limiting signal bandwidth; my concern here, though, is not what the source has, but generation loss from using 4:2:2. The original signal already has reduced chroma, then we reduce it some more using 4:2:2, then we edit it and reduce it some more, then export it again, reducing it again, etc. The right answer is to limit this reduction at the start, as it's captured, not later on down the track.

I think there is a widespread misunderstanding that chroma subsampling is akin to one of those children's toys where you put the right letters into the right holes, and therefore if we allow 4:2:2 in the container it will provide enough holes for the analog chroma to fit into — which is dead wrong. Especially with low-resolution video (and analog consumer low-resolution video at that), which already exhibits chroma bleed and drift around edges, 4:2:2 just makes that worse by being a second-generation subsample, particularly around those already problematic edges. One point of confusion worth calling out: the visible loss from 4:2:2 in 4K, 8K or even 1080p video is a lot less than with analog 480i/576i, simply due to the effective size of a pixel in the analog formats, and that's not even taking into account that it's interlaced.
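To make the generation-loss concern concrete, here is a toy sketch in Python (my own illustration, not vrecord code), assuming a naive pair-averaging downsample and nearest-neighbour upsample; real capture chains use better filters than this:

```python
# Toy model of 4:2:2 chroma resampling along one scanline: halve the chroma
# horizontally by averaging sample pairs, then restore width by duplication.
# This only illustrates where loss can occur, not any real device's filter.

def down_422(chroma):
    """Halve horizontal chroma resolution by averaging adjacent pairs."""
    return [(chroma[i] + chroma[i + 1]) / 2 for i in range(0, len(chroma), 2)]

def up_422(chroma):
    """Restore full width by sample duplication (nearest neighbour)."""
    return [c for c in chroma for _ in range(2)]

# A hard chroma edge: 8 samples of one colour, then 8 of another.
edge = [0] * 8 + [100] * 8
once = up_422(down_422(edge))
print(once)  # this edge survives because it falls exactly on a pair boundary

# Shift the edge by one sample so it straddles a pair: now it smears.
shifted = [0] * 7 + [100] * 9
print(up_422(down_422(shifted)))  # a 50-valued pair appears at the edge

# With this particular simple filter, a second pass changes nothing further:
twice = up_422(down_422(once))
assert twice == once
```

Note that whether repeated passes keep accumulating loss depends entirely on the resampling filters and alignment involved; the first pass is where the edge detail goes.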

Screenshot 2024-02-18 at 16 41 50

I note two versions of Prores 4:4:4 are provided in blackmagic media express, which can provide an alternative in the interim, though I'd much rather use ffv1.

More info on subsampling: https://en.wikipedia.org/wiki/Chroma_subsampling

I don't think it would be hard to add this simple option into the mix right?

Thanks.

Capture-Record command:

```sh
/opt/homebrew/opt/ffmpegdecklink/bin/ffmpeg-dl -nostdin -nostats \
  -timecode_format none -loglevel info \
  -f decklink -draw_bars 0 -audio_input embedded -video_input sdi \
  -format_code pal -channels 8 -audio_depth 32 -raw_format yuv422p10 \
  -i 'UltraStudio Recorder 3G' \
  -map_metadata 0:s:v:0 \
  -color_primaries bt470bg -color_trc bt709 -colorspace bt470bg -color_range mpeg \
  -metadata creation_time=now -movflags write_colr \
  -c:v ffv1 -level 3 -g 1 -slices 4 -slicecrc 1 \
  -c:a pcm_s24le \
  -metadata:s:v:0 'encoder=FFV1 version 3' \
  -filter_complex '[0:v:0]setdar=4/3;[0:a:0]pan=stereo|c0=c0|c1=c1[stereo1]' \
  -map '[stereo1]' -f mov /Users/username/Tape/videofile_ffv1.mov \
  -an -f framemd5 /Users/username/Tape/videofile_ffv1.framemd5 \
  -c copy -c:a pcm_s24le -map 0 -f matroska -write_crc32 0 -live true -
```

harrypm commented 4 months ago

Uncompressed 4:2:2 (V210) is what SD-SDI is sampled at, and it's lossless up to the full composite video range.

But the thing with ProRes is: yes, 4444 XQ is visually lossless, but it is not losslessly compressed, and that's down to its encoding and bitrate, not the 4:4:4. This is why V210 straight to FFV1 is the only sane option considered for standard video archival today, outside of raw CVBS or direct FM RF archives, which remove the whole "what digital format" argument entirely; it becomes a post problem.

Only when dropping to lossy codecs such as ProRes, or to 4:2:0 / 4:1:1 chroma subsampling, do you see real-world loss of any data. If you stay V210/FFV1 throughout it's a non-issue, and both are already supported by vrecord.

4:4:4 only makes sense for Hi-Vision/MUSE/HDVS and/or HDCAM 4:4:4 media, not standard SD, where you're just burning more space.

marshalleq commented 4 months ago

So this is the kids' toy analogy above, right? I don't see how 4:2:2 can ever be lossless when by design it reduces colour information by 50% EACH PASS, AND INCLUDING THE FIRST CAPTURE — which is far more visible in low-resolution files, e.g. on contrasted edges in VHS. For hi-res I would agree with you. The right way to do it would be to have a 4:4:4 master with a 4:2:2 export when finished. Besides that, I don't see why we need to argue over a feature request; it's not like it would be hard to implement. One day I will use your nice RF system, but I have already bought all this equipment and only have the use of one arm for a few months, so it's not going to happen for a while, sadly.

As a bit of a test for myself, I took a first capture and then 2nd, 3rd and 4th generations of a short clip from an analog source, saving each as FFV1 (since you're saying that's the bees' knees) at 4:2:2. The pic shows clips 1 and 4. You can see the loss quite easily. Not that we really need this for a feature request, but it's an interesting addition.

Screenshot 2024-02-19 at 08 53 35

The colours have become a lot more muted, the edges have blurred and blended, and colour has expanded to where it didn't even exist before. Exactly what the theory says will happen. This is 576i (PAL), so I expect it to be worse in 480i (NTSC). Overall quite a loss.

For completeness here is the same exercise using prores 4:4:4

Screenshot 2024-02-19 at 09 17 19

A lot less of the symptoms seen above.

The exports were done using default settings in Davinci Resolve other than selecting the codec and colourspace settings.

I know which of the two I prefer, but perhaps you'll see it differently. I like a good friendly learning exercise like this. :)

harrypm commented 4 months ago

@marshalleq

The examples are interesting, although unless it's a fixed, repeatable source, and the hardware chain and raw files are provided, it's not particularly a valid test in terms of analysis; the exact encode configs have to be stated.

The TBC error on the bottom half looks the same, but the framing is different.

Also, converting V210 to FFV1 or ProRes should only ever be done in FFmpeg directly, never via Resolve's limited implementation; that also avoids any possible processing, like its poor deinterlacer (I always use QTGMC before footage hits Resolve for actual editing). The pictures on the right look like they have been deinterlaced by Resolve.

dericed commented 4 months ago

Hi @marshalleq,

I don't see how 4:2:2 can ever be lossless when by design it is reducing colour information by 50% EACH PASS AND INCLUDING THE FIRST CAPTURE

4:2:2 doesn't imply that half the colour detail is sampled from the source, only that the colour is sampled (horizontally) at half the rate of the luma. So with our 720x486 captures for NTSC in vrecord, you get a 720x486 image for the Y channel and two 360x486 images for the Cb and Cr channels. The width of those images (720 for Y and 360 for Cb and Cr) is generally set by the Nyquist theorem applied to the resolution of a single line of Y and Cb/Cr data. With many analog videotape formats, the area of tape surface allocated to the Y channel is double that of the Cb or Cr channel, so selecting 720 for Y and 360 for Cb and Cr (which is 4:2:2) keeps the sampling of the data consistent with what's on the tape. For some formats, such as VHS, the storage of the colour channels on the tape is actually much less than half the storage of the luma channel, so for a format like VHS a sampling of 4:1:1 would be closer to consistent with the data on the tape.
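The plane geometry described above can be sketched from the J:a:b notation (a hypothetical helper, just to show the arithmetic behind the 720/360 numbers):

```python
# Per-plane dimensions implied by J:a:b chroma subsampling notation:
# 'a' chroma samples per J luma samples on even rows, 'b' on odd rows.
# Horizontal chroma width = width * a / J; vertical halves when b == 0.

def plane_sizes(width, height, j, a, b):
    luma = (width, height)
    chroma_w = width * a // j
    chroma_h = height if b else height // 2
    return luma, (chroma_w, chroma_h)

# The NTSC capture raster from the comment above:
print(plane_sizes(720, 486, 4, 2, 2))  # Y 720x486, Cb/Cr 360x486
print(plane_sizes(720, 486, 4, 1, 1))  # Y 720x486, Cb/Cr 180x486
print(plane_sizes(720, 486, 4, 2, 0))  # Y 720x486, Cb/Cr 360x243
```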

I suppose, in regard to your suggestion, that if the Y channel on the tape is double the Cb channel and double the Cr channel, then 4:2:2 is the most authentic way to represent that analog data. But if you wanted to increase the resolution (if the application of Nyquist there were questioned), then I'd opt for keeping 4:2:2 and increasing the sample rate, so for instance making a 1440x486 image with a 1440x486 Y plane and 720x486 for each of the Cb and Cr planes.

Still, despite a desire to have more control over sampling, at least with the Blackmagic card that vrecord is using, these parameters are fixed by the hardware and its API. Take a look at https://github.com/amiaopensource/SoyDecklink/blob/9afa86f52fe656a28a0a70a2e2311044d1ebbed4/DecklinkSdk/Mac/include/DeckLinkAPI.h#L157-L158 which shows which chroma subsamplings are defined in the API: it's just YUV 4:2:2 or RGB 4:4:4, and that's it.

ProRes is a lossy codec, so applying it 100x to a source will create a lot of generation loss. If you did the same test with FFV1 or V210 you'd get lossless results (except that V210 encoders often reserve a few of the highest and lowest values and don't use them).
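That V210 caveat can be sketched like this (assuming the encoder handles it by clipping into the 10-bit SDI legal range, where codes 0-3 and 1020-1023 are reserved for sync; a given encoder may do something slightly different):

```python
# Sketch of why v210 round trips are "lossless except at the extremes":
# SDI-style 10-bit transport reserves codes 0-3 and 1020-1023 for timing
# reference signals, so some encoders clip samples into the 4..1019 range.

def clip_v210(sample):
    return max(4, min(1019, sample))

samples = [0, 3, 4, 512, 1019, 1020, 1023]
clipped = [clip_v210(s) for s in samples]
print(clipped)  # [4, 4, 4, 512, 1019, 1019, 1019]

# Everything already inside 4..1019 survives unchanged, so a second pass
# is a no-op: any loss happens once, at the first encode.
assert [clip_v210(s) for s in clipped] == clipped
```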

So ultimately, at least with the DeckLink integration, we could consider adding RGB 4:4:4, which, as @harrypm pointed out, could be helpful for some formats like HDCAM, but I'm not sure there's a reason, or an easy possibility, to add YUV 4:4:4.

marshalleq commented 4 months ago

Thanks for your detailed reply; I am still digesting it a bit, to be honest. I think the part you are explaining is actually done by the capture device in its hardware, at which point it must come out as RGB, not YUV, so I think that part is out of our control. My belief is therefore that we're talking about RGB-to-RGB colour loss, which starts between the capture device and vrecord. I don't think vrecord listens to YUV and converts to RGB, does it? Though I suspect that @harrypm has this going on somewhere in his awesome (and can't wait to try it) direct-from-the-heads system; I'm not sure. Seems like I need to read up a bit on analog capture architecture.

So to that end, my focus is on RGB-to-RGB loss, though I totally get why my first assumption here is central to the feature request: does vrecord convert from YUV or not? If it does, the feature may not be valuable; if it doesn't, then I'd argue 4:4:4 is still valuable.

Both the theory I've read and the results of my testing suggest that RGB-to-RGB that isn't 4:4:4 throughout incurs generational loss. But, like all good knowledge, more study and testing I shall need to do. Just one note: you seem to be implying my results were bad due to ProRes, but actually it was ProRes 4:4:4 that showed the least degradation here, with FFV1 4:2:2 showing the most. I assume this would be repeatable for anyone wanting to test it themselves.

I will need to do it perhaps 10x to exaggerate the result and use a consistent codec to rule out differences.

In the case of the Blackmagic analog-to-SDI converter I'm using, it clearly says the output is YUV. In the case of the BM SDI-to-Thunderbolt converter it's less clear: it supports both YUV and RGB, and I assume this is for conversion purposes, though it only says it has hardware-based realtime conversion. So, assuming it's outputting RGB to vrecord, and assuming RGB-to-RGB at 4:2:2 incurs loss, it would be valuable to have 4:4:4. I have a lot of assumptions to validate. :)
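As a side note on RGB-to-RGB loss through a YUV intermediate: even at 4:4:4, with no subsampling at all, an 8-bit colour-space conversion can nudge values through rounding. A toy full-range BT.601 round trip (my own sketch using the standard luma weights, not what any Blackmagic unit actually does internally):

```python
# Toy full-range BT.601 RGB <-> YCbCr conversion at 8 bits, to show that
# colour-space conversion alone (even 4:4:4) is not always bit-exact.

def rgb_to_ycbcr(r, g, b):
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return tuple(round(v) for v in (y, cb, cr))

def ycbcr_to_rgb(y, cb, cr):
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return tuple(min(255, max(0, round(v))) for v in (r, g, b))

# Count sampled pixels that do not survive an 8-bit round trip exactly.
errors = 0
for r in range(0, 256, 17):
    for g in range(0, 256, 17):
        for b in range(0, 256, 17):
            if ycbcr_to_rgb(*rgb_to_ycbcr(r, g, b)) != (r, g, b):
                errors += 1
print(errors, "of", 16 ** 3, "sampled pixels changed")
```

The off-by-one errors this surfaces are tiny compared to subsampling loss, but they are one more reason repeated format conversions in a pipeline are never entirely free.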

Great discussion thanks.

harrypm commented 4 months ago

@marshalleq Okay, before I sleep I'll give you a brain dump.

Blackmagic is not magic, and Decode is not magic either; here's why, lol.

The decode projects' chroma-decoder outputs YUV444P16 or RGB48 (we use 4:2:2 10-bit in the standard export FFmpeg encode profiles, as that's now a thoughtless, hands-off process really).

So really basic breakdown here:

Software Defined:

Head Amplifier --> FM RF Test Points --> ADC --> RF on file --> TBC/Demod --> 4fsc S-Video/CVBS on file --> Software Chroma Decoder --> YUV or RGB data --> FFmpeg Encoded Video.

Hardware Decoding:

S-Video/CVBS Live --> Integrated ADC --> Hardware Chroma Decoder --> FPGA --> YUV or RGB data stream via SDI/HDMI --> FFmpeg or encoder to make video files.

On the Blackmagic analog-to-SDI unit, just to clarify, as it's what I use in my reference capture chain: SD-SDI is always a 4:2:2 YUV feed off the SDI connection (in this context). I don't think the 4:4:4 modes of the ADV7842 chip are even possible on this unit? It's not Magewell; we all know Blackmagic give zero care about opening all hardware modes up to the end user, let alone enabling the TBC functions properly, and I would love to own the ADV7842 dev board and/or more Magewell kit.

So the ADV7842 chip is used by AJA/Magewell/BE75/Blackmagic units; the reality is that it's one of the top chips market-wise, so if you learn your hardware this little IC pops up a LOT, and its specs on page 1 tell you why: it does everything and oversamples the input signal to the nth degree.

SDI to TB2 or TB3 is just a digital interface; it's still an uncompressed 4:2:2 YUV feed, fixed by the SDI standards. The IC itself, however, is much more flexible.

image

When you start to dig into this subject (and I mean waste a lot of hours of lifespan), you see why the decode method exists: nobody could be bothered to implement every feature these all-in-one broadcast-level chips can do into an actual product and software workflow. Stuff like Odysee cards at 2k a pop just barely pass that hardware mark with VBI data pin output. Vrecord is wonderful; if it had the full 4fsc signal frame available to it, it would be king of the ingest-workflow hill.