What is the right resolution time period for audio testing?

mbergman42 commented 2 years ago

From today's DPCTF Test Runner meeting: What is the right resolution time period for audio testing? That is, given 60 seconds of audio, what is the length of the sub-segments we need to examine? We previously concluded "on the order of 20 ms"; that is, step through the recorded audio at 20 ms intervals, making sure each interval is in the correct position (presentationTime).
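To make the "step through at 20 ms intervals" idea concrete, here is a minimal Python sketch. It assumes the test rig has already measured where each interval actually appeared in the recording; how that measurement is made is not covered here. The function names and the `tolerance_ms` parameter are hypothetical and do not come from the DPCTF spec.

```python
def expected_window_starts(duration_s, window_ms):
    """Return the expected start time (in ms) of every analysis window."""
    total_ms = round(duration_s * 1000)
    return list(range(0, total_ms, window_ms))


def windows_out_of_position(observed_starts_ms, window_ms, tolerance_ms):
    """Return the indices of windows that are not where they should be.

    observed_starts_ms[i] is the measured position of window i in the
    recording (assumed to be supplied by the rig). Window i is expected
    at i * window_ms; tolerance_ms is a hypothetical allowed deviation.
    """
    return [
        i
        for i, observed in enumerate(observed_starts_ms)
        if abs(observed - i * window_ms) > tolerance_ms
    ]


# 60 seconds of audio at 20 ms resolution means 3000 windows to check.
starts = expected_window_starts(60, 20)
print(len(starts))

# Made-up measurements: the third window shows up 5 ms late.
print(windows_out_of_position([0, 20, 45, 60], window_ms=20, tolerance_ms=2))
```

The same check works for any window length, so only `window_ms` would change if a coarser resolution were adopted.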

Currently the DPCTF spec says "audio sample", and a reasonable reading of that is an ISO BMFF sample (section 3.1.9, "all the data associated with a single timestamp"). Audio samples are typically 20 ms, so as currently written the spec requires 20 ms resolution.

So the question is not "How long is a sample?" The question now is "What is the right resolution time, and why?" If 20 ms is required, then the DPCTF spec is fine as written and we'll most likely have to stay with wired audio (some study is still underway). If instead we can verify the important stuff with a coarser resolution, like 2 s, then we should change the DPCTF spec language and build our tests accordingly.

For example, if we're interested in seeing that CMAF segments play out in order, then 20 ms is overkill. If we need to verify ISO BMFF audio samples, as the spec is currently written, then it's not overkill.
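For a rough sense of scale between the two candidate resolutions, the arithmetic can be sketched as below. The 48 kHz sample rate is an assumption for illustration only, and `window_stats` is a hypothetical helper, not part of any test runner.

```python
def window_stats(duration_s, window_ms, sample_rate_hz=48_000):
    """Return (number of windows, PCM samples per window).

    sample_rate_hz defaults to 48 kHz purely as an illustrative
    assumption; substitute the rate of the actual test content.
    """
    windows = (duration_s * 1000) // window_ms
    pcm_samples_per_window = sample_rate_hz * window_ms // 1000
    return windows, pcm_samples_per_window


for window_ms in (20, 2000):
    windows, pcm = window_stats(60, window_ms)
    print(f"{window_ms} ms: {windows} windows, {pcm} PCM samples each")
```

Under these assumptions a 60 second recording has 3000 windows to verify at 20 ms but only 30 at 2 s, and each 2 s window carries 100 times as much signal to match against.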

The motivation for asking is that 20 ms resolution on automated audio testing is “hard” (maybe not feasible) for playout over a speaker with microphone pickup in the presence of ambient noise or music. If we can justify a resolution closer to 2 seconds, we are closer to a realizable test rig. If we stay at 20 ms resolution, that appears feasible only with wired audio.

cta-source commented 2 years ago

After some discussion in DPCTF and offline, I'm of the opinion we need to stay on 20 ms unless it turns out we cannot support a required test environment at that resolution. If speakers/mic/music/room noise really is not feasible at 20 ms, we can reassess. Recommend closing this issue and continuing with the current path of developing 20 ms wired-only tests first, while continuing to study opportunities for speaker/mic/music/noise.

jpiesing commented 2 years ago

I agree.

cta-wave / mezzanine

What is the right resolution time period for audio testing? #44