cta-wave / mezzanine

This repo contains scripts that will build annotated test content from specific source content, compatible with the WAVE device playback test suite.

Automated vs Human monitored audio testing #37

Open. pmaness opened this issue 3 years ago

pmaness commented 3 years ago

I support what Dolby is proposing for audio imprinted with coded spectral bands for automated testing. However, it might be useful to have a separate, and likely much more limited, set of files suitable for human monitoring. This would be useful for an initial verification that the test rig and DUT are operating as expected, or other cases where some debugging is necessary.

jpiesing commented 3 years ago

This would be useful for an initial verification that the test rig and DUT are operating as expected, or other cases where some debugging is necessary.

I very much agree with both of these points.

jpiesing commented 3 years ago

I'm not sure what this is asking for. @pmaness

cta-source commented 3 years ago

This topic addresses the point I raised in a recent DPCTF Test Runner meeting: the design of audio tests, now that we have some results from the audio watermark study. The broader question is what the full set of audio tests should be, so this is helpful.

@pmaness, regarding something

useful for an initial verification that the test rig and DUT are operating as expected, or other cases where some debugging is necessary

One debugging tool might be a descending PN sequence -- our standard PN sequence, at full scale from 0-1 s, down 1 dB from 1-2 s, down 2 dB from 2-3 s, and so on. With a wired connection, we should be able to detect every PN segment, starting with the 0-1 s segment. Over a speaker/microphone path, we'd be able to determine the effective SNR. If the equipment or environment has problems -- say, the user selected the wrong microphone on the OF computer and has terrible recorded audio, or there is a lot of background noise in the test room -- this would show up in the SNR detected from this waveform. (The resulting SNR would be an estimate only, but should be a good indicator.)
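For illustration, a minimal sketch of such a calibration waveform in Python/NumPy. The sample rate, step count, and the random ±1 sequence standing in for our actual PN definition are all assumptions here, not the real mezzanine parameters:

```python
# Sketch: descending PN calibration waveform (hypothetical parameters).
# A random +/-1 sequence stands in for the repo's actual PN definition.
import numpy as np

SAMPLE_RATE = 48_000   # assumed sample rate
NUM_STEPS = 30         # 0 dB, -1 dB, ..., -29 dB

rng = np.random.default_rng(seed=1)  # fixed seed so the sequence is reproducible

segments = []
for step in range(NUM_STEPS):
    pn = rng.choice([-1.0, 1.0], size=SAMPLE_RATE)  # 1 s of +/-1 "PN"
    gain = 10.0 ** (-step / 20.0)                   # attenuate 1 dB per step
    segments.append(gain * pn)

waveform = np.concatenate(segments)
# The deepest step still detected in the recording gives a rough SNR estimate.
```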

Ascending and descending tones may also be helpful for human listeners, but as you probably know, chord progressions are better for human testing since pure tones create spatial nulls in the listening environment (which humans perceive as possible audio dropouts).

@jpiesing, regarding,

something like we do with video, something distinctive at the start & end of the mezzanine content so a human could at least notice if the playback was being cropped

We could do a start-of-audio click at the front of each audio track. (I'd like to move from beeps to clicks anyway, at least for human sync purposes; a click is more precise in timing.) The initial click would be timed to match the "start of video" signal, of course.
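A minimal sketch of what the click could look like, assuming NumPy, a 48 kHz sample rate, and a 1 ms raised-cosine burst; the actual click shape and level would need to be agreed:

```python
# Sketch: prepend a start-of-audio click to a track (hypothetical click shape).
# Any short, broadband transient works; a 1 ms windowed burst is assumed here.
import numpy as np

SAMPLE_RATE = 48_000  # assumed sample rate

def make_click(duration_s=0.001, freq_hz=3_000.0):
    t = np.arange(int(SAMPLE_RATE * duration_s)) / SAMPLE_RATE
    window = 0.5 - 0.5 * np.cos(2 * np.pi * t / duration_s)  # raised-cosine window
    return np.sin(2 * np.pi * freq_hz * t) * window

def add_start_click(track, offset_s=0.0):
    """Mix a click into a 1-D float track at offset_s (aligned to start-of-video)."""
    click = make_click()
    out = track.copy()
    start = int(offset_s * SAMPLE_RATE)
    out[start:start + len(click)] += click
    return np.clip(out, -1.0, 1.0)  # guard against clipping after the mix
```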

These are just reactions or ideas. As we develop our full set of audio tests, we can keep "humans" in mind. But as a starting point:

Audio Tests for System Verification

- System SNR Estimation -- Automated test; use it to verify that the equipment meets basic minimum quality-of-transmission requirements.
- Chord Progressions -- Human listening test; use it when you want to hear basic system operation. Built from synthesized chords with a click before and after each one, e.g.: click, C major (1 second), click, D major (1 s), click, E major (1 s), etc. Actual chords are TBD.
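As a rough sketch of the Chord Progressions idea, with placeholder triads and timings (the actual chords are TBD, per the list above):

```python
# Sketch: chord-progression listening track with a click before and after
# each chord. Triads and durations are placeholders, not the agreed test plan.
import numpy as np

SAMPLE_RATE = 48_000  # assumed sample rate

def tone(freq_hz, duration_s):
    t = np.arange(int(SAMPLE_RATE * duration_s)) / SAMPLE_RATE
    return np.sin(2 * np.pi * freq_hz * t)

def chord(freqs_hz, duration_s=1.0):
    mix = sum(tone(f, duration_s) for f in freqs_hz)
    return mix / len(freqs_hz)  # normalize so the sum does not clip

def click(duration_s=0.001, freq_hz=3_000.0):
    t = np.arange(int(SAMPLE_RATE * duration_s)) / SAMPLE_RATE
    env = 0.5 - 0.5 * np.cos(2 * np.pi * t / duration_s)
    return np.sin(2 * np.pi * freq_hz * t) * env

# Placeholder root-position triads: C major, D major, E major (4th octave).
PROGRESSION = [
    (261.63, 329.63, 392.00),  # C-E-G
    (293.66, 369.99, 440.00),  # D-F#-A
    (329.63, 415.30, 493.88),  # E-G#-B
]

parts = []
for triad in PROGRESSION:
    parts.append(click())
    parts.append(chord(triad))
parts.append(click())  # closing click after the final chord
track = np.concatenate(parts)
```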

pmaness commented 3 years ago

One that we use in DTS labs is a channel-name call-out (a human voice) synced to an animated speaker layout that highlights the called channel. You know from the first call-out (left channel) that video is aligned (more or less) and audio is being routed correctly. If a system is configured for multi-channel, this allows an initial check that the test environment is correct. Click/flash tracks are much better for measuring A/V sync and A/V drift in an automated environment. Pure tones can be used to verify correct downmix in an automated environment.
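A minimal sketch of that pure-tone downmix check, with hypothetical per-channel frequencies and a single-bin DFT detector; the real tone assignments, downmix coefficients, and tolerances would come from the test spec:

```python
# Sketch: per-channel pure tones for an automated downmix check.
# Each source channel carries a unique tone; the downmixed output should
# contain every tone at the level implied by the downmix coefficients.
import numpy as np

SAMPLE_RATE = 48_000  # assumed sample rate

# Hypothetical tone assignments, one frequency per source channel.
CHANNEL_TONES_HZ = {"L": 500.0, "R": 700.0, "C": 900.0, "Ls": 1100.0, "Rs": 1300.0}

def detect_tone_level(signal, freq_hz):
    """Estimate one tone's amplitude via a single-bin DFT projection."""
    t = np.arange(len(signal)) / SAMPLE_RATE
    ref = np.exp(-2j * np.pi * freq_hz * t)
    return 2.0 * abs(np.dot(signal, ref)) / len(signal)

def check_downmix(downmix_channel, expected_db, tolerance_db=1.0):
    """Compare each tone's detected level (dB) against its expected downmix gain."""
    results = {}
    for name, freq in CHANNEL_TONES_HZ.items():
        level = detect_tone_level(downmix_channel, freq)
        level_db = 20.0 * np.log10(max(level, 1e-12))
        results[name] = abs(level_db - expected_db.get(name, 0.0)) <= tolerance_db
    return results  # per-channel pass/fail
```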

gitwjr commented 1 year ago

Part of RFC. Deferred for future work.