drowe67 / freedv-gui

GUI Application for FreeDV – open source digital voice for HF radio
https://freedv.org/
GNU Lesser General Public License v2.1

freedv_2_0_0 sample dropouts #750

Open drowe67 opened 1 week ago

drowe67 commented 1 week ago

Some evidence of audio dropouts, which will critically impair all modern waveforms (700D/E/RADE). See also https://github.com/drowe67/radae/pull/31

Success criteria:

  1. Objective evidence of no dropouts on builds for any OS that is to be released.
  2. Automated test (not manually viewing Audacity)
  3. For example, when using RADE, 0 resyncs in 120 seconds; or continuous transmission of a sine wave without phase jumps.
  4. On a heavily loaded machine (e.g. RADE decode, ~50%)
  5. Tx and Rx audio must be tested (could be loopback on one machine).
  6. Benign channel (not OTA), e.g. OTC RF or over the audio cable.
  7. Results reviewed before sign off.
tmiw commented 3 days ago

@drowe67, I have https://github.com/drowe67/freedv-gui/pull/761 created to do some of that automated testing. Basically:

$ sudo modprobe snd-aloop enable=1,1,1,1 index=1,2,3,4
$ cd build_linux
$ ../test_zeros.sh tx [700D|700E|1600|RADE]
or for RX:
$ ../test_zeros.sh rx [700D|700E|1600|RADE] recording_from_kiwisdr.wav

If there are dropouts, you'll see stuff like this for each dropout that was detected:

Zero audio detected at index 432499 (54.062375 seconds) - lasted 0.017125 seconds

What test_zeros.sh does is the following:

  1. Start recording the output device (and playing the input device, if needed).
  2. Start freedv-gui in "unit test" mode:
     a. Switches mode to the one provided at the command line.
     b. If transmitting, "pushes" the PTT button.
     c. After 60 seconds, closes the application. (The exact time period we should use is TBD.)
  3. Stops playback/recording and uses sox to trim silence from the beginning/end of the recording.
  4. Runs check-for-zeros.py to check for dropouts (a rough sketch of that kind of check is below).
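For reference, the core of that kind of check can be quite small. Here's a minimal sketch of a zero-run detector in the spirit of check-for-zeros.py (not the actual script from PR #761; the mono 16-bit 8 kHz WAV input and the minimum run length are assumptions chosen to match the example log line above):

# Minimal sketch of a zero-run dropout detector. Assumptions: mono 16-bit
# WAV at 8 kHz, and any run of >= 64 consecutive zero samples counts as a
# dropout. The real check-for-zeros.py in PR #761 may differ.
import sys
import wave

import numpy as np

MIN_RUN = 64  # assumed minimum run length to report as a dropout

with wave.open(sys.argv[1], "rb") as w:
    rate = w.getframerate()
    samples = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

run_start = None
for i, sample in enumerate(samples):
    if sample == 0 and run_start is None:
        run_start = i
    elif sample != 0 and run_start is not None:
        run_len = i - run_start
        if run_len >= MIN_RUN:
            print(f"Zero audio detected at index {run_start} "
                  f"({run_start / rate:.6f} seconds) - lasted {run_len / rate:.6f} seconds")
        run_start = None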

I think this can also be done on macOS and Windows as long as we can figure out a way to automatically record or play back files.

Anyway, it would be nice if you (and maybe @Tyrbiter / @barjac too) could try that PR/script and see what the results are on your respective machines. Things seem to check out mostly okay for me, anyway (maybe a dropout every few runs).

drowe67 commented 2 days ago

Hi @tmiw. Fine business on working towards automated tests. Some comments on the test design:

  1. I'm pleased that freedv-gui has a unit test mode to support automated tests run from the command line.
  2. I think the test should be run with hardware (a real sound card) in the loop; using virtual sound cards is too many steps away from real life.
  3. Rather than looking for gaps in audio using Python tools, look for re-syncs in RADE. That's a very direct measure of the key problem we are trying to head off. Re-syncs in 700X would also arguably be OK as a metric - but they don't hammer the CPU as much, so RADE is the best choice. This implies we need a way of checking the resync counter, i.e. have it dumped by the app somewhere.
  4. Suggestion for test design - one machine running full duplex with an audio loopback cable. Running Tx and Rx at the same time means that, to pass, there must be no gaps in either Tx or Rx audio.
  5. As an engineer who writes test plans for a living - I get pretty nervous about statements like "Things seem to check out mostly okay for me". We need a solid pass/fail condition - like "0 resyncs in a 120 second test window". Or, if it's better to use a statistical approach, "zero resyncs in 8/10 tests attempted". The exact condition doesn't matter too much as long as it's well thought out (those were just examples), but it needs to be solid, numerical, and stated up front.
  6. It may not be possible to make this test fully automated and reproducible on every push (a bit like the SM1000 unit tests). But we could at least say "at Git ##, this test was performed by David, Mooneer, and Brian and it passed".
tmiw commented 1 day ago
  2. I think the test should be run with hardware (a real sound card) in the loop; using virtual sound cards is too many steps away from real life.

Wouldn't this introduce additional variables? The idea is to rule out any issues in freedv-gui and/or RADE, right? Not issues with e.g. the user's hardware, the kernel/modules they're using, etc.

  3. Rather than looking for gaps in audio using Python tools, look for re-syncs in RADE. That's a very direct measure of the key problem we are trying to head off. Re-syncs in 700X would also arguably be OK as a metric - but they don't hammer the CPU as much, so RADE is the best choice. This implies we need a way of checking the resync counter, i.e. have it dumped by the app somewhere.

Maybe this can be logged to stdout/stderr by freedv-gui and parsed by the Python tool instead? Or maybe another way to provide that information to the test tool is better? (Suggestions welcome.)
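For example, if the app printed a final resync count on exit, the wrapper could be as simple as this (a sketch only: the command-line flags and the "resyncs:" log line are hypothetical, not something freedv-gui emits today):

# Sketch only: run freedv-gui in a headless/unit-test fashion and parse a
# hypothetical "resyncs: N" line from its output.
import re
import subprocess
import sys

# Hypothetical invocation; these flags don't exist yet and are only
# placeholders for whatever "unit test" mode ends up looking like.
proc = subprocess.run(
    ["./src/freedv", "--unit-test-rx", "--mode", "RADE"],
    capture_output=True,
    text=True,
    timeout=180,
)

# Assumed log format: a single "resyncs: N" line printed on exit.
match = re.search(r"resyncs:\s*(\d+)", proc.stdout + proc.stderr)
if match is None:
    sys.exit("no resync counter found in freedv-gui output")

resyncs = int(match.group(1))
print(f"resyncs detected: {resyncs}")
sys.exit(0 if resyncs == 0 else 1)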

  4. Suggestion for test design - one machine running full duplex with an audio loopback cable. Running Tx and Rx at the same time means that, to pass, there must be no gaps in either Tx or Rx audio.

Would this make it more difficult to find the source of the problem? That is, we'd only have visibility on the RX side (increased resyncs) even if the gaps were introduced during TX.

  5. As an engineer who writes test plans for a living - I get pretty nervous about statements like "Things seem to check out mostly okay for me". We need a solid pass/fail condition - like "0 resyncs in a 120 second test window". Or, if it's better to use a statistical approach, "zero resyncs in 8/10 tests attempted". The exact condition doesn't matter too much as long as it's well thought out (those were just examples), but it needs to be solid, numerical, and stated up front.

With more testing, I've found that I have zero resyncs in 60 seconds most of the time, but occasionally there are runs where there are dropouts. I'm not sure why yet, but I think with more runs by others we can come up with a spec for this.
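Whatever spec we land on could then be baked into the test tool so it prints an explicit PASS/FAIL, e.g. for the statistical style of criterion you suggested (the numbers here are just your examples, not an agreed spec):

# Sketch of applying a statistical pass/fail criterion to repeated runs.
# "8 clean runs out of 10" is taken from the example above, not an agreed spec.
def verdict(resyncs_per_run, required_clean=8):
    """PASS if at least required_clean of the runs had zero resyncs."""
    clean = sum(1 for r in resyncs_per_run if r == 0)
    return "PASS" if clean >= required_clean else "FAIL"

# e.g. resync counts from ten 120-second RADE loopback runs
print(verdict([0, 0, 0, 1, 0, 0, 0, 0, 0, 0]))  # 9/10 clean -> PASS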

  6. It may not be possible to make this test fully automated and reproducible on every push (a bit like the SM1000 unit tests). But we could at least say "at Git ##, this test was performed by David, Mooneer, and Brian and it passed".

How can we ensure that this test gets rerun occasionally? I can see us doing the testing once and then forgetting about it for quite a while if there's no way to enforce this.

Tyrbiter commented 15 hours ago

Some of the packages I need to rebuild have testing scripts built into the build process; if the build doesn't complete, the package files are not created and so cannot be installed.

Could we look at something of this nature?

tmiw commented 12 hours ago

Some of the packages I need to rebuild have testing scripts built into the build process; if the build doesn't complete, the package files are not created and so cannot be installed.

Could we look at something of this nature?

Without the use of loopback audio it'd be difficult to automate that as part of the packaging process, no?