sm0svx / svxlink

Advanced repeater system software with EchoLink support for Linux including a GUI, Qtel - the Qt EchoLink client
http://svxlink.org/

Big issue with Voter when switching during ID transmission #97

Closed pe1chl closed 9 years ago

pe1chl commented 10 years ago

When operating a repeater with multiple receivers (with Voter), a small issue occurs.

When someone is transmitting and the repeater starts sending the ID, normally the ID output level is lowered (FX_GAIN_LOW) and the received audio is transmitted over it.

When the Voter decides to switch receiver in the middle of such an ID transmission, the ID suddenly is transmitted at normal level, and the received audio is muted until the ID transmission ends. Then the received audio continues as normal.

pe1chl commented 10 years ago

Additional information: when a station is coming in via receiver 1 before the ID starts, and the voter switches to receiver 2 after the ID has started, the received audio is muted at that moment and the ID is transmitted at full level. If the voter switches back to receiver 1 before the end of the ID, the correct situation is restored: received audio at normal level, ID at reduced level. So the error only occurs while using a receiver that was not selected when the ID started. (This has now become apparent because we have drastically reduced the REVOTE_INTERVAL.)

sm0svx commented 10 years ago

Ok. What revote interval are you using now? How well is that working (if we ignore the problem described above for a while)?

pe1chl commented 10 years ago

Tobias Blomberg wrote:

Ok. What revote interval are you using now? How well is that working (if we ignore the problem described above for a while)?

We changed the Voter parameters from:

VOTING_DELAY=200
REVOTE_INTERVAL=1000
HYSTERESIS=50
RX_SWITCH_DELAY=500
SQL_CLOSE_REVOTE_DELAY=500

to:

VOTING_DELAY=0
REVOTE_INTERVAL=100
HYSTERESIS=50
RX_SWITCH_DELAY=250
SQL_CLOSE_REVOTE_DELAY=200

This was mainly to avoid two issues:

  1. the voting delay was sometimes still not long enough, so the wrong receiver was selected on squelch opening.
  2. once the wrong receiver was selected, it persisted too long.

Now it of course still often selects the local receiver but at least it switches quickly to the remote rx if better.

We still have a problem we are trying to debug (hopefully the output of voter info is implemented soon): We see sequences like this in a remote rx:

Fri Sep 19 19:17:52 2014: RxUTR: The squelch is OPEN (2.5037)
Fri Sep 19 19:17:53 2014: RxUTR: The squelch is CLOSED (4.39461)
Fri Sep 19 19:17:53 2014: RxUTR: The squelch is OPEN (2.87498)
Fri Sep 19 19:17:53 2014: RxUTR: The squelch is CLOSED (5.56992)
Fri Sep 19 19:17:54 2014: RxUTR: The squelch is OPEN (4.955)
Fri Sep 19 19:17:54 2014: RxUTR: The squelch is CLOSED (3.88426)
Fri Sep 19 19:17:54 2014: RxUTR: The squelch is OPEN (5.4073)
Fri Sep 19 19:17:55 2014: RxUTR: The squelch is CLOSED (7.62261)
Fri Sep 19 19:17:55 2014: RxUTR: The squelch is OPEN (3.8498)
Fri Sep 19 19:17:55 2014: RxUTR: The squelch is CLOSED (3.91091)
Fri Sep 19 19:17:56 2014: RxUTR: The squelch is OPEN (6.41599)
Fri Sep 19 19:17:56 2014: RxUTR: The squelch is CLOSED (3.04333)
Fri Sep 19 19:17:57 2014: RxUTR: The squelch is OPEN (2.07114)
Fri Sep 19 19:17:57 2014: RxUTR: The squelch is CLOSED (5.13202)
Fri Sep 19 19:17:57 2014: RxUTR: The squelch is OPEN (2.06606)
Fri Sep 19 19:17:58 2014: RxUTR: The squelch is CLOSED (6.41175)
Fri Sep 19 19:17:58 2014: RxUTR: The squelch is OPEN (116.235)
Fri Sep 19 19:17:58 2014: RxUTR: The squelch is CLOSED (111.742)
Fri Sep 19 19:17:58 2014: RxUTR: The squelch is OPEN (118.447)
Fri Sep 19 19:18:00 2014: RxUTR: The squelch is CLOSED (6.74714)
Fri Sep 19 19:18:00 2014: RxUTR: The squelch is OPEN (5.36229)
Fri Sep 19 19:18:00 2014: RxUTR: The squelch is CLOSED (2.93292)
Fri Sep 19 19:18:00 2014: RxUTR: The squelch is OPEN (19.8889)
Fri Sep 19 19:18:00 2014: RxUTR: The squelch is CLOSED (15.3964)
Fri Sep 19 19:18:01 2014: RxUTR: The squelch is OPEN (112.746)
Fri Sep 19 19:18:02 2014: RxUTR: The squelch is CLOSED (6.89519)
Fri Sep 19 19:18:02 2014: RxUTR: The squelch is OPEN (6.72576)
Fri Sep 19 19:18:03 2014: RxUTR: The squelch is CLOSED (2.23326)
Fri Sep 19 19:18:03 2014: RxUTR: The squelch is OPEN (5.12898)
Fri Sep 19 19:18:05 2014: RxUTR: The squelch is CLOSED (5.69182)
Fri Sep 19 19:18:05 2014: RxUTR: The squelch is OPEN (115.668)
Fri Sep 19 19:18:06 2014: RxUTR: The squelch is CLOSED (6.90079)
Fri Sep 19 19:18:06 2014: RxUTR: The squelch is OPEN (10.5267)
Fri Sep 19 19:18:11 2014: RxUTR: The squelch is CLOSED (7.03117)
Fri Sep 19 19:18:11 2014: RxUTR: The squelch is OPEN (123.644)
Fri Sep 19 19:18:12 2014: RxUTR: The squelch is CLOSED (119.151)
Fri Sep 19 19:18:12 2014: RxUTR: The squelch is OPEN (6.05409)
Fri Sep 19 19:18:14 2014: RxUTR: The squelch is CLOSED (5.00223)
Fri Sep 19 19:18:14 2014: RxUTR: The squelch is OPEN (111.084)
Fri Sep 19 19:18:15 2014: RxUTR: The squelch is CLOSED (7.99593)
Fri Sep 19 19:18:16 2014: RxUTR: The squelch is OPEN (2.40199)
Fri Sep 19 19:18:16 2014: RxUTR: The squelch is CLOSED (5.46886)

i.e. even with the rx well calibrated it still sometimes has high spikes. I have set SIGLEV_BOGUS_THRESH=125 on this remote rx to chop off the highest peaks. In the above case the station was mobile and working over the main RX, but on the remote rx there are these peaks: it gets selected, a lot more noise is heard, and the voter immediately switches back. When this happens during the ID, part of the QSO is lost. This receiver has no squelch (immediate detector output), and I have no idea yet what causes this. But it surely exercises the Voter and shows this issue.
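For anyone who wants to pick such spikes out of a longer log, here is a minimal Python sketch. It assumes the log entries look exactly like the ones quoted above; the names `ENTRY` and `find_spikes` and the threshold of 100 are made up for illustration and are not part of SvxLink.

```python
import re

# Matches one squelch log entry in the format quoted in this thread, e.g.
# "Fri Sep 19 19:17:58 2014: RxUTR: The squelch is OPEN (116.235)".
# \s+ before the year tolerates entries that were wrapped across lines.
ENTRY = re.compile(
    r"(\w{3} \w{3} +\d+ \d\d:\d\d:\d\d)\s+(\d{4}): (\w+): "
    r"The squelch is (OPEN|CLOSED) \((-?\d+(?:\.\d+)?)\)"
)


def find_spikes(log_text, threshold=100.0):
    """Return (timestamp, rx, state, siglev) for entries above threshold.

    The threshold is an arbitrary illustration value, not an SvxLink default.
    """
    spikes = []
    for stamp, year, rx, state, siglev in ENTRY.findall(log_text):
        if float(siglev) > threshold:
            spikes.append((f"{stamp} {year}", rx, state, float(siglev)))
    return spikes


sample = (
    "Fri Sep 19 19:17:58 2014: RxUTR: The squelch is CLOSED (6.41175) "
    "Fri Sep 19 19:17:58 2014: RxUTR: The squelch is OPEN (116.235) "
    "Fri Sep 19 19:17:58 2014: RxUTR: The squelch is CLOSED (111.742)"
)
for spike in find_spikes(sample):
    print(spike)
```

Running it over a whole log file makes it easier to see whether the high values cluster around squelch transitions.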

Rob

sm0svx commented 10 years ago

This rather sounds like a new issue, but anyway: could there be something strange in the audio? Short ranges of silence? A strong interferer? Try to record some audio including the problem and have a look at it (e.g. using audacity) to see if something strange can be spotted. One way to record audio as it looks when entering SvxLink is to use the RAW_AUDIO_UDP_DEST configuration variable:

RAW_AUDIO_UDP_DEST
Setting this configuration variable makes it possible to stream the raw audio
from the sound device to an UDP socket. The sample format is the one used
internally in SvxLink, that is each sample is represented by a 32 bit float.
The sample rate is the same as the one chosen for the audio device.
The destination is specified as ip-address:port.

Example: RAW_AUDIO_UDP_DEST=127.0.0.1:10000

One way to write it to a file is to use the "socat" utility.
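As an alternative to socat, a small Python receiver could write the stream straight to a WAV file that audacity opens directly. This is only a sketch under assumptions not stated in the manual text above: that the floats arrive in the sender's native byte order, that the samples are mono and lie in the range -1.0 to 1.0, and that the sample rate is passed in by hand. The function names are hypothetical.

```python
import socket
import struct
import time
import wave


def floats_to_pcm16(payload):
    """Convert packed 32-bit floats to little-endian 16-bit PCM.

    Assumes native byte order and a nominal sample range of -1.0..1.0;
    values outside that range are clipped.
    """
    count = len(payload) // 4
    samples = struct.unpack(f"={count}f", payload[:count * 4])
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    return struct.pack(f"<{count}h", *(int(s * 32767) for s in clipped))


def record(path, port=10000, rate=16000, seconds=30.0):
    """Write datagrams arriving on 127.0.0.1:port to a mono WAV file.

    The rate must match the audio device rate used by the sender.
    """
    deadline = time.monotonic() + seconds
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock, \
            wave.open(path, "wb") as wav:
        sock.bind(("127.0.0.1", port))
        sock.settimeout(1.0)
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        while time.monotonic() < deadline:
            try:
                payload, _ = sock.recvfrom(65535)
            except socket.timeout:
                continue  # nothing received yet, keep waiting
            wav.writeframes(floats_to_pcm16(payload))

# Example (blocks for 30 seconds): record("rx_capture.wav", port=10000, rate=16000)
```

With RAW_AUDIO_UDP_DEST=127.0.0.1:10000 set as in the example above, calling `record` while the problem occurs should capture the audio as it enters SvxLink.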

pe1chl commented 10 years ago

Tobias Blomberg wrote:

This rather sounds like a new issue but anyway; could there be something strange in the audio? Short ranges of silence? A strong interferer?

Probably yes. What is interesting is that it only occurs when a mobile station is transmitting that is weak on this receiver.

Try to record some audio including the problem and have a look at it (e.g. using audacity) to see if something strange can be spotted. One way to record audio as it looks when entering SvxLink is to use the RAW_AUDIO_UDP_DEST

RAW_AUDIO_UDP_DEST Setting this configuration variable makes it possible to stream the raw audio from the sound device to an UDP socket. The sample format is the one used internally in SvxLink, that is each sample is represented by a 32 bit float. The sample rate is the same as the one chosen for the audio device. The destination is specified as ip-address:port.

Example: RAW_AUDIO_UDP_DEST=127.0.0.1:10000

One way to write it to a file is to use the "socat" utility.

Yes, this is something I still want to try. In fact what I did on the main system before: make a /etc/asound.conf with

pcm.dsnoopAudigy {
    type dsnoop
    ipc_key 884234
    slave {
        pcm "hw:Audigy,0"
        channels 2
        period_size 1024
        buffer_size 2048
        rate 48000
    }
}

pcm.plug_dsnoopAudigy {
    type plug
    slave {
        pcm "dsnoopAudigy"
    }
}

Then use that as the input, and you can open it in other programs as well at the same time. For example I ran "fldigi" and watched the audio waterfall. However I got distracted when there appeared to be an issue in the "portaudio" library that fldigi uses to access the soundcards, resulting in trouble when it is started (it opens all sound devices it can find and tries to set them to different modes, leaving them in the last tried mode...) This is something I want to fix, because it is otherwise very useful.

Rob

sm0svx commented 10 years ago

Yes, you can use Alsa as well, but taking the audio from SvxLink guarantees that you record exactly the same audio that SvxLink sees. If an RTL dongle is used as a receiver, it's the only way to get your hands on the raw audio.

The Alsa trick is of course good to know as well. That probably is the way to go if you want to split the audio in production. The RAW_AUDIO_UDP_DEST feature was mainly added for debug.

sm3sgp commented 10 years ago

Regarding the high OPEN siglev values, have you tried to play with the SQL_DELAY parameter? What squelch type are you using?

pe1chl commented 10 years ago

We use SQL_DET=SERIAL with a hardware signal from the receiver that indicates sufficient signal level or CTCSS presence. I will further debug the problem that the audio-out appears to be gated by this indication while the config says it isn't. (and set SQL_DELAY to see if that helps) But let's not get distracted from the original issue in the Voter with the ID transmit, this problem with our config was only making that more apparent but it is not the main problem, and that occurs on our other repeater as well. It starts to get ever more annoying once you notice it :-)

pa1okz commented 9 years ago

This topic can be renamed from "small issue" to "major issue" now. On both repeaters, PI2NOS and PI3UTR, the problem occurs almost every time the ID comes during a QSO, which is every 5 minutes. Users are starting to complain seriously about it, since significant pieces of transmissions are lost in the QSOs, which I can understand. I don't want to force the issue, we are all hobbyists, but I hope that a little priority could be given to this issue. Many thanks...

sm3sgp commented 9 years ago

Do you have any recording of this that you can share?

pe1chl commented 9 years ago

I'm not sure what that will do towards solving the problem. It will take time to locate it in a recording, edit it out, find out how to upload the clip here, and what you will hear is nothing more than what is already described in the first (and sometimes the second) posting in this thread. (The audio of the speaking station is replaced by the CW or Phone ID message played at FX_GAIN_NORMAL instead of FX_GAIN_LOW, and it comes back when either the ID message ends or the Voter switches back to the receiver that was active when the ID started.)

sm0svx commented 9 years ago

This issue has high priority but I have not had much time to spend on SvxLink for a while, unfortunately :-/ I'm not too fond of having buggy software out there so this one is definitely high up on the TODO list.

Aren't there any settings you can change to work around the issue until it's fixed? Is it some kind of law that you need to ID every five minutes?

pe1chl commented 9 years ago

Of course we are very happy that you write this software! The issue has become more apparent due to the quick voter switching, we could try reverting that to see if it becomes more tolerable. That voter parameter change has otherwise improved the performance. The SQL_DELAY has fixed the other problem (I still need to check the raw audio, maybe tomorrow, to see what is really wrong).

Indeed we need to ID every 5 minutes (ridiculous for a repeater with special unattended permit on a fixed frequency pair... but I don't make the rules...).

pe1chl commented 9 years ago

Just for information: I have made some captures using the RAW_AUDIO_UDP_DEST feature (apparently something that was added quite recently) and I confirmed the problem of the excessive siglev values to be caused by an obscure problem with the receiver. The workaround of using SQL_DELAY=40 has been working OK for some time now, and it looks like it has to remain in. The receiver is a highly configurable commercial repeater, and although it is configured to not mute the audio and to provide its signal detection on an output pin that we connected to the serial port, in fact it does mute the audio briefly at the end of every valid reception, to return to noise after 100ms or so. This causes the misbehaviour of the Voter when a mobile station is just opening the squelch on that receiver, as the Voter sees very good siglev when it samples in that silence interval after squelch close. But of course, that all has nothing to do with the issue reported in the first and second posting in this thread, that still remains (when the receiver is switched for a valid reason).

pa1okz commented 9 years ago

SM3SGP, you requested an audio file, which I have captured in the meantime from the PI2NOS repeater.

The audio file can be heard over this link: http://yourlisten.com/mischa.vansanten/pi2nossvxlinkidissue

What happens is exactly the thing that we have described:

Hope this helps.

sm0svx commented 9 years ago

I just wanted to tell you that I have now managed to set up a testbench for the voter problem. I can now easily reproduce the problem so it's a first step towards fixing the bug.

pe1chl commented 9 years ago

Tobias Blomberg wrote:

I just wanted to tell you that I have now managed to set up a testbench for the voter problem. I can now easily reproduce the problem so it's a first step towards fixing the bug.

Great!

You can hear our repeaters on: http://icecast.pe1rjv.nl:8000/pi2nos http://icecast.pe1rjv.nl:8000/pi3utr http://icecast.pe1rjv.nl:8000/pi6ten (the pi6ten repeater currently has only 1 active receiver so not suitable for checking this issue)

sm0svx commented 9 years ago

I seem to have found a solution. Please test the "issue97" branch.

Pull request: #110

pe1chl commented 9 years ago

Thank you. I have installed it, let's see!

pa1okz commented 9 years ago

The first practical experiences indicate that the RX audio is no longer muted at receiver switchover during ID, so far so good. Let's further analyse this over the next number of days. Thanks for your effort so far.

Please don't take your testbench apart yet, we will come up with another topic regarding sometimes unexpected receiver switching with weak and strongly fluctuating signals. First some internal discussion within the team is needed to pinpoint it exactly. We will open a separate topic for that with some questions at first.

sm0svx commented 9 years ago

My testbench is based on GNU Radio so it's not much to take apart :-)

The fix also seems to have the side-effect of actually making the switch much smoother. Without the fix there is a small gap of silence between switching receivers. This is also often followed by some audio stutter. This is probably an effect of how the code worked previously. It emptied the audio pipe before switching to the new receiver, so all buffering was lost. This very same thing, waiting for the audio pipe to get empty, was also causing the rx audio muting, since the announcement message needed to finish playing before the audio pipe was considered empty. With the new code, the switch to the new receiver is done instantaneously.
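The difference between the two behaviours can be pictured with a toy model in Python. This is not SvxLink code; it is a simplified sketch that assumes a single pending switch and treats "the announcement is still playing" as "the pipe is not yet empty". The function name and tick-based timing are invented for illustration.

```python
def simulate(switch_tick, id_end_tick, total_ticks, drain_first):
    """Return, per tick, which receiver's audio reaches the transmitter.

    None means the received audio is muted during that tick.
    drain_first=True models the old behaviour (wait for the pipe to empty),
    drain_first=False models the new behaviour (switch at once).
    """
    active = "rx1"
    pending = False
    heard = []
    for tick in range(total_ticks):
        id_playing = tick < id_end_tick
        if tick == switch_tick:
            if drain_first:
                pending = True   # old: hold the switch until the pipe drains
            else:
                active = "rx2"   # new: switch immediately
        if pending and not id_playing:
            active = "rx2"       # pipe finally counts as empty
            pending = False
        heard.append(None if pending else active)
    return heard


old = simulate(switch_tick=2, id_end_tick=5, total_ticks=7, drain_first=True)
new = simulate(switch_tick=2, id_end_tick=5, total_ticks=7, drain_first=False)
print(old)  # ['rx1', 'rx1', None, None, None, 'rx2', 'rx2']
print(new)  # ['rx1', 'rx1', 'rx2', 'rx2', 'rx2', 'rx2', 'rx2']
```

In the old model the received audio stays muted from the switch request until the ID ends, which matches the symptom reported at the top of this thread.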

About your other problem, the signal level detector is designed to estimate the maximum strength of the signal during a time interval. I don't remember exactly how long that interval is though. This will make it ignore shorter dips in the signal strength. I have designed it this way since speech otherwise would affect the estimate.

pe1chl commented 9 years ago

Tobias Blomberg wrote:

My testbench is based on GNU Radio so it's not much to take apart :-)

The fix also seems to have the side-effect of actually making the switch much smoother. Without the fix there is a small gap of silence between switching receivers. This is also often followed by some audio stutter. This is probably an effect of how the code worked previously. It emptied the audio pipe before switching to the new receiver, so all buffering was lost. This very same thing, waiting for the audio pipe to get empty, was also causing the rx audio muting, since the announcement message needed to finish playing before the audio pipe was considered empty. With the new code, the switch to the new receiver is done instantaneously.

Yes I already noticed that! I wanted to comment on that after evaluating the performance a bit more. But it looks good! There would probably be even smoother switching when the audio would be synchronized using timestamps similarly to what we do with co-channel.

About your other problem, the signal level detector is designed to estimate the maximum strength of the signal during a time interval. I don't remember exactly how long that interval is though. This will make it ignore shorter dips in the signal strength. I have designed it this way since speech otherwise would affect the estimate.

Looking in the code, it appears that it measures the SNR over a 25ms interval. What we have been discussing is that it may be a good idea to perform this measurement every 100ms, put the results in a moving average, and then make the decision with a smaller hysteresis.
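To make the idea concrete, here is a rough Python sketch of such a scheme. It is not how the SvxLink Voter is implemented; the class name, the window length and the interpretation of the hysteresis as a percentage of the active receiver's average are all assumptions for illustration.

```python
from collections import deque


class SmoothedVoter:
    """Pick a receiver from moving-averaged signal levels with hysteresis."""

    def __init__(self, receivers, window=5, hysteresis_pct=10.0):
        # One fixed-length history per receiver; old samples fall off the end.
        self.history = {rx: deque(maxlen=window) for rx in receivers}
        self.hysteresis_pct = hysteresis_pct
        self.active = receivers[0]

    def average(self, rx):
        samples = self.history[rx]
        return sum(samples) / len(samples) if samples else 0.0

    def update(self, measurements):
        """Feed one measurement per receiver (e.g. every 100 ms).

        Returns the receiver selected after this update.
        """
        for rx, siglev in measurements.items():
            self.history[rx].append(siglev)
        best = max(self.history, key=self.average)
        # The challenger must beat the active receiver by the hysteresis margin.
        needed = self.average(self.active) * (1 + self.hysteresis_pct / 100)
        if best != self.active and self.average(best) > needed:
            self.active = best
        return self.active


voter = SmoothedVoter(["main", "remote"], window=4, hysteresis_pct=10.0)
for _ in range(3):
    voter.update({"main": 50.0, "remote": 5.0})
print(voter.update({"main": 50.0, "remote": 120.0}))  # single spike: stays on main
print(voter.update({"main": 50.0, "remote": 120.0}))  # sustained: switches to remote
```

A single high reading on the remote receiver no longer wins the vote; it has to stay high for a few measurement intervals before the average crosses the margin.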

Rob

sm0svx commented 9 years ago

It's a little bit more complicated but anyway, open a new issue if you want to discuss other issues. This issue will be closed automatically when I merge the issue97 branch.

pe1chl commented 9 years ago

Yes, it is our intention to do that. For now we are evaluating the performance after the latest change, and it looks good. I had always wondered why there was such a gap when switching receiver even though in the code I saw all the effort done by using FIFO etc, but now it works like it should. In fact it is now very hard to detect the switchovers, which makes it a little more difficult to point out incorrect switchovers than it was before :-) When there is a solution for issue #65 / issue #99 it could be easier to track what is happening. For now it looks like you made a very good change. I have not yet heard a case of ID transmit over a QSO. Thanks!

pa1okz commented 9 years ago

I think this issue can be closed. Since the change, no audio interrupts have been noticed anymore and, indeed, diversity switching now definitely runs smoother as well. Thank you Tobias.

sm0svx commented 9 years ago

Great! I'll merge it.