drowe67 / freedv-gui

GUI Application for FreeDV – open source digital voice for HF radio
https://freedv.org/
GNU Lesser General Public License v2.1

Stop button freezes gui #582

Closed OH1KH closed 1 year ago

OH1KH commented 1 year ago

Just compiled 1.9.4 on my Fedora 37 (LXDE desktop).

Every version is getting better. No more problems with audio configuration, but now the problem is the "Start/Stop" button. Start works fine, but pressing "Stop" really stops the GUI: it freezes. The only way out is to close it from the top right corner and start it again. The decoder keeps running in the background, but the GUI is dead.

Same with Easy Setup / Test PTT. PTT works and the rig transmits. When pressing "Stop Test", transmit stops and the GUI freezes. This time it will not close from the top right corner X; it must be killed from the task list.

tmiw commented 1 year ago

I just tried both scenarios on macOS and it seems to work fine there, FWIW. Is it possible that it has something to do with PulseAudio specifically? Could you try compiling with PortAudio instead and let me know if that works any differently? (./build_linux.sh portaudio)

@barjac, @Tyrbiter, do you see any similar behavior on your end?

Tyrbiter commented 1 year ago

I can start and stop without problems on Fedora 39, it's technically still a beta but is stable for me (except for occasional crashes in gnome-shell as GNOME 45 is still a .0 release in a -1 package). I don't use LXDE so I can't easily try that. Maybe the OP can try running freedv under gdb.

FWIW once Fedora 39 belatedly goes stable (October 31st?) then Fedora 37 goes EOL after 30 days, which means if this is down to library code versions then there will be no package updates after that and everything is therefore officially unsupported. It's a 6 month treadmill on the bleeding Fedora edge :)

Also note that pipewire-0.3.83 is currently referred to as 1.0RC3, so a stable release is pretty close for that and pipewire-pulse. Fedora 37 currently also has the -2 packages for pipewire-0.3.83 and wireplumber-0.4.15-1, which means it should be pretty good and similar to Fedora 39 audio/video behaviour.

OH1KH commented 1 year ago

No change with './build_linux.sh portaudio'. I guess it has something to do with the window system. (Why does stopping something always crash it?)

Fedora 37 is a known choice. When it reaches EOL I will update to 38. That way I avoid the biggest bugs of the latest release (let others make it stable first). Fedora upgrades too fast anyway, and always keeping up with the very latest OS is not my goal. But I have stayed with Fedora because I once started with RedHat 4.3 and that has been the way.

Tyrbiter commented 1 year ago

I don't know much about LXDE, but maybe it will be more stable on Fedora 38.

I am finding Fedora 39 mostly stable, using the beta for a month or so. It is no less stable than Fedora 38 after the first few days. Like you I have been a RH user since somewhere in the RH 4.x days. I moved to Fedora Core in 2003.

tmiw commented 1 year ago

No change with './build_linux.sh portaudio'. I guess it has something to do with the window system. (Why does stopping something always crash it?)

Something else to check is whether you have multiple receive enabled (Tools->Options->Modem). If it is enabled, try disabling it and see whether the behavior improves. There's also a "single threaded" option that may have an impact when checked/unchecked.

If there's no change after messing around with those settings, can you let me know if disabling Hamlib/serial PTT clears up the problem?

OH1KH commented 1 year ago

No change with PTT or other settings changes. Attached are some debug dump texts from the console and the corresponding screenshots (png). I think, if no bells are ringing, we leave this as is because I will upgrade Fedora (to 38) within this year. dumps.zip

Tyrbiter commented 1 year ago

I see you are using netrigctl, so maybe @barjac can comment on your dump text file?

tmiw commented 1 year ago

Hmm, what radio are you using? I've heard of issues with SparkSDR (SDR program used with the Hermes Lite) before, for example, but in that case it looked like a problem on the SparkSDR side.

tmiw commented 1 year ago

BTW I got another report of this happening, this time on Ubuntu with a Xiegu G90. This likely means there's an actual issue, unfortunately, so the next step is probably to figure out what's common (i.e. radio or some other part of the setup).

OH1KH commented 1 year ago

Radio is an ICOM IC7300. Rigctld is started from a script to serve Cqrlog, Wsjtx, Fldigi and FreeDV. No problem with the others. Hamlib is quite new from GitHub. (4.6~git 2023-10-21T21:05:16Z SHA=fb49c0 64-bit)

OH1KH commented 1 year ago

I would start debugging by looking at what happens in the code when the following is printed to the console, because everything stops after the last line (except that the audio decoder is still running):

network_flush called
3:rig_get_freq: elapsed=1ms
3:rig.c(2411):rig_get_freq returning(0)
2:rig_set_freq: elapsed=116ms
2:rig.c(2121):rig_set_freq returning(0)
2:rig.c(1531):rig_close entered
3:rig.c(7967):morse_data_handler_stop entered
3:rig.c(8010):morse_data_handler_stop returning(0)
3:rig.c(7926):async_data_handler_stop entered
3:rig.c(7959):async_data_handler_stop returning(0)
3:network.c(1106):network_multicast_publisher_stop entered
network.c(967): Stopping multicast publisher
3:network.c(1143):network_multicast_publisher_stop returning(0)

This output appears after the "Stop" button is pressed. Unfortunately it is the wrong programming language for me ... :-(

Here is a hint: if I choose "no PTT/CAT control", the Start/Stop button works as it should. That indicates a problem with rigctld communication.

If I start "rigctld -m1" (dummy) in a command console and then use "Hamlib NET rigctl" at 127.0.0.1:4532 with FreeDV, it produces the same halt as with the IC7300. So this is a good way for a developer to test it. I did not try "Test PTT" with the dummy rig, but I assume it halts too because "Start/Stop" halts.

tmiw commented 1 year ago

Hamlib is quite new from GitHub. (4.6~git 2023-10-21T21:05:16Z SHA=fb49c0 64-bit)

I can't duplicate with the latest from https://github.com/Hamlib/hamlib (or with 4.5.5) but if I git checkout fb49c0, I can. This makes me think that it's an issue with that version of Hamlib and not FreeDV per se, especially since per gdb it looks like it's stuck in an infinite loop on this line:

0x00007ffff722d2ea in rig_close (rig=0x7fffb0000c20) at rig.c:1545
1545        while(rs->multicast_publisher_run != 2) hl_usleep(10*1000);

(This is with rigctld -m1 as per above.)

So yeah, I'd suggest pulling the latest Hamlib and trying again. Let us know how that goes!

barjac commented 1 year ago

I have not seen any issues with rigctld in 1.9.4 using hamlib-4.5.5. Hamlib errors can happen occasionally if the rig is turned off and back on but re-starting the daemon and/or freedv will fix it. I have not seen 'Stop' cause a crash though.

# systemctl restart rigctld.service

Assuming rigctld is correctly being started from an enabled systemd unit file on boot.

I do find that a 'click' is often not enough for 'Stop' to work. Rather click/hold for 250ms or so is needed, which I now do each time.

OH1KH commented 1 year ago

Yep!

[saku@hamtpad ~]$ rigctld --version
rigctld Hamlib 4.6~git 2023-10-27T20:11:19Z SHA=7c5d4d 64-bit

Seems to work with FreeDV using Dummy or ic7300, so far so good. Next step is to check that Cqrlog, Qsstv and Fldigi are working too.

Perhaps the while loop in rig_close needs some kind of timeout. Could the same infinite loop happen if rigctld suddenly disappears at a critical point?
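
For illustration only, a bounded version of that kind of polling loop might look like the sketch below. It is not Hamlib code and not necessarily how the Hamlib project addressed the problem. The names wait_for_state and sleep_ms are hypothetical, and an atomic_int stands in for the multicast_publisher_run field seen in the gdb output earlier in the thread. The idea is simply to give up after a fixed time instead of spinning forever when the other thread never reports the expected state.

```c
#define _POSIX_C_SOURCE 200809L /* for nanosleep() */
#include <assert.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical helper (not part of Hamlib): sleep for ms milliseconds. */
static void sleep_ms(int ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long)(ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/*
 * Hypothetical helper (not part of Hamlib): poll *state every poll_ms
 * milliseconds until it equals target, or until roughly timeout_ms
 * milliseconds have been spent sleeping.
 * Returns 0 if the target state was seen, -1 on timeout.
 */
static int wait_for_state(atomic_int *state, int target,
                          int timeout_ms, int poll_ms)
{
    int waited_ms = 0;

    assert(poll_ms > 0); /* a zero step would never reach the timeout */

    while (atomic_load(state) != target) {
        if (waited_ms >= timeout_ms) {
            return -1; /* give up instead of looping forever */
        }
        sleep_ms(poll_ms);
        waited_ms += poll_ms;
    }
    return 0;
}
```

A caller could treat a -1 return as "the other thread never confirmed shutdown", log it, and continue closing the port rather than blocking the GUI. Whether continuing is actually safe inside Hamlib depends on details not covered in this thread.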

tmiw commented 1 year ago

rig_close is Hamlib code and normally should close the serial port pretty quickly. It's possible the Hamlib project was working on another feature that temporarily broke that function; the latest from their GitHub is known for not being 100% stable. I'm surprised that the other programs seemed to work fine, though, since they would all need to call rig_close at some point while exiting.

In any case, I'll go ahead and close this issue for now since it appears resolved. We can reopen if it turns out there's some other issue in the FreeDV code related to this.