Open mhaberler opened 5 years ago
I had an issue in the recent release with MDNS api calls locking up in certain conditions. This commit should have addressed that, but there may still be lingering issues: https://github.com/pothosware/SoapyRemote/commit/6b4eb5fe85754fa3219376b146870c448c57320a
It's just a guess, but if you were to put a return result here and skip all of the MDNS calls, it might stop locking up: https://github.com/pothosware/SoapyRemote/blob/master/common/SoapyMDNSEndpointApple.cpp#L252
At least that would point to something in this file as opposed to somewhere else.
returning a null result for now like so:
std::map<std::string, std::map<int, std::string>> SoapyMDNSEndpoint::getServerURLs(const int ipVer, const long timeoutUs)
{
    SoapyMDNSBrowseResult result;
    result.ipVerRequest = ipVer;
    return result.serverURLs;
    ...
SoapySDRUtil --find works reliably with the patched libremoteSupport.so installed - no more hangs
PothosFlow not so much - sometimes a topology containing plotter widgets displays signals, sometimes not - even with libremoteSupport.so removed, and .conf/Pothos cleaned
reloading the plugins does help on and off
PothosFlow does exit without a hang (it used to with the unpatched libremotesupport.so)
Thanks for testing. The problem is that I can't replicate this. If you are still willing to try things, I think that one of the calls to DNSServiceProcessResult is probably hanging.
A little print before and after would confirm that. It's possible that they all need a wait-with-timeout change just like the previous commit that I mentioned. I'm just trying to figure this out; I don't think their API or examples mentioned this...
Ex:
- else DNSServiceProcessResult(sdRef)
+ else {
+     printf("here before\n");
+     DNSServiceProcessResult(sdRef);
+     printf("here after\n");
+ }
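For reference, a wait with timeout in front of a blocking call can be sketched roughly as below. This is only an illustration, not the code from the commit mentioned above: waitReadable is a hypothetical helper name, and the commented call site assumes the socket descriptor is obtained with DNSServiceRefSockFD from dns_sd.h.

```cpp
#include <sys/select.h>
#include <unistd.h>

// Hypothetical helper (not part of SoapyRemote): returns true if fd becomes
// readable within timeoutUs microseconds, false on timeout or select() error.
static bool waitReadable(const int fd, const long timeoutUs)
{
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    struct timeval tv;
    tv.tv_sec = timeoutUs / 1000000;
    tv.tv_usec = timeoutUs % 1000000;

    // select() blocks for at most tv, so the caller can never hang forever
    const int ret = select(fd + 1, &readfds, nullptr, nullptr, &tv);
    return ret > 0 && FD_ISSET(fd, &readfds);
}

// Assumed use at the call site discussed above:
//   if (waitReadable(DNSServiceRefSockFD(sdRef), timeoutUs))
//       DNSServiceProcessResult(sdRef);
```

With a guard like this, DNSServiceProcessResult would only be entered when there is data pending on the socket, so a missing reply shows up as a timeout instead of a hang.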
PothosFlow not so much - sometimes a topology containing plotter widgets displays signals, sometimes not - even with libremoteSupport.so removed, and .conf/Pothos cleaned reloading the plugins does help on and off
Not sure, might be unrelated to the first issue. Is any particular hardware involved? Does it only happen with a SoapySDR source block, and is it fine otherwise? Do you have a minimal topology that shows the problem?
on the case
Q: I see https://github.com/pothosware/SoapySDR/commit/e6456d54b5494ce346e7749ed355d16b84140246
do I read this as 'Python 3.7' not supported? maybe that is part of the story
I use Python 3.7.2, which came with brew - I wonder how other versions actually get installed on OSX (macports? I'd rather just have one package manager)
Python 3.7 works. It's just that no Python interpreter > 3.2 will be automatically configured and no libs > 3.4 will be automatically found.
When you configure the SoapySDR project with cmake
the interesting lines are:
-- PYTHON3_EXECUTABLE: …
-- PYTHON3_INSTALL_DIR: …
-- PYTHON3LIBS_FOUND: …
-- PYTHON3_INCLUDE_DIRS: …
-- PYTHON3_LIBRARIES: …
But that's all just support for Python. The SoapySDR apps don't use it.
@guruofquality should I build against master or pothos-0.6.1 ?
I have had a lot of trouble getting the correct python devel libs for homebrew. This could probably be done better. The python3 directory is a hack to make source installs easy on those dual python version linux boxes. In practice, the python/ directory is a standalone cmake project that can build for any version of python, as many as you want. The brew recipe could probably call that with whatever specific variables they provide for the correct python interpreter (or just whatever 'which python3' reports).
should I build against master or pothos-0.6.1
The release is probably best, but it's not a dependency for just the soapy remote issue here.
I have Pothos, SoapyRemote, SoapySDR, SoapyRTLSDR debug builds from master on two debian stretch and an OSX host, all three have RTL-SDR sticks, the OSX host also has a LimeSDR Mini (which I did not use here); and a Redpitaya with just SoapyRemote/SoapySDR
test is a Soapy source linked to a Spectrogram widget.
These combinations work:
linuxA gui -> local RTLSDR
linuxA gui -> linuxB RTLSDR via SoapySDRserver and vice versa
linuxA gui -> OSX RTLSDR via SoapySDRserver
OSX gui -> local RTLSDR *)
OSX gui -> linuxA/B RTLSDR via SoapySDRserver *)
*): OSX: on startup, all the RTLSDR remotes, and the local RTLSDR+LimeSDR on OSX appear - after a while only the local SDR's are shown, the remote RTLSDR's on Linux disappear. I saw them reappearing and disappearing several times. I guess this has nothing to do with the issue at hand though. I can switch between local and remote RTLSDR's back and forth just fine, spectrogram showing noise and signal as expected.
On clean PothosFlow startup on OSX (.config/Pothos/* cleaned) all this works until I select the Host Explorer tab and click on the localhost entry:
the log window shows:
[10:45:10.571000] SoapySDR: Configured receiver endpoint: dgram=1452 bytes, 714 elements @ 2 bytes, window=16 KiB
[10:48:57.153000] SoapySDR: SoapySSDPEndpoint::sendTo(udp://239.255.255.250:1900) = -1 sendto(udp://239.255.255.250:1900) [1: Operation not permitted]
[10:48:57.154000] SoapySDR: SoapySSDPEndpoint::sendTo(udp://239.255.255.250:1900) = -1 sendto(udp://239.255.255.250:1900) [1: Operation not permitted]
[10:48:57.447999] PothosFlow.SystemInfoTree: Failed to query system info tcp://[::1] - I/O error: RemoteProxyEnvironment::recvDatagram(): recvDatagram(): stream end
[10:48:58.239083] PothosFlow.EnvironmentEval: zone[]: I/O error: RemoteProxyEnvironment::recvDatagram(): recvDatagram(): stream end - Remote host tcp://[::1] is offline
the terminal window I started PothosFlow from shows:
2019-03-19 11:48:57 SoapySDR: SoapySSDPEndpoint::sendTo(udp://239.255.255.250:1900) = -1
sendto(udp://239.255.255.250:1900) [1: Operation not permitted]
2019-03-19 11:48:57 SoapySDR: SoapySSDPEndpoint::sendTo(udp://239.255.255.250:1900) = -1
sendto(udp://239.255.255.250:1900) [1: Operation not permitted]
2019-03-19 11:48:57 PothosFlow.SystemInfoTree: Failed to query system info tcp://[::1] - I/O error: RemoteProxyEnvironment::recvDatagram(): recvDatagram(): stream end
2019-03-19 11:48:58 PothosFlow.EnvironmentEval: zone[]: I/O error: RemoteProxyEnvironment::recvDatagram(): recvDatagram(): stream end - Remote host tcp://[::1] is offline
From then on the GUI hangs in the sense that the test cannot be activated anymore.
I think what happens is that the PothosUtil process(es) forked from PothosFlow die on OSX when using the Host Explorer tab - Pothos processes running as long as things are fine:
BigMac-1978:build mah$ ps|grep Pot
6277 ttys004 0:00.00 grep Pot
6272 ttys006 0:00.00 /bin/bash /usr/local/mahdev/bin/PothosFlow
6273 ttys006 0:02.01 /usr/local/mahdev/bin/PothosFlow.app/Contents/MacOS/PothosFlow
6274 ttys006 0:00.47 /usr/local/mahdev/bin/PothosUtil --require-active --proxy-server=tcp://[::1]:16415
6275 ttys006 0:01.09 /usr/local/mahdev/bin/PothosUtil --require-active --proxy-server=tcp://[::]
After host-exploring localhost the two background PothosUtil processes become defunct:
BigMac-1978:build mah$ ps|grep Pot
6282 ttys004 0:00.00 grep Pot
6272 ttys006 0:00.00 /bin/bash /usr/local/mahdev/bin/PothosFlow
6273 ttys006 0:07.15 /usr/local/mahdev/bin/PothosFlow.app/Contents/MacOS/PothosFlow
6274 ttys006 0:00.00 (PothosUtil)
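Assuming the parenthesized "(PothosUtil)" entry is a zombie, i.e. a child that has already exited but has not yet been waited for by its parent, the mechanism can be sketched as follows. This is a generic POSIX illustration, not code from Pothos; spawnAndReap is a hypothetical name.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical illustration: forks a child that exits immediately with
// exitCode, then reaps it. Between the child's exit and the waitpid() call
// the child is a zombie, which is the state ps reports as defunct.
// Returns the child's exit code, or -1 on failure.
static int spawnAndReap(const int exitCode)
{
    const pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) _exit(exitCode); // child: terminate right away

    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1; // parent: reap the child
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If that assumption holds, the defunct entry says the child process has already terminated and PothosFlow has simply not collected its exit status yet.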
It looks like the SoapySDR stuff is not the culprit, but rather PothosUtil. The following could be a hint:
I ran a PothosUtil proxy on OSX and host-explored it from a Linux GUI:
BigMac-1978:build mah$ PothosUtil --proxy-server=tcp://172.16.0.247:16415
Host: 172.16.0.247
Port: 16415
As soon as the linux PothosFlow host explore starts, the OSX PothosUtil catches an exception:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
I/O error [in file "/Users/mah/pothos/PothosCore/poco/Foundation/src/ErrorHandler.cpp", line 46]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Looks like this is what causes the background PothosUtil to die.
Running this under lldb:
BigMac-1978:build mah$ lldb PothosUtil
(lldb) target create "PothosUtil"
Current executable set to 'PothosUtil' (x86_64).
(lldb) b ErrorHandler.cpp:46
Breakpoint 1: where = libPocoFoundationd.48.dylib`Poco::ErrorHandler::exception(std::exception const&) + 16 at ErrorHandler.cpp:46, address = 0x0000000000061910
(lldb) r --proxy-server=tcp://172.16.0.247:16415
Process 6296 launched: '/usr/local/mahdev/bin/PothosUtil' (x86_64)
2019-03-19 12:05:00.713611+0100 PothosUtil[6296:1825663] [si_destination_compare] send failed: Invalid argument
2019-03-19 12:05:00.713646+0100 PothosUtil[6296:1825663] [si_destination_compare] send failed: Undefined error: 0
Host: 172.16.0.247
Port: 16415
Process 6296 stopped
* thread #30, stop reason = breakpoint 1.1
frame #0: 0x00000001031d4910 libPocoFoundationd.48.dylib`Poco::ErrorHandler::exception(this=0x00000001040000a0, exc=0x0000000103702580) at ErrorHandler.cpp:46
43
44 void ErrorHandler::exception(const std::exception& exc)
45 {
-> 46 poco_debugger_msg(exc.what());
47 }
48
49
Target 0: (PothosUtil) stopped.
(lldb) bt
* thread #30, stop reason = breakpoint 1.1
* frame #0: 0x00000001031d4910 libPocoFoundationd.48.dylib`Poco::ErrorHandler::exception(this=0x00000001040000a0, exc=0x0000000103702580) at ErrorHandler.cpp:46
frame #1: 0x00000001031d4a4e libPocoFoundationd.48.dylib`Poco::ErrorHandler::handle(exc=0x0000000103702580) at ErrorHandler.cpp:74
frame #2: 0x0000000102f1583b libPocoNetd.48.dylib`Poco::Net::TCPServerConnection::start(this=0x000000010372db20) at TCPServerConnection.cpp:53
frame #3: 0x0000000102f162c1 libPocoNetd.48.dylib`Poco::Net::TCPServerDispatcher::run(this=0x00000001037282b0) at TCPServerDispatcher.cpp:120
frame #4: 0x00000001033296bb libPocoFoundationd.48.dylib`Poco::PooledThread::run(this=0x0000000104500000) at ThreadPool.cpp:200
frame #5: 0x000000010332603a libPocoFoundationd.48.dylib`Poco::(anonymous namespace)::RunnableHolder::run(this=0x0000000104500a70) at Thread.cpp:57
frame #6: 0x000000010332428b libPocoFoundationd.48.dylib`Poco::ThreadImpl::runnableEntry(pThread=0x0000000104500038) at Thread_POSIX.cpp:349
frame #7: 0x00007fff61529305 libsystem_pthread.dylib`_pthread_body + 126
frame #8: 0x00007fff6152c26f libsystem_pthread.dylib`_pthread_start + 70
frame #9: 0x00007fff61528415 libsystem_pthread.dylib`thread_start + 13
(lldb)
unsure how to go from here.
ps: unrelated - the redpitaya SoapySDRServer never shows up in a Linux or OSX Soapy Source dropdown, despite being queried
here's a screen recording: http://static.mah.priv.at/private/pothos-bug.mov
and the test case: http://static.mah.priv.at/private/mac-remote-rtlsdr.pothos
wireshark capture (port 16415 exchange between Linux PothosFlow and OSX PothosUtil until first exception): http://static.mah.priv.at/private/linux-pothosflow-vs-osx-pothosutil-tcp-port16415.pcapng
ps: unrelated - the redpitaya SoapySDRServer never shows up in a Linux or OSX Soapy Source dropdown, despite being queried
There is just no discovery protocol, or maybe there is and it was left out: https://github.com/pothosware/SoapyRedPitaya/issues/2
So the use case is to manually specify the "addr": https://github.com/pothosware/SoapyRedPitaya/wiki#probing-soapy-red-pitaya
do I read this as 'Python 3.7' not supported? maybe that is part of the story
I tested this and it automatically finds the right development files for python3:
-- PYTHON3INTERP_FOUND: TRUE
-- PYTHON3_EXECUTABLE: /usr/local/bin/python3
-- PYTHON3_INSTALL_DIR: ${prefix}/lib/python3.7/site-packages
-- Found Python3Libs: -L/usr/local/opt/python/Frameworks/Python.framework/Versions/3.7/lib/python3.7/config-3.7m-darwin -lpython3.7m -ldl -framework CoreFoundation
-- PYTHON3LIBS_FOUND: TRUE
-- PYTHON3_INCLUDE_DIRS: /usr/local/Cellar/python/3.7.2_2/Frameworks/Python.framework/Versions/3.7/include/python3.7m;/usr/local/Cellar/python/3.7.2_2/Frameworks/Python.framework/Versions/3.7/include/python3.7m
-- PYTHON3_LIBRARIES: -L/usr/local/opt/python/Frameworks/Python.framework/Versions/3.7/lib/python3.7/config-3.7m-darwin -lpython3.7m -ldl -framework CoreFoundation
-- CMAKE_SWIG_FLAGS=-c++;-threads
It looks like the SoapySDR stuff is not the culprit, but rather PothosUtil. The following could be a hint:
@mhaberler thanks for the info, it's going to take me a bit to replicate. But that's going to be a fun one to track down, especially if the crash is caused by whatever it is reporting when the remote explorer requests the info and I can't get into the same "bad state" or whatever it may be.
Going back to the SoapyRemote issue: there is actually still an MDNS bug here, an OSX API call locking up unexpectedly, and that's a totally separate, still open issue, correct?
I can try with a debug build to get a better backtrace if needed
Actually yes please, I think running lldb on the PothosUtil was a great idea, there may be some debug prints as well.
I saw them reappearing and disappearing several times. I guess this has nothing to do with the issue at hand though.
So this is why I think we want the MDNS stuff fixed: it's actually how it finds the servers, so it might be related. SSDP is another fallback I implemented for the same purpose, and it looks like it's not able to bind on ipv4: SoapySSDPEndpoint::sendTo(udp://239.255.255.250:1900)
So it's hanging on by a thread here, but it's working. :-P
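To check the SSDP send outside of PothosFlow, a standalone probe along these lines might help. It is a minimal sketch, not SoapyRemote's SoapySSDPEndpoint code: trySsdpSend and SendResult are hypothetical names, and only the multicast address and port are taken from the log lines above.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical result type for the probe below.
struct SendResult
{
    long ret; // bytes sent, or -1 on failure
    int err;  // errno captured at the point of failure, 0 on success
};

// Sends one UDP datagram to the SSDP multicast group seen in the log
// (239.255.255.250:1900) and reports the sendto() result and errno.
static SendResult trySsdpSend(const std::string &payload)
{
    const int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0) return {-1, errno};

    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(1900);
    inet_pton(AF_INET, "239.255.255.250", &addr.sin_addr);

    const ssize_t ret = sendto(sock, payload.data(), payload.size(), 0,
        reinterpret_cast<const sockaddr *>(&addr), sizeof(addr));
    const int err = (ret < 0) ? errno : 0; // capture before close() can change it
    close(sock);
    return {static_cast<long>(ret), err};
}
```

Printing std::strerror(result.err) after a failed call allows a direct comparison with the "[1: Operation not permitted]" text in the log; if the probe fails the same way, the cause is more likely in the host's network setup than in SoapyRemote.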
so - piecemeal:
a) in the PothosUtil standalone case, the 'I/O error [in file ...ErrorHandler.cpp", line 46]' throw originates from an EOF condition here: https://github.com/pothosware/PothosCore/blob/master/lib/Remote/RemoteProxyDatagram.cpp#L111
that seems not fatal yet - I can run the test topo on a Linux GUI and set affinity for the SoapySDRSource to OSX, and that works fine
b) in the 'PothosUtil started from PothosFlow' case I find it very difficult to debug - I can attach, but shortly thereafter the process exits with status 0, see below
Is there a way I can run PothosUtil standalone under the debugger and have PothosFlow connect to that already running PothosUtil instead of starting its own? Or at least crank up logging?
BigMac:~ mah$ ps
...
1495 ttys000 0:00.00 /bin/bash /usr/local/mahdev/bin/PothosFlow
1496 ttys000 0:01.41 /usr/local/mahdev/bin/PothosFlow.app/Contents/MacOS/PothosFlow
1497 ttys000 0:00.47 /usr/local/mahdev/bin/PothosUtil --require-active --proxy-server=tcp://[::1]:16415
...
BigMac:~ mah$ lldb -p 1497
(lldb) process attach --pid 1497
Process 1497 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
frame #0: 0x00007fff662bbd9a libsystem_kernel.dylib`__sigwait + 10
libsystem_kernel.dylib`__sigwait:
-> 0x7fff662bbd9a <+10>: jae 0x7fff662bbda4 ; <+20>
0x7fff662bbd9c <+12>: movq %rax, %rdi
0x7fff662bbd9f <+15>: jmp 0x7fff662b4381 ; cerror
0x7fff662bbda4 <+20>: retq
Target 0: (PothosUtil) stopped.
Executable module set to "/usr/local/mahdev/bin/PothosUtil".
Architecture set to: x86_64h-apple-macosx.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
* frame #0: 0x00007fff662bbd9a libsystem_kernel.dylib`__sigwait + 10
frame #1: 0x00007fff6637174c libsystem_pthread.dylib`sigwait + 52
frame #2: 0x00000001074b1d2a libPocoUtild.48.dylib`Poco::Util::ServerApplication::waitForTerminationRequest(this=0x00007ffee88776d8) at ServerApplication.cpp:585
frame #3: 0x00000001073c5b90 PothosUtil`PothosUtilBase::proxyServer(this=0x00007ffee88776d8, (null)="proxy-server", uriStr="tcp://[::1]:16415") at PothosUtilProxyServer.cpp:144
frame #4: 0x0000000107393616 PothosUtil`Poco::Util::OptionCallback<PothosUtil>::invoke(this=0x00007fe733433aa0, name="proxy-server", value="tcp://[::1]:16415") const at OptionCallback.h:92
frame #5: 0x00000001074583f4 libPocoUtild.48.dylib`Poco::Util::Application::handleOption(this=0x00007ffee88776d8, name="proxy-server", value="tcp://[::1]:16415") at Application.cpp:533
frame #6: 0x0000000107391958 PothosUtil`PothosUtil::handleOption(this=0x00007ffee88776d8, name="proxy-server", value="tcp://[::1]:16415") at PothosUtil.cpp:155
frame #7: 0x00000001074535c3 libPocoUtild.48.dylib`Poco::Util::Application::processOptions(this=0x00007ffee88776d8) at Application.cpp:405
frame #8: 0x0000000107451614 libPocoUtild.48.dylib`Poco::Util::Application::init(this=0x00007ffee88776d8) at Application.cpp:170
frame #9: 0x000000010744feee libPocoUtild.48.dylib`Poco::Util::Application::init(this=0x00007ffee88776d8, argc=3, argv=0x00007ffee88778f8) at Application.cpp:135
frame #10: 0x00000001074b1d9d libPocoUtild.48.dylib`Poco::Util::ServerApplication::run(this=0x00007ffee88776d8, argc=3, argv=0x00007ffee88778f8) at ServerApplication.cpp:601
frame #11: 0x000000010738b152 PothosUtil`main(argc=3, argv=0x00007ffee88778f8) at PothosUtil.cpp:216
frame #12: 0x00007fff6617aed9 libdyld.dylib`start + 1
frame #13: 0x00007fff6617aed9 libdyld.dylib`start + 1
(lldb) b exit
Breakpoint 1: 6 locations.
(lldb) c
Process 1497 resuming
Process 1497 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x00007fff662241a7 libsystem_c.dylib`exit
libsystem_c.dylib`exit:
-> 0x7fff662241a7 <+0>: pushq %rbp
0x7fff662241a8 <+1>: movq %rsp, %rbp
0x7fff662241ab <+4>: pushq %rbx
0x7fff662241ac <+5>: pushq %rax
Target 0: (PothosUtil) stopped.
(lldb) c
Process 1497 resuming
(lldb)
error: Process is running. Use 'process interrupt' to pause execution.
(lldb) c
error: Process is running. Use 'process interrupt' to pause execution.
Process 1497 exited with status = 0 (0x00000000)
(lldb)
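Regarding the EOF condition from item a) above: a "stream end" error of that kind typically arises when the peer closes the connection while the receiver still expects more bytes. A generic sketch of such a check on a plain socket follows; it is not the code in RemoteProxyDatagram.cpp, and recvExact is a hypothetical helper.

```cpp
#include <sys/socket.h>
#include <unistd.h>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: reads exactly len bytes from sock and throws if the
// peer closes the connection first. Interrupted calls (EINTR) are not
// retried, to keep the sketch short.
static void recvExact(const int sock, char *buf, const size_t len)
{
    size_t got = 0;
    while (got < len)
    {
        const ssize_t n = recv(sock, buf + got, len - got, 0);
        if (n == 0) throw std::runtime_error("stream end"); // orderly close by peer
        if (n < 0) throw std::runtime_error("recv failed");
        got += static_cast<size_t>(n);
    }
}
```

Seen this way, the exception on its own only says that the other side hung up, which fits the observation that it is not fatal in the standalone case.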
OSX Mojave 10.14.3 (18D109), brew 2.0.4, installed as per https://github.com/pothosware/homebrew-pothos/wiki
SoapyRemote not yet installed:
After
brew install soapyremote
(or install from source at a6fbf2db4, no difference):
Running lldb on the hanging SoapySDRUtil process:
I can try with a debug build to get a better backtrace if needed
Michael