Open EastEriq opened 10 months ago
The entire first section in the error message is about a timed-out HTTP connection, no relation whatsoever to serial. The fact that the thing gets stuck for 5 minutes is strongly telling of an HTTP timeout. Moreover, it seems to be a secure connection attempt (SSL).
The RXTX warning (RXTX Warning: Removing stale lock file. /var/lock/LCK..ttyUSB0) reports the very same remedy that last-matlab applies, namely removing the stale LCK..tty file. The fact that an error message fills the LCK..tty file seems like a bug in MATLAB's error reporting code.
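For reference, a stale-lock check of this kind can be sketched as follows. This is only an illustration, not RXTX's or last-matlab's actual code: the function name `lock_is_stale` is hypothetical, and it assumes the lock file stores the owner's PID as ASCII text, as UUCP-style lock files usually do. A file filled with anything else, such as an error message, is treated as stale.

```shell
# Hypothetical helper (not RXTX code): succeeds (exit 0) if the lock file
# exists but its owner is gone or its content is not a PID.
lock_is_stale() {
    lock=$1
    [ -f "$lock" ] || return 1            # no lock file: nothing to remove

    # Read the first whitespace-delimited token; assumed to be the PID.
    pid=""
    read -r pid _rest < "$lock" || true

    case $pid in
        ''|*[!0-9]*) return 0 ;;          # empty or non-numeric: stale
    esac

    # kill -0 only probes the process; /proc covers processes owned by
    # another user, where kill -0 fails with a permission error.
    if kill -0 "$pid" 2>/dev/null || [ -d "/proc/$pid" ]; then
        return 1                          # owner still alive: lock is valid
    fi
    return 0
}

# Example (commented out so that nothing is deleted by accident):
# lock_is_stale /var/lock/LCK..ttyUSB0 && rm -f /var/lock/LCK..ttyUSB0
```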
I can assist with debugging the failing HTTP connection, if it is reproducible. At first sight it looks like an HTTP proxy problem.
No, both the HTTP timeout and the warning on the lock file are mere confounders. They show up all the time and have never interfered with the unit operation. Fixing them would just remove a minor nuisance.
The whole point is the hang and crash after 5 minutes. The beginning of the stack trace (which, I remark, appears after 5 minutes) points to the cause.
Stack Trace (from fault):
[ 0] 0x00007f8e7bbf99dd /usr/local/MATLAB/R2020b/bin/glnxa64/librxtxSerial.so+00035293 Java_gnu_io_RXTXPort_readArray+00000141
[ 1] 0x00007f90f901928e <unknown-module>+00000
I have run extensive stress tests on last11e in the office, with two focusers connected, two cameras connected but off, and the mount on the PCIe card serial port. See https://github.com/EastEriq/LAST_CelestronFocusMotor/blob/mastrolindo/testing/serialopen.sh
Test 1: 1200+ runs, each opening MATLAB in batch mode, executing the following line, and closing:
F=inst.CelestronFocuser('11_1_1'); F.connect; F.Status
Result: only one matlab_crash_dump, but on a seemingly unrelated trace:
[ 0] 0x00007f27f56892d9 /usr/local/MATLAB/R2020b/bin/glnxa64/ddux/ddux_impl/mwddux_impl.so+01405657
[ 1] 0x00007f27f568c49b /usr/local/MATLAB/R2020b/bin/glnxa64/ddux/ddux_impl/mwddux_impl.so+01418395
[ 2] 0x00007f28128ba164 /usr/local/MATLAB/R2020b/bin/glnxa64/libmwmst.so+00975204 _ZN7mwboost4asio6detail9scheduler3runERNS_6system10error_codeE+00001252
[ 3] 0x00007f27f56e7742 /usr/local/MATLAB/R2020b/bin/glnxa64/ddux/ddux_impl/mwddux_impl.so+01791810 _ZN7mwboost4asio10io_context3runEv+00000050
[ 4] 0x00007f281316e482 /usr/local/MATLAB/R2020b/bin/glnxa64/libmwboost_thread.so.1.70.0+00062594
[ 5] 0x00007f2813bfe609 /lib/x86_64-linux-gnu/libpthread.so.0+00034313
[ 6] 0x00007f2813924163 /lib/x86_64-linux-gnu/libc.so.6+01175907 clone+00000067
Test 2: 1000+ runs, each opening MATLAB in -nodesktop mode, executing the following line, and closing:
datetime, F=inst.CelestronFocuser('11_1_1'); F.connect; F.Status, exit
Result: one crash_dump with exactly the same stack trace.
Test 3: 500+ runs, each opening MATLAB in desktop mode, executing the following line, and closing:
Unit=obs.unitCS('00'); Unit.connect; exit
(I've created a special unit configuration '00' which turns on the mount controller, but not the cameras, nor spawns slaves). Result: no crash at all.
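The three tests share the same shape: start MATLAB, run one line, exit, repeat, then count the crash dumps. A minimal sketch of such a loop is given below. It is not the linked serialopen.sh: the function names `count_dumps` and `stress_loop` are hypothetical, and it assumes that crash dumps are written as matlab_crash_dump.* files directly inside the given directory (normally the home directory on Linux).

```shell
# Hypothetical helpers, not taken from serialopen.sh.

# Count the matlab_crash_dump.* files directly inside directory $1.
count_dumps() {
    find "$1" -maxdepth 1 -name 'matlab_crash_dump.*' 2>/dev/null | wc -l | tr -d ' '
}

# stress_loop RUNS DUMP_DIR COMMAND [ARGS...]
# Runs COMMAND the given number of times and prints how many new
# crash dumps appeared in DUMP_DIR during the loop.
stress_loop() {
    runs=$1
    dump_dir=$2
    shift 2
    before=$(count_dumps "$dump_dir")
    i=1
    while [ "$i" -le "$runs" ]; do
        # A failing run must not abort the loop; only the dumps are counted.
        "$@" >/dev/null 2>&1 || true
        i=$((i + 1))
    done
    after=$(count_dumps "$dump_dir")
    echo $((after - before))
}

# Example corresponding to Test 1 (commented out, needs MATLAB and the hardware):
# stress_loop 1200 "$HOME" matlab -batch "F=inst.CelestronFocuser('11_1_1'); F.connect; F.Status"
```

Since the failure in the field shows up as a hang of several minutes, each run could also be wrapped in `timeout` so that a stuck session does not stall the whole loop.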
I therefore suspect that the problem is due to EMI, poor shielding, or something similar on the observatory floor.
Seen on M8 today (also using /dev/ttyS4), but disappeared in the next Matlab session
like, when connecting to a mount or a focuser. We meet this every now and then. So far I always blamed usb-RS232 converters (according to recurring past experiences of this kind). See this document: https://docs.google.com/document/d/1MmgxOs9dY3qVgBY4CaGAwsgn_dcTSemQThe9qvtxj1c
We believed that the cause could be static electricity putting the serial converter chip in a fault mode, and that the cure was the use of usbreset. However, the same problem appears occasionally on computers with a serial-PCI card. Today, I got it on last06e, on attempting to open the mount, which is on /dev/ttyS4.