blumzi / LAST_issues

A place to discuss and manage LAST issues

Matlab hangs for 5 minutes trying to open a serial port and then crashes #16

Open EastEriq opened 5 months ago

EastEriq commented 5 months ago

This happens, for example, when connecting to a mount or a focuser. We run into it every now and then. So far I have always blamed USB-RS232 converters, based on recurring past experiences of this kind. See this document: https://docs.google.com/document/d/1MmgxOs9dY3qVgBY4CaGAwsgn_dcTSemQThe9qvtxj1c

We believed that the cause could be static electricity putting the serial converter chip in a fault mode, and that the cure was the use of usbreset. However, the same problem appears occasionally on computers with a serial-PCI card. Today, I got it on last06e, on attempting to open the mount, which is on /dev/ttyS4.

blumzi commented 5 months ago

The entire first section of the error message is about a timed-out HTTP connection and has no relation whatsoever to serial. The fact that the thing gets stuck for 5 minutes strongly suggests an HTTP timeout. Moreover, it seems to be a secure connection attempt (SSL).

The failure report (RXTX Warning: Removing stale lock file. /var/lock/LCK..ttyUSB0) takes the very same remedial action as last-matlab does, by removing the stale LCK..tty file. The fact that an error message fills the LCK..tty file seems like a bug in matlab's error reporting code.
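For illustration only, here is a minimal Python sketch of what "stale lock file" means in practice. It assumes the lock file holds the owner's PID as ASCII decimal text, as in the common UUCP-style convention for /var/lock/LCK..* files. The function names (`read_lock_pid`, `pid_is_alive`, `lock_is_stale`) are hypothetical and are not taken from RXTX or last-matlab. Treating unparsable content (such as an error message written into the file) as stale is also an assumption.

```python
import os
import tempfile


def read_lock_pid(lock_path):
    """Return the PID stored in a lock file, or None if it cannot be parsed."""
    with open(lock_path, "r", errors="replace") as fh:
        fields = fh.read(64).split()
    if not fields:
        return None
    try:
        pid = int(fields[0])
    except ValueError:
        return None
    # Reject zero and negative values: they are not valid owner PIDs.
    return pid if pid > 0 else None


def pid_is_alive(pid):
    """Check for a process with this PID without disturbing it (signal 0)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def lock_is_stale(lock_path):
    """Assumption: a lock is stale if its PID is unparsable or not running."""
    pid = read_lock_pid(lock_path)
    if pid is None:
        return True
    return not pid_is_alive(pid)


if __name__ == "__main__":
    # Demo on a temporary file, not on a real /var/lock entry.
    with tempfile.NamedTemporaryFile("w", delete=False) as fh:
        fh.write("%10d\n" % os.getpid())
    print(lock_is_stale(fh.name))
    os.unlink(fh.name)
```

A check like this only tells whether the recorded owner still exists; it does not explain why the file was left behind.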

I can assist with debugging the failing HTTP connection, if it is reproducible. At first sight it looks like an HTTP proxy problem.

EastEriq commented 5 months ago

No, both the HTTP timeout and the warning on the lock file are mere confounders. They show up all the time and have never interfered with the unit operation. Treating them would just remove a minor nuisance.

The whole point is the hang and crash after 5 minutes. The beginning of the stack trace (which, I remark, appears after 5 minutes) points to the cause.

Stack Trace (from fault):
[  0] 0x00007f8e7bbf99dd /usr/local/MATLAB/R2020b/bin/glnxa64/librxtxSerial.so+00035293 Java_gnu_io_RXTXPort_readArray+00000141
[  1] 0x00007f90f901928e                                   <unknown-module>+00000

EastEriq commented 4 months ago

I have run extensive stress tests on last11e in the office, with two focusers connected, two cameras connected but off, and the mount on the PCIe card serial port. See https://github.com/EastEriq/LAST_CelestronFocusMotor/blob/mastrolindo/testing/serialopen.sh

I therefore suspect that the problem is to be ascribed to EMI, poor shielding or something similar on the observatory floor.
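As a rough illustration of an open/close stress test with hang detection, here is a minimal Python sketch. It is not the linked serialopen.sh script and makes no claim about what that script does. The names `timed_call` and `stress_open` are hypothetical, and the caller supplies the function that actually opens and closes the port. The worker runs in a daemon thread, so a call stuck in a blocking read is reported rather than waited on for minutes.

```python
import threading
import time


def timed_call(func, timeout_s):
    """Run func() in a daemon thread.

    Returns (finished, elapsed_s, error); finished is False if func()
    was still running when timeout_s expired.
    """
    outcome = {}

    def runner():
        try:
            func()
        except Exception as exc:  # record the failure instead of raising
            outcome["error"] = exc

    worker = threading.Thread(target=runner, daemon=True)
    start = time.monotonic()
    worker.start()
    worker.join(timeout_s)
    elapsed = time.monotonic() - start
    return (not worker.is_alive(), elapsed, outcome.get("error"))


def stress_open(open_and_close, cycles, timeout_s):
    """Repeat open_and_close() up to `cycles` times, stopping at the first hang."""
    summary = {"ok": 0, "failed": 0, "hung_at": None}
    for i in range(cycles):
        finished, _elapsed, error = timed_call(open_and_close, timeout_s)
        if not finished:
            # The stuck thread cannot be killed; just report the cycle index.
            summary["hung_at"] = i
            break
        if error is None:
            summary["ok"] += 1
        else:
            summary["failed"] += 1
    return summary
```

In real use, `open_and_close` might wrap a serial open followed by a close on the device under test (for example with pySerial's `serial.Serial`, which is an assumption here and not something the thread mentions). The returned `hung_at` index then shows how many cycles passed before the first hang.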