blumzi / LAST_issues

A place to discuss and manage LAST issues

Lost connection to M8 #11

Open noralinn opened 6 months ago

noralinn commented 6 months ago

Throughout the night there were many errors, one every few minutes, that were recovered automatically.

Examples:

{inst.XerxesMountBinary} Command error #128: Unknown Copley error code
{inst.XerxesMountBinary} -- error in query: "gr 0xC8  node 0"
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} -- error in query: "s 0xC8 1 node 0"
{inst.XerxesMountBinary} error while attempting to change HA
{inst.XerxesMountBinary} Slewing is complete
...
{inst.XerxesMountBinary} wrong checksum in Copley reply
{inst.XerxesMountBinary} -- error in query: "gr 0xCA  node 32"
{inst.XerxesMountBinary} Slewing is complete

But in the end, while slewing with Unit.Mount.goTo, the mount became completely unreachable (see below). I've had this issue before; it seems to happen once every few nights.

I only called Unit.shutdown afterwards. Parking probably didn't work, so I asked the observers to check the mount position in the morning.

Observe field 23 out of 23 - RA=210.000000, Dec=78.750000, Alt=30.729499
Result without PM:
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} wrong status register reply got from controller
{inst.XerxesMountBinary} wrong status register reply got from controller
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} wrong status register reply got from controller
{inst.XerxesMountBinary} wrong status register reply got from controller
{inst.XerxesMountBinary} fault while converging to target RA
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} wrong status register reply got from controller
{inst.XerxesMountBinary} wrong status register reply got from controller
{inst.XerxesMountBinary} Mount status: unknown
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} cannot read HA motor position
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} cannot read Dec motor position
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} cannot read Dec motor position
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} cannot read HA motor position
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} cannot read Dec motor position
Actual pointing: RA=NaN, Dec=NaN, HA=NaN, LST=140.107743               <-------- my program printed this one
{inst.XerxesMountBinary} wrong checksum in Copley reply
{inst.XerxesMountBinary} -- error in query: "gr 0x32  node 0"
{inst.XerxesMountBinary} wrong checksum in Copley reply
{inst.XerxesMountBinary} -- error in query: "gr 0x32  node 0"
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} cannot read HA motor position
{inst.XerxesMountBinary} wrong checksum in Copley reply
{inst.XerxesMountBinary} -- error in query: "gr 0x32  node 32"
{inst.XerxesMountBinary} wrong checksum in Copley reply
{inst.XerxesMountBinary} -- error in query: "gr 0x32  node 32"
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} cannot read Dec motor position
{inst.XerxesMountBinary} wrong checksum in Copley reply
{inst.XerxesMountBinary} -- error in query: "gr 0x32  node 0"
{inst.XerxesMountBinary} wrong checksum in Copley reply
...
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} wrong status register reply got from controller
{inst.XerxesMountBinary} wrong status register reply got from controller
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} wrong checksum in Copley reply
{inst.XerxesMountBinary} -- error in query: "s 0xA4 7fffffff node 0"
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} wrong checksum in Copley reply
{inst.XerxesMountBinary} -- error in query: "s 0xA4 7fffffff node 32"
Operation terminated by user during binaryRead (line 53)

In binaryQuery (line 81)
                [resp,~,checksumerr]=X.binaryRead(1);

In inst.XerxesMountBinary/clearFaults (line 16)
      X.binaryQuery('s',uint32([164,2^31-1]),X.DecAddress);

In obs.unitCS/readyToExpose (line 101)
                                Unit.Mount.clearFaults;

In getPointingModelTestData (line 57)
        if ~Unit.readyToExpose('Wait',true, 'Timeout',60)

>> Unit.shutdown
{obs.unitCS}   parking the mount...
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} query failed after 2 retries
{inst.XerxesMountBinary} impossible conversion: 0 databytes received
{inst.XerxesMountBinary} wrong status register reply got from controller
{inst.XerxesMountBinary} wrong status register reply got from controller
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} timeout while waiting for a complete response from controller
{inst.XerxesMountBinary} cannot read from serial resource
{inst.XerxesMountBinary} query failed after 2 retries
...
EastEriq commented 6 months ago

Mount 8 is known to suffer much more than the other mounts from EMI on its communication line, caused by the motors when tracking. I think I had an estimate of about 1 corrupted command in every 100, or something like that. This is why we switched to the Copley communication protocol with checksums and implemented retries. Messages of the first kind should therefore be read as warnings: they show that the command succeeded when retried.
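To illustrate the checksum-plus-retry idea, here is a minimal Python sketch. It is not the XerxesMountBinary code: the one-byte XOR checksum, the frame layout, the `make_flaky_link` simulator and the `max_attempts` default are all assumptions made for the example, and the real Copley framing may differ.

```python
class ChecksumError(Exception):
    """Raised when a reply fails its checksum test."""


def xor_checksum(payload: bytes) -> int:
    # Toy checksum: XOR of all payload bytes (assumed, not the Copley spec).
    value = 0
    for byte in payload:
        value ^= byte
    return value


def frame(payload: bytes) -> bytes:
    # Hypothetical frame layout: one checksum byte followed by the payload.
    return bytes([xor_checksum(payload)]) + payload


def unframe(packet: bytes) -> bytes:
    if not packet:
        raise ChecksumError("0 databytes received")
    if xor_checksum(packet[1:]) != packet[0]:
        raise ChecksumError("wrong checksum in reply")
    return packet[1:]


def query_with_retries(send, request: bytes, max_attempts: int = 2):
    """Send a request; retry on a corrupted or missing reply.

    Returns (reply, warnings). A non-empty warnings list with a valid
    reply means the query eventually succeeded.
    """
    warnings = []
    for attempt in range(1, max_attempts + 1):
        try:
            return unframe(send(frame(request))), warnings
        except (ChecksumError, TimeoutError) as err:
            warnings.append(f"attempt {attempt}: {err}")
    raise ConnectionError(f"query failed after {max_attempts} attempts")


def make_flaky_link(corrupt_first: int):
    """Simulated controller that corrupts its first `corrupt_first` replies."""
    calls = {"n": 0}

    def send(packet: bytes) -> bytes:
        calls["n"] += 1
        reply = frame(b"ok:" + unframe(packet))
        if calls["n"] <= corrupt_first:
            # Flip one bit in the last byte, as EMI might.
            reply = reply[:-1] + bytes([reply[-1] ^ 0x01])
        return reply

    return send


reply, warnings = query_with_retries(make_flaky_link(1), b"gr 0x32")
print(reply, warnings)
```

With one corrupted reply the query succeeds on the second attempt and leaves a single warning behind. When every attempt is corrupted, the function raises, which corresponds to the "query failed after 2 retries" situation in the log.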

The second, verbose list of messages is typical of trying to communicate with an offline controller. My guess is that, unless a Unit.Mount.disconnect was deliberately issued, the USB-232 dongle went fishing (also typical of EMI). I'm not sure, but on last08e we are probably using a PCI serial card for exactly that reason. I'll check.
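One rough way to tell the two kinds of log apart automatically is to check whether any query ran out of retries. The Python sketch below assumes exactly that rule. The marker strings are copied from the logs in this issue, but the `classify` function and its three labels are hypothetical, not part of the LAST software.

```python
# Messages that show a single attempt went wrong (taken from the logs above).
WARNING_MARKERS = (
    "wrong checksum in Copley reply",
    "timeout while waiting for a complete response",
    "Command error",
)
# Message that shows all attempts for one query failed.
EXHAUSTED_MARKER = "query failed after"


def classify(lines):
    """Return 'persistent', 'transient' or 'clean' for a list of log lines."""
    exhausted = sum(EXHAUSTED_MARKER in line for line in lines)
    warned = sum(any(m in line for m in WARNING_MARKERS) for line in lines)
    if exhausted:
        return "persistent"  # retries ran out: controller may be offline
    if warned:
        return "transient"   # errors seen, but no query gave up
    return "clean"


first_kind = [
    "{inst.XerxesMountBinary} wrong checksum in Copley reply",
    '{inst.XerxesMountBinary} -- error in query: "gr 0xCA  node 32"',
    "{inst.XerxesMountBinary} Slewing is complete",
]
second_kind = [
    "{inst.XerxesMountBinary} timeout while waiting for a complete response from controller",
    "{inst.XerxesMountBinary} cannot read from serial resource",
    "{inst.XerxesMountBinary} query failed after 2 retries",
]
print(classify(first_kind), classify(second_kind))
```

A single "persistent" verdict does not prove the controller is offline. It only flags the point in the night where the messages stopped being recoverable warnings.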

EastEriq commented 6 months ago

Yes, M8 is connected to the serial port of the card. Maybe these cards also join the fishing party, contrary to what I assumed.

In such cases I suggest trying Unit.Mount.connect again; what it reports may give us a bit of a hint.

EastEriq commented 3 months ago

Simone reported similar issues, with Unit.Mount.Reset failing to reconnect and dumping lots of reports about failed queries like those above. Looking at his MATLAB session, though, there seems to be more going on, such as slaves querying the mount position in parallel and failing, because there were Messenger complaints when he tried to abort with Ctrl-C.