nexdome / ASCOM

NexDome ASCOM Driver
https://www.nexdome.com
MIT License

shutter / rotator lose connection or driver crash after hitting stop in ascom device hub #19

Closed markm75 closed 4 years ago

markm75 commented 4 years ago

I've been able to replicate this issue.

If you happen to hit stop during a shutter or rotation motion, the driver or connectivity basically crashes. This is on firmware 3.2.

The only recourse I have found to quickly repair the connection is to push the rotator 3.2 firmware again and cycle the power. Otherwise you can't regain connectivity.

NameOfTheDragon commented 4 years ago

I don't know if there are any specific circumstances that I'm not aware of, but for me hitting STOP in DeviceHub doesn't cause any problems at all. Is there any chance you could record it happening? You could use the Windows Problems Steps Recorder (type 'psr' into the search box on the task bar). PSR produces a zip file which can be attached to this issue as a file attachment. Make sure to record enough context leading up to the issue happening so I can hopefully reproduce it here.

markm75 commented 4 years ago

Attachments: log.log, rotatorcommerror.zip

I was able to reproduce this with just the rotator alone.

At first I told it to go to certain azimuth positions and then hit stop: no error. Then I decided to hit the go to home button and hit stop. This is when it broke (I'm unsure whether that was because it was the third or so attempt or because it was in go-to-home mode). I recorded the whole event. It hung things as usual. I tried toggling the power alone, no good. Then I toggled the USB driver via my batch file and still couldn't reconnect. So the "remote" way of fixing it is to flash the rotator 3.2 hex file to the rotator (with "shutter connected" unchecked); after that I could connect and send it back to park again.

I've included the recording and the log file here (I began testing around 2 pm Eastern, which I think is around 14:00 in the log file).

NameOfTheDragon commented 4 years ago

I can't find any way to make this happen on demand, but I do see the comms stack resetting periodically, seemingly at random. Is it possible that there is no link between what you do and when the comms stop working?

I'm adding a "reboot" command to the firmware so you can force a reboot without having to flash the firmware. It's not great to be flashing the firmware as a recovery step (essentially what that's doing for you is rebooting the Arduino). The command is going to be @ZZR but it's not in any public build just yet.

Aside: you can also reboot the Arduino by connecting to it at 1200 baud. This causes it to reset into bootloader mode, which will time out after 5 seconds.
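The 1200 baud reset can also be scripted. Here is a minimal sketch, assuming the third-party pyserial package is installed and the rotator's port name is known. The function name `touch_1200_baud`, its `open_port` parameter, and the half-second pause are hypothetical choices for illustration, not part of the NexDome driver or firmware.

```python
import time


def _open_with_pyserial(port, baudrate):
    # Imported lazily so this module loads even when pyserial is absent.
    import serial  # third-party: pip install pyserial
    return serial.Serial(port=port, baudrate=baudrate)


def touch_1200_baud(port, open_port=_open_with_pyserial, settle_seconds=0.5):
    """Open `port` at 1200 baud, pause briefly, then close it again.

    `open_port` is injectable so the logic can be exercised without hardware.
    """
    conn = open_port(port, 1200)
    try:
        # Assumed pause; the exact timing a given board needs is a guess.
        time.sleep(settle_seconds)
    finally:
        conn.close()
```

Calling `touch_1200_baud("COM5")` (the port name is only an example) would be followed by a wait longer than the 5 second bootloader timeout before reconnecting at the normal baud rate.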

NameOfTheDragon commented 4 years ago

@markm75 FYI

Diagnostic data suggested that the problem was being caused by the rotator not receiving (or thinking it had not received) the "heartbeat" signal from the shutter, timing out and restarting the XBee communications state machine. This can then take a minute or more to re-establish stable communications.
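To illustrate the failure mode, here is a minimal sketch of a heartbeat watchdog in Python. It is not the firmware's actual code (which runs on the Arduino); the class name `HeartbeatWatchdog`, the timeout value, and the injected clock are all assumptions made for the example.

```python
import time


class HeartbeatWatchdog:
    """Tracks the last heartbeat and reports when the link should restart."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock  # injectable so the timing can be simulated
        self._last_seen = clock()
        self.restarts = 0

    def heartbeat(self):
        # Called whenever a heartbeat arrives from the remote end.
        self._last_seen = self._clock()

    def poll(self):
        """Return True if the timeout expired and a restart was triggered."""
        if self._clock() - self._last_seen <= self.timeout_seconds:
            return False
        self.restarts += 1
        self._last_seen = self._clock()  # restart the timer after the reset
        return True
```

With this design a single lost or late heartbeat is enough to trigger a restart even when the link is otherwise healthy, which would match the seemingly random resets described above.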

I put in some diagnostics to try to further analyze the problem. I also made some minor changes to the state machine itself to make it clearer why it is restarting and to get a faster startup when it does. I have had it running for hours now and not seen a single instance of the problem. So perhaps the changes have inadvertently fixed the problem.

Another factor is that there was an update to the Arduino AVR compiler recently. It's unlikely but possible that the compiler update has fixed some odd bug we may have been hitting. I wouldn't rule this out completely because I have hit other compiler and linker bugs during development of this project.

I am currently monitoring the firmware output directly with PuTTY and capturing the output to a log file. I will leave this running overnight and analyze the logs in the morning to see whether the problem recurs.

NameOfTheDragon commented 4 years ago

Some progress has been made, which required spelunking into the Arduino core code. However, initial results are encouraging. Details are on the Arduino Stack Exchange. After further testing here I will be sending a new firmware build to a few hand-selected alpha testers to see if this resolves their issues.

NameOfTheDragon commented 4 years ago

Feedback from testers suggests that this is fixed, although some occasional shutter disconnects have been reported. In my experience, radio links are prone to occasional dropouts, so I'm putting this down to "normal" radio behaviour unless we get further reports of problems. This fix will be incorporated in firmware release 3.3.0.