mworion / MountWizzard4

Amateur astronomy imaging support tool with special support for 10micron mounts.
Apache License 2.0

Intermittent startup Failure #342

Closed JDAstroPhoto closed 9 months ago

JDAstroPhoto commented 9 months ago

Describe the bug 70% of the time MW4 will fail to bring up the menus and dies. 30% of the time it will come up correctly with menus and function correctly.

To Reproduce This has been happening to me since I started using MW4. It isn't related to this build, but it seems to be more frequent now. I end up wasting time to keep trying to bring up MW4. It looks like when it doesn't come up it dies at qtmount.py][ 408]

I've added 2 logs, a successful log and an unsuccessful log. Please see attached. mw4-2024-01-04 -Dies.log mw4-2024-01-04 -Works.log

mworion commented 9 months ago

Hi, sorry to hear about it. I tried to reproduce your issue on a Win10 64-bit system with Python 3.10.9, running like yours. Unfortunately I cannot reproduce it myself, neither with the mount connected nor unconnected. The test scripts for that each ran 25 times without a crash.

The exit code in your log is a C-related one (it might be related to Python or something underlying). The part missing from qtmount is simply the next steps, which add more system information once the mount connects. This tells us that the crash happened before the mount was connected. You might try to run MW4 manually from a terminal after you cd to your workdir:

python startup.pyz

There you should see a little bit more of the output. If possible, please tell me a little bit more: how do you start MW4, what does your PC system look like, and do you connect through WiFi or wired?

Michel
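
For readers following along: a "C related" exit code on Windows is often a negative number that is easier to read as an unsigned 32-bit NTSTATUS value. The sketch below shows one way to decode such a value. The actual exit code from the attached log is not reproduced in this thread, so the helper name `describe_exit_code` and the small lookup table are illustrative assumptions, not part of MW4.

```python
# Minimal sketch (not MW4 code): interpret a Windows process exit code.
# Only a few well-known NTSTATUS values are listed; the table is not exhaustive.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def describe_exit_code(code: int) -> str:
    """Return a readable description of a (possibly negative) exit code."""
    # Windows reports the code as a signed 32-bit integer in many tools;
    # masking converts it to the unsigned form used in documentation.
    unsigned = code & 0xFFFFFFFF
    name = KNOWN_NTSTATUS.get(unsigned)
    if name is not None:
        return f"0x{unsigned:08X} ({name})"
    if unsigned >= 0xC0000000:
        # Severity bits "11" mark an NTSTATUS error, typical for native crashes.
        return f"0x{unsigned:08X} (unlisted NTSTATUS error)"
    return f"{code} (ordinary exit code)"
```

A value in the `0xC...` range usually points to a crash in native code below the Python layer, which matches the description of the exit code as C-related.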

JDAstroPhoto commented 9 months ago

Ok, your hint on wired vs wireless pointed me in the right direction. I have two different networks that I use, one for remote imaging in Bortle 1-2 locations from my Van, the other is my home location.
I just tested my "Van network" LAN from the "SimplyNUC LLNCRFv7 Chimney Rock PorCoolPine (Windows 10)" computer to the mount and also LAN to my remote router. I also wired (LAN) from my laptop to the remote router.
I use Windows Remote Desktop from my laptop (Windows 11) to login to the SimplyNUC LLNCRFv7 Chimney Rock PorCoolPine. Tests:

  1. "Van network", Everything LAN works every time, 100% success rate.
  2. "Van network", LAN from SimplyNUC to mount, LAN from SimplyNUC to remote Router, Wireless (WIFI) to Dell Windows 11 Laptop. Works every time, 100% success.
  3. "Home network", LAN from SimplyNUC to mount, Home WIFI network from SimplyNUC to home network, Home WIFI network from Dell Windows 11 laptop to remote login to SimplyNUC. Very large failure rate to bring up MW4.

Ok, with your help I now know what it is. It is somehow related to my home network, WIFI. I can figure it out from here. Please close this ticket as solved!
Thanks again, I really appreciate it.

mworion commented 9 months ago

Many thanks for the tests, and good that you found a config that works for you. With the hints you have given, I will try to configure a setup that also produces a high failure rate. Hopefully I will be able to find some workarounds on the application side to avoid these critical situations. I am closing the issue, but please add more info to this issue thread as you gain experience. Michel

JDAstroPhoto commented 9 months ago

From what I gathered last night, I was able to replicate the issue. Whenever the SimplyNUC computer was on WIFI to my home network, the menus would not come up, the program would crash. When the SimplyNUC computer was LAN to my home network, it worked every time.
In both instances I used my laptop on home WIFI to Windows Remote Desktop to login to the SimplyNUC and run MW4 on SimplyNUC.
Observations:

  1. Crashes don't seem to have anything to do with whether the Laptop is on WIFI or LAN to the home network.
  2. Crashes happen when SimplyNUC computer (computer that is LAN to 10micron mount) is on WIFI to home network. They do not happen when SimplyNUC is on LAN to home network.

Summary:

  1. SimplyNUC has WIFI and 2 ethernet ports. 1 Gbps and 2.5 Gbps.
  2. Standard configuration is: "I ALWAYS LAN from 1 Gbps SimplyNUC to 10 micron mount."
  3. 2 choices exist to home network on SimplyNUC, WIFI or 2.5 Gbps LAN.
  4. When SimplyNUC on WIFI to home network, intermittent failures and menu crashes. Once in a while, the initial menu would come up, but when I would load my configuration file, the menus would crash.
  5. When SimplyNUC on 2.5 Gbps LAN to home network, stable, no issues.

Hypothesis:

  1. Maybe some kind of handshake timeout because the WIFI takes longer to respond from and to the SimplyNUC.
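
One way to probe the timeout hypothesis above is to time a plain TCP connect from the SimplyNUC to the mount on each network setup. This is only a rough sketch: the function name `time_tcp_connect` is made up for illustration, and the host and port in the usage comment are placeholders that must be replaced with the mount's real settings.

```python
import socket
import time


def time_tcp_connect(host, port, timeout=3.0):
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.perf_counter()
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (including timeouts and refused connections) if it cannot complete.
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        return None


# Usage (placeholder address and port, both assumptions):
#   for _ in range(10):
#       print(time_tcp_connect("192.168.2.15", 3492))
```

Comparing the numbers with WIFI enabled and disabled would show whether the handshake to the mount is actually slower or failing, or whether the crash is unrelated to connect times.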

mworion commented 9 months ago

Ok, I uploaded beta11. Just to check: there are reports of instabilities in the underlying TCP stack when two networks are active at the same time. I tried to make it a little bit more robust. Secondly, MW4 works with IPv4 addresses. I changed some loopback settings to an IPv4 address to rule out the possibility of being switched onto IPv6. Could you check whether switching between your settings also does something like switching from an IPv4 to an IPv6 address? Michel
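
A quick way to check the IPv4/IPv6 question on the machine itself is to look at what a name such as `localhost` resolves to under each network setup. The following is a minimal sketch using only the standard library; `resolve_families` is a hypothetical helper name, not something MW4 provides.

```python
import socket


def resolve_families(hostname="localhost"):
    """Return a list of (family label, address) pairs for a host name."""
    labels = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    result = []
    # getaddrinfo returns the addresses in the order the OS would prefer them.
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        hostname, None, type=socket.SOCK_STREAM
    ):
        entry = (labels.get(family, str(family)), sockaddr[0])
        if entry not in result:
            result.append(entry)
    return result


# Usage: run once with WIFI enabled and once with it disabled, then compare.
#   print(resolve_families("localhost"))
#   print(resolve_families(socket.gethostname()))
```

If an IPv6 entry such as `::1` appears first in one setup but not in the other, that would support the idea that the address family changes with the network configuration.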

JDAstroPhoto commented 9 months ago

I have IPv6 set to "off" on my hardware firewall (Firewalla) and IPv6 interface type set to "none" on my Unifi WIFI access point. I also have IPv6 "off" on my RAID servers.

JDAstroPhoto commented 9 months ago

Michel, Did you want me to test beta11? Also, can you point me to where the beta releases are? I can only find the formal releases. Thanks

JDAstroPhoto commented 9 months ago

Never mind, I found it on PyPI: mountwizzard4 3.2.6b11

mworion commented 9 months ago

Sorry, if you enable the checkbox "show beta versions" in the internal updater, you will see them...

JDAstroPhoto commented 9 months ago

I guess I am a little confused: the directories and files for mountwizzard4-3.2.6b11 are completely different from what I am used to. Attached is a screen grab of the directories and files I am used to. When I upgrade to a new official MW4, I copy off all my config files, models and images. Then I wipe the directories and add the new files from the latest download. I then add back my images, config files, and models.
See the attached screen grab of the directories I am used to. I then run startup.pyz:
cd C:\Users\johnd\Downloads\Astronomy\MountWizzard4\startupPackage
python startup.pyz --scale 3.2

I am not sure what to do with the directories I unzipped from mountwizzard4-3.2.6b11, the files and directories are all different. Standared_Release_directories

JDAstroPhoto commented 9 months ago

I ended up running "pip install mountwizzard4==3.2.6b11". I then ran "python startup.pyz --scale 3.2". I am not sure if I am running the b11 version, since the menu still says 3.2.6. Is there a way to tell what version I am running? I think I am still running the old one, because it is behaving exactly like last night: menus dying and not starting up with WIFI.
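
On the version question above, one generic way to see which release a given Python environment has installed is to ask the package metadata. This is a minimal sketch with a hypothetical helper name, `installed_version`. It reports only what the interpreter running it can see, so if MW4 is started from a separate environment managed by the startup script (an assumption here), the check has to be run with that environment's interpreter.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="mountwizzard4"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The package is not installed in the environment running this code.
        return None


# Usage:
#   print(installed_version())
```

If this reports 3.2.6b11 while the MW4 window still shows 3.2.6, the pip command and the application are most likely using two different Python environments.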

mworion commented 9 months ago

Hi, I'll try to explain what happened and how to sort it out.

Hope this helps. And many thanks for testing. Michel

JDAstroPhoto commented 9 months ago

Thank-you so much for that write-up, it increased my understanding of your software and installation.

  1. I downloaded the latest startup.pyz script and 3 files.
  2. I cleaned everything up using the --clean command; it automatically installed 3.2.6.
  3. I then used the MW4 menus to upgrade to mountwizzard4-3.2.6b11.
  4. I verified the menus also displayed the version number mountwizzard4-3.2.6b11.

Summary:

No change, b11 acts exactly like 3.2.6. Menus die at startup intermittently when the SimplyNUC computer is using WIFI to connect to my home network.

Results are exactly the same as 3.2.6:

  1. Standard configuration is: "I ALWAYS LAN from 1 Gbps SimplyNUC to 10 micron mount."
  2. 2 choices exist to home network on SimplyNUC, WIFI or 2.5 Gbps LAN.
  3. When SimplyNUC on WIFI to home network, intermittent failures and menu crashes. Once in a while, the initial menu would come up, but when I would load my configuration file, the menus would crash.
  4. When SimplyNUC on 2.5 Gbps LAN to home network, stable, no issues.

mworion commented 9 months ago

Ok, we cleaned the installation up, and at least we gained some knowledge about Python applications and how to deal with them.

Unfortunately I did not find a good solution, because I lack an understanding of what really happens in the lower layers where these crashes occur. So I will stop changing things without knowing what the changes do (that would be unprofessional). The next bigger release will move to the next major release of Python and the main core libraries. It may be that some of the issues go away without any further special measures.

Michel

JDAstroPhoto commented 9 months ago

Ok, thanks for all your help, yes, I am very happy I understand now how to make it 100% reliable. I bought a 75 ft CAT 8 cable, so I will LAN to my home network 10 Gbps switch. That works every time.
When I am in the field with my Van, I will LAN (with a much shorter cable, since my chair is close by) to my remote router.

Do you recommend I go back to 3.2.6 stable release, or continue with 3.2.6b11?

mworion commented 9 months ago

There is no known issue with the beta, but it has only had a short testing time.

JDAstroPhoto commented 9 months ago

FYI, Before I wrote this trouble report, I tried many different Python versions in the last couple of years to try and solve this intermittent startup/menu failure. See attached for all the python versions I have tried, none of them solved the intermittent failures. Python

JDAstroPhoto commented 9 months ago

I am starting to believe it has nothing to do with WIFI or LAN. I was able to reproduce the failures with LAN. They are much less frequent, but once they start, it is much harder to get a good startup. I cleaned out Python, reinstalled Python 3.10.9, and reinstalled MW4. When I run MW4, I get two instances of Python running in Task Manager and 1 instance of MW4. If I end the 15MB Python task, both Python tasks disappear from the Task Manager; however, MW4 is still up and running.
If I end the smaller one, MW4 and Python disappear, including the MW4 menus. On startup and after the menus are up, should there be 2 instances of Python and MW4 in the Task Manager?