machinekit / QtQuickVcp

A Virtual Control Panel for Machinekit written in Qt/C++/QML

Machinekit Client continuously disconnects and reconnects #151

Closed einstine909 closed 7 years ago

einstine909 commented 7 years ago

After launching a machinekit instance and choosing an interface (Cetus or Machineface) with Machinekit Client, the client is always disconnecting and reconnecting every few seconds.

I am running MachinekitClient_Development-201703061505-master-0aefbc6-x64.AppImage as the client and 0.1.1488973272-1mk.travis.master.git187524c9~1wheezy is the machinekit package I currently have installed.

Neither /var/log/linuxcnc.log nor the syslog showed any messages while the disconnects and reconnects were happening. Running mkwrapper with the -d flag, however, did report what I believe to be interesting messages (truncated to a section covering the disconnects and reconnects):

...
process command called, id: ['\x00\xc9\x16M\xd2']
sending config message
process status called task True status service subscribed: True
process error called error True error service subscribed: True
sending task message
process error called text True error service subscribed: True
process command called, id: ['\x00\xc9\x16M\xd2']
process command called, id: ['\x00\xc9\x16M\xd2']
process status called interp True status service subscribed: True
sending interp message
process status called io True status service subscribed: True
process command called, id: ['\x00\xc9\x16M\xd2']
sending io message
process status called motion True status service subscribed: True
process error called display True error service subscribed: True
sending motion message
process status called config True status service subscribed: True
process error called error True error service subscribed: True
sending config message
process status called task True status service subscribed: True
process error called text True error service subscribed: True
sending task message
process command called, id: ['\x00\xc9\x16M\xd2']
process command called, id: ['\x00\xc9\x16M\xd2']
process command called, id: ['\x00\xc9\x16M\xd2']
process status called interp True status service subscribed: True
sending interp message
...

This is running over a wired network and I have no networking issues between the client machine and the BBB.

This issue seemed to pop up when I updated Machinekit Client and the machinekit package a few days ago. Unfortunately I did not record what versions of the client and BBB machinekit package I had in the working configuration...

If there is any other useful information let me know. I will keep familiarizing myself with how mkwrapper and Machinekit Client interact in the meantime.

machinekoder commented 7 years ago

I have noticed something similar today with a recent version of Machinekit. Can you maybe create a screen recording of the disconnects? It would be interesting to see which parts of the UI disconnect.

Also, do you see any differences between Unicast and Multicast mode?

machinekoder commented 7 years ago

Can you maybe try an older version of MachinekitClient (from 2-3 weeks ago) and let me know if you see any differences? https://dl.bintray.com/machinekoder/MachinekitClient-Development/

einstine909 commented 7 years ago

I forgot to mention that I tried both unicast and multicast; both show the same symptoms.

I just attempted with MachinekitClient_Development-201702081841-master-3577f18-x64.AppImage and got the same results. It seems that all screens disconnect.

I will make a screen recording right now.

einstine909 commented 7 years ago

Here is the screen capture. I was planning on showing machineface as well, but hit shutdown instead of disconnect...

https://youtu.be/d_AP9VkylW8

machinekoder commented 7 years ago

I can only reproduce this problem in one configuration that uses haltalk and mkwrapper. Do you also use both?

einstine909 commented 7 years ago

I also use both.

machinekoder commented 7 years ago

Which network interface do you use? Can you please post the output of ip a from your BBB?

machinekoder commented 7 years ago

Maybe also run top to see if any process is using up CPU time.

einstine909 commented 7 years ago

Here is the top of top (ha):

 3213 machine+  20   0   22200  20484  14832 S 11.3  4.0  30:44.68 milltask
 3363 machine+  20   0    4704   1324    860 R  4.9  0.3   0:20.05 top
 2461 machine+  20   0    8764   7416   6672 S  4.0  1.5  11:29.42 halui
 3214 machine+  20   0   81228  27476   5172 S  1.8  5.4  25:28.59 mkwrapper
  742 avahi     20   0    3412   1548   1112 S  1.2  0.3   0:47.06 avahi-daemon
 1653 machine+  20   0   54028  10716   8188 S  1.2  2.1   2:50.21 lxpanel
 3220 machine+  20   0   71696  26440   4348 S  1.2  5.2   3:53.17 mkwrapper
 3355 machine+  20   0   10912   1536    740 S  0.9  0.3   0:03.12 sshd

And here is ip a:

machinekit@beaglebone:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,DYNAMIC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether d0:5f:b8:e3:f9:e3 brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.60/24 brd 192.168.2.255 scope global eth0
    inet6 fe80::d25f:b8ff:fee3:f9e3/64 scope link 
       valid_lft forever preferred_lft forever
3: usb0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
    link/ether d0:5f:b8:e3:f9:e5 brd ff:ff:ff:ff:ff:ff
    inet 192.168.7.2/30 brd 192.168.7.3 scope global usb0
4: can0: <NOARP,ECHO> mtu 16 qdisc noop state DOWN group default qlen 10
    link/can 
5: can1: <NOARP,ECHO> mtu 16 qdisc noop state DOWN group default qlen 10
    link/can

einstine909 commented 7 years ago

I just tried your mkwrapper-sim config and it does not show this problem. I am going to build up my config using that as an example. I will add hal components until something causes this to happen again. Will report back with what I find.

einstine909 commented 7 years ago

I found the issue. Specifying a base_period_nsec other than zero for motmod causes this. I am not sure if this is a red herring, or the actual root cause...

machinekoder commented 7 years ago

Good that you found the problem, bad that it exists. Can you please create an issue on the Machinekit issue tracker describing how to reproduce the problem?

luminize commented 7 years ago

I also see disconnects and reconnects of the remote component I use with pymachinetalk. Until I introduced a queue in my Python code, I noticed that the remote component disconnected when executing long code blocks. This improved a lot by breaking the code into smaller pieces and having CANopen or remote component callbacks add functions to the queue instead of executing the code immediately. So my feeling is that there could be something that blocks.
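The queue approach described above could look roughly like the following minimal Python sketch. It uses only the standard library and is one possible way to do it, not necessarily the code used here; `TaskQueue`, `on_pin_changed`, and `results` are hypothetical names for illustration and are not part of pymachinetalk.

```python
import queue
import threading


class TaskQueue:
    """Runs small callables one at a time on a single worker thread."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, func, *args):
        # Called from a callback: only enqueue the work, never run it here,
        # so the callback returns immediately.
        self._tasks.put((func, args))

    def _run(self):
        while True:
            func, args = self._tasks.get()
            try:
                func(*args)
            except Exception as exc:
                # A failing task should not kill the worker thread.
                print("task failed:", exc)
            finally:
                self._tasks.task_done()

    def wait_idle(self):
        # Block until every queued task has been processed.
        self._tasks.join()


results = []
tasks = TaskQueue()


def on_pin_changed(value):
    # Hypothetical callback (an assumption, not a pymachinetalk API):
    # it hands the work to the queue instead of executing it directly.
    tasks.submit(results.append, value * 2)


for v in (1, 2, 3):
    on_pin_changed(v)
tasks.wait_idle()
```

Because a single worker drains the queue, tasks run in the order in which the callbacks submitted them, and each callback returns as soon as its task is enqueued.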

propcoder commented 7 years ago

I can confirm the same or a very similar problem with MK client on Windows. I ran MK client on Windows 10, 64-bit, and tried the 64-bit version first. It finds the client, shows app instances, starts them and then says 'exited'. When I repeat this, the instance is already running. Then I select an application, but everything stops at waiting for services halrcomp and halrcmd. Then I tried the 32-bit version. It behaves similarly: it shows 'exited' at first, and then it connects normally. But about 10 to 20 seconds later it displays Connecting and waits for halrcomp and halrcmd forever.

I tried this on my Win10 64bit VM - and got the same results. Any suggestions?

sirop commented 7 years ago

@propcoder Is UDP port 5353 open on Windows? What is the output of the Service Window (local UIs)? See https://github.com/qtquickvcp/QtQuickVcp/issues/180#issuecomment-318272122

propcoder commented 7 years ago

Windows Firewall is off, no other firewalls installed.

There are no Services after client disconnects. And after going Back it says Exited, but later I see that the instance is running and I can connect to it again.

Available services also disappear after some time (about 30 s) if I keep the service window open.

Checked again with both amd64 and x86 - result is the same.

sirop commented 7 years ago

@propcoder I just checked it on my Win 8.1: the firewall is said to be off, but at least the inbound connections do obey the firewall rules nevertheless.

propcoder commented 7 years ago

tmp

sirop commented 7 years ago

@propcoder Do not know then. Is your MK client the latest version?

propcoder commented 7 years ago

Yes, it is the latest. I even tried it with the latest libzmq.dll, which is the only file in MachinekitClient_Development-201708040658-master-1122f84-x64.zip

sirop commented 7 years ago

@propcoder I have seen lately some differences between Remote and Local UI performance.

For Local UIs I used my own build of QtQuickVCP on Win 8.1 x64. That worked without problems.

propcoder commented 7 years ago

I could try this. Is it possible / how to add local UI files without recompiling?



sirop commented 7 years ago

Is it possible / how to add local UI files without recompiling?

https://github.com/qtquickvcp/QtQuickVcp/tree/master/apps shows that you get your own MK client if you build QtQuickVcp. Then, if you build and run your own QML app and link it to the QtQuickVcp you built before, your QML app starts locally within the MK client.

machinekoder commented 7 years ago

Let me know if https://github.com/qtquickvcp/QtQuickVcp/pull/186 and https://github.com/qtquickvcp/QtQuickVcp/pull/187 fix the problems you describe.

machinekoder commented 7 years ago

@sirop Compiling QtQuickVcp from source does not change anything except that you are running with libs compiled on your computer. However, the amount of work needed to get it compiling on Windows drastically outweighs the benefits in my opinion - unless you are planning to contribute to the project of course.

sirop commented 7 years ago

@machinekoder Too late, I have set up the build env already.

How do you start a local UI with MK client? With local I mean the QML code being on the same machine as the MK client.

machinekoder commented 7 years ago

@sirop Download the Machineface or Cetus code to your computer and open the .pro file in QtCreator. Don't forget to set up the install steps for the QtQuickVcp project.

luminize commented 7 years ago

On 9 Aug 2017, at 10:43, Boris Skegin wrote: How do you start a local UI with MK client? With local I mean the QML code being on the same machine as the MK client.

@sirop You can also build the UI you make in QtCreator. This compiles into a program that you can start locally. You must also specify the host name for the service discovery it connects to. So in main.qml I had to do this:

ConnectionWindow {
    id: connectionWindow
    anchors.fill: parent
    defaultTitle: "tubeturner"
    autoSelectInstance: true
    autoSelectApplication: true
    mode: "local"
    applications: [
        ApplicationDescription {
            sourceDir: "qrc:/tubeturner.tubeturner-UI/"
        }
    ]
    // Host name used to filter the service discovery results
    instanceFilter: ServiceDiscoveryFilter { name: "1609101.local" }
}

sirop commented 7 years ago

@machinekoder @luminize Nice hint about ServiceDiscoveryFilter .

But anyway I have to build QtQuickVCP locally for that, which I actually do.

matatasb commented 7 years ago

I was having some trouble with "waiting for services" using a virtual machine on the same PC with Windows 10. I tried a few solutions, but none of them worked until I tried the new MachinekitClient today. The last version I had tried before that was MachinekitClient_Development-201707270842-master-eb599d1-x86. Probably #186 and #187 fixed the problem. Thank you machinekoder!

machinekoder commented 7 years ago

@matatasb Great to hear. Then I will close this issue.