kpeeters / cadabra2

A field-theory motivated approach to computer algebra.
https://cadabra.science/
GNU General Public License v3.0

UI frontend client fails to connect to the server on FreeBSD #12

Closed: yurivict closed this issue 6 years ago

yurivict commented 8 years ago

The client keeps restarting the server, and each time it prints this:

cadabra-client: spawning server
PREPARSED:
 import sys
server=0
def setup_catch(cO, cE, sE):
   global server
   sys.stdout=cO
   sys.stderr=cE
   server=sE

PREPARSED:
 import imp; f=open(imp.find_module('cadabra2_defaults')[1]); code=compile(f.read(), 'cadabra2_defaults.py', 'exec'); exec(code); f.close()

cadabra-client: connect done
cadabra-client: connection failed

System calls log for each cycle:

67782: _umtx_op(0x80076d0e0,UMTX_OP_WAKE_PRIVATE,0x7fffffff,0x0,0x0) = 0 (0x0)
cadabra-client: connect done
67787: mmap(0x0,4194304,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34577842176 (0x80d000000)
67782: write(2,"cadabra-client: connect done",28) = 28 (0x1c)

67787: mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34368212992 (0x800815000)
67782: write(2,"\n",1)                           = 1 (0x1)
67782: kevent(11,0x0,0,{ 12,EVFILT_READ,EV_CLEAR,0x0,0x1,0x812c260bc },128,{ 0.000000000 }) = 1 (0x1)
67782: close(35)                                 = 0 (0x0)
67782: kevent(11,{ 12,EVFILT_READ,EV_ADD|EV_CLEAR,0x0,0x0,0x812c260bc },1,0x0,0,0x0) = 0 (0x0)
67782: socket(PF_INET,SOCK_STREAM,6)             = 35 (0x23)
67782: setsockopt(0x23,0xffff,0x800,0x7fffdfffc974,0x4) = 0 (0x0)
67782: ioctl(35,0x8004667e { IOW 0x66('f'), 126, 4 },0xdfffc964) = 0 (0x0)
67782: connect(35,{ AF_INET 127.0.0.1:20914 },16) ERR#36 'Operation now in progress'
67782: kevent(11,{ 35,EVFILT_WRITE,EV_ADD|EV_CLEAR,0x0,0x0,0x812c40200 },1,0x0,0,0x0) = 0 (0x0)
67782: kevent(11,0x0,0,{ 12,EVFILT_READ,EV_CLEAR,0x0,0x1,0x812c260bc 35,EVFILT_WRITE,EV_CLEAR|EV_EOF,NOTE_LOWAT|0x3c,0x8000,0x812c40200 },128,{ 0.000000000 }) = 2 (0x2)
67782: poll({ 35/POLLOUT },1,0)                  = 1 (0x1)
67782: getsockopt(0x23,0xffff,0x1007,0x7fffdfffccec,0x7fffdfffccfc) = 0 (0x0)
67782: getpeername(35,0x7fffdfffcdb0,0x7fffdfffcde0) ERR#57 'Socket is not connected'
67782: kevent(11,{ 12,EVFILT_READ,EV_ADD|EV_CLEAR,0x0,0x0,0x812c260bc },1,0x0,0,0x0) = 0 (0x0)
67782: shutdown(35,SHUT_RDWR)                    ERR#54 'Connection reset by peer'
cadabra-client: connection failed
67782: write(2,"cadabra-client: connection faile"...,33) = 33 (0x21)

yurivict commented 8 years ago

It seems to me that the client fails to properly wait for the connection event to complete and just terminates the connection.

kpeeters commented 8 years ago

I will have to investigate this in a VM. I am using websocketpp for communication between client and server, so any bug on FreeBSD is likely to be present upstream already.

yurivict commented 8 years ago

Yes, please investigate.

Here is my port submission: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=210476
Just extract it into the ports tree (cd /usr/ports && sh shar.sh), set the GUI option (cd math/cadabra2 && make config), remove the BROKEN statement there, and install (make install clean).

This might be a FreeBSD-only bug, because it uses kqueue/kevent only on FreeBSD.

One thing to try is to update websocketpp to the latest version - this 'might' help.

yurivict commented 8 years ago

I tried -DBOOST_ASIO_DISABLE_KQUEUE, but the problem is still there with selects instead of kevents.

I also tried replacing websocketpp-0.6.0 with websocketpp-0.7.0, and this also didn't solve the problem.

The EINPROGRESS condition from connect(2) isn't handled properly in websocketpp.

yurivict commented 8 years ago

@kpeeters Are you able to provide a small, standalone code example of the cadabra2 client/server communication that is easier to look at?

kpeeters commented 8 years ago

I have nothing simpler than the code in the client_server folder.

If you have pinned this down to EINPROGRESS not being handled properly by websocketpp, I would think that by far the best (and least time-consuming) resolution is to raise that directly with the author of websocketpp.

zaphoyd commented 8 years ago

Is there any chance it would be possible to get a debug level output from websocketpp's built in logging system when this happens?

yurivict commented 8 years ago

Yes. Are there instructions for how to do this?

zaphoyd commented 8 years ago

General logging reference is here for context: https://docs.websocketpp.org/reference_8logging.html

The debug logging level is a performance hit so it isn't compiled into the default configs. The likely changes would be:

1. https://github.com/kpeeters/cadabra2/blob/master/client_server/Server.hh#L7
   Include the file with the debug config definition, <websocketpp/config/debug_asio_no_tls.hpp>.

2. https://github.com/kpeeters/cadabra2/blob/master/client_server/Server.hh#L64
   Change websocketpp::config::asio to websocketpp::config::debug_asio to switch to the config that has the debug log level.

3. https://github.com/kpeeters/cadabra2/blob/master/client_server/Server.cc#L547-L548
   Change these calls to set_access_channels and set_error_channels rather than the clear variants.

By default output will go to standard out. If that is a problem for this app it can also be redirected to a file.

yurivict commented 8 years ago

After I changed the lines you suggested, it no longer cycles. Now it complains about the URI:

cadabra-client: spawning server
cadabra-client: websocket connection error invalid uri
PREPARSED:
 import sys
server=0
def setup_catch(cO, cE, sE):
   global server
   sys.stdout=cO
   sys.stderr=cE
   server=sE

The original behavior was that it cycled:

$ cadabra-gtk 
cadabra-client: spawning server
PREPARSED:
 import sys
server=0
def setup_catch(cO, cE, sE):
   global server
   sys.stdout=cO
   sys.stderr=cE
   server=sE

PREPARSED:
 import imp; f=open(imp.find_module('cadabra2_defaults')[1]); code=compile(f.read(), 'cadabra2_defaults.py', 'exec'); exec(code); f.close()

(cadabra-gtk:12053): GVFS-RemoteVolumeMonitor-WARNING **: invoking List() failed for type GProxyVolumeMonitorHal: GDBus.Error:org.freedesktop.DBus.Error.UnknownMethod: Method List is not implemented on interface org.gtk.Private.RemoteVolumeMonitor (g-dbus-error-quark, 19)
cadabra-client: connect done
cadabra-client: connection failed
cadabra-client: spawning server

The patches I applied are:

--- client_server/Server.hh.orig        2016-06-20 20:19:43 UTC
+++ client_server/Server.hh
@@ -5,6 +5,8 @@
 #include <boost/uuid/uuid.hpp>
 #include <websocketpp/server.hpp>
 #include <websocketpp/config/asio_no_tls.hpp>
+#include <websocketpp/config/debug_asio.hpp>
+#include <websocketpp/logger/syslog.hpp>
 #include <websocketpp/common/functional.hpp>
 #include <future>
 #include <boost/python.hpp>
@@ -61,7 +63,7 @@ class Server {
                void init();

                // WebSocket++ dependent parts below.
-               typedef websocketpp::server<websocketpp::config::asio> WebsocketServer;
+               typedef websocketpp::server<websocketpp::config::debug_asio> WebsocketServer;
                void on_socket_init(websocketpp::connection_hdl hdl, boost::asio::ip::tcp::socket & s);
                void on_message(websocketpp::connection_hdl hdl, WebsocketServer::message_ptr msg);
                void on_open(websocketpp::connection_hdl hdl);
--- client_server/Server.cc.orig        2016-06-20 20:19:43 UTC
+++ client_server/Server.cc
@@ -534,8 +534,8 @@ void Server::on_kernel_fault(Block blk)
 void Server::run() 
        {
        try {
-               wserver.clear_access_channels(websocketpp::log::alevel::all);
-               wserver.clear_error_channels(websocketpp::log::elevel::all);
+               wserver.set_access_channels(websocketpp::log::alevel::all);
+               wserver.set_error_channels(websocketpp::log::elevel::all);

                wserver.set_socket_init_handler(bind(&Server::on_socket_init, this, ::_1,::_2));
                wserver.set_message_handler(bind(&Server::on_message, this, ::_1, ::_2));

zaphoyd commented 8 years ago

It looks like GitHub ate part of the directions:

#include <websocketpp/config/debug_asio_no_tls.hpp>

is the only additional header needed.

That said, that change shouldn't have any effect on URI validation. What is the URI in question?

Also, is there a document anywhere that details the steps to build and reproduce this on a fresh FreeBSD install? (I only see instructions for various Linux distributions.)

yurivict commented 8 years ago

It still doesn't cycle with #include <websocketpp/config/debug_asio_no_tls.hpp>

To reproduce on FreeBSD 10.3:

1. cd /usr/ports/math/cadabra2
2. make config
--> choose GUI option in the menu
3. make install
4. cadabra-gtk

Perform steps 1-3 as root and step 4 as a regular user. Step 3 takes a long time because the dependencies are built from source. You can speed it up by first running pkg install cadabra2 && pkg delete cadabra2, which installs the dependencies from binary packages.

yurivict commented 8 years ago

The invalid URI is ws://127.0.0.1:0

yurivict commented 8 years ago

I can confirm that this quick patch to boost solves the problem. EINPROGRESS mishandling is indeed the cause.

--- /usr/local/include/boost/asio/detail/impl/socket_ops.ipp       2016-10-07 21:39:30.532613000 -0700
+++ /usr/local/include/boost/asio/detail/impl/socket_ops.ipp    2016-10-07 21:27:56.976331000 -0700
@@ -469,7 +469,16 @@
 inline int call_connect(SockLenType msghdr::*,
     socket_type s, const socket_addr_type* addr, std::size_t addrlen)
 {
-  return ::connect(s, addr, (SockLenType)addrlen);
+  int res = ::connect(s, addr, (SockLenType)addrlen);
+  if (res == -1 && errno==EINPROGRESS) {
+    fd_set write_fd;
+    FD_ZERO(&write_fd);
+    FD_SET(s, &write_fd);
+    ::select (s+1, NULL, &write_fd,NULL,NULL);
+    res = 0;
+  }
+  return res;
 }

 int connect(socket_type s, const socket_addr_type* addr,

I am not too familiar with boost. Any idea how EINPROGRESS is supposed to be handled? It's hard to believe that boost has such a bug because it is quite widely used.
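
For reference, the usual POSIX pattern for the question above: when a non-blocking connect(2) returns EINPROGRESS, wait until the socket becomes writable and then read SO_ERROR with getsockopt(2). Writability alone does not mean the connection succeeded. Below is a minimal sketch of that pattern in Python rather than boost's C++. It is not the code used by boost, websocketpp or the port, and the helper name connect_nonblocking and its timeout parameter are made up for illustration.

```python
import errno
import os
import select
import socket


def connect_nonblocking(host, port, timeout=5.0):
    """Connect a non-blocking TCP socket; return it, or raise OSError.

    Hypothetical helper for illustration only (POSIX semantics assumed).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    try:
        # connect_ex returns an errno value instead of raising.
        err = sock.connect_ex((host, port))
        if err in (errno.EINPROGRESS, errno.EWOULDBLOCK):
            # The handshake is still running: wait until the socket is writable.
            _, writable, _ = select.select([], [sock], [], timeout)
            if not writable:
                raise OSError(errno.ETIMEDOUT, "connect timed out")
            # Writable does not mean connected; SO_ERROR holds the real outcome.
            err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err != 0:
            raise OSError(err, os.strerror(err))
        return sock
    except OSError:
        sock.close()
        raise
```

Compared with the quick patch above, the difference is the SO_ERROR check: the patch sets res = 0 as soon as select returns, so a refused connection would also be reported as a success.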

kpeeters commented 8 years ago

I now have a FreeBSD 11 virtual machine available for testing and experimenting. As I'm a total FreeBSD novice, would you mind jotting down the recommended way of installing all prerequisites for compiling cadabra from GitHub on FreeBSD? I will then have another look at getting this problem sorted. Many thanks!

yurivict commented 8 years ago

Just do this:

cd /usr/ports/math/cadabra2
make install
# wait while it builds/installs everything

Make sure the version in the Makefile says 2.0.930; otherwise you need to update your ports tree. In case you need it, ports are checked out with svn co http://svn.freebsd.org/ports

Please note that all the above problems were patched and this port works fine now, as far as I can tell anyway.

The port actually goes a long way to patch these. It overrides socket_ops.ipp from boost to handle EINPROGRESS properly, and overrides endpoint.hpp from websocketpp to change v6 to v4.

yurivict commented 8 years ago

The original problem in this report is caused by a bug in websocketpp: https://github.com/zaphoyd/websocketpp/issues/587

websocketpp wrongly assumes that v6-to-v4 mapping always works, and defines its default API methods under this assumption. IMO, there is no need to assume anything about v6/v4 protocols at the websocket level. Users should listen on and connect to the protocol version they actually want. So this should be fixed in websocketpp, and this bug here should be closed.

kpeeters commented 8 years ago

Yes, but since I am distributing websocketpp along with cadabra, I may as well patch it so it works for us.

yurivict commented 8 years ago

Then you should ensure it only uses v4 locally.
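
One way to read "only uses v4 locally", sketched in Python with plain sockets rather than websocketpp: create the listening socket as AF_INET bound to 127.0.0.1, so nothing depends on an IPv6 socket also accepting IPv4-mapped connections, and have the client connect to the same IPv4 address. The function names are hypothetical, and this is not the patch that went into cadabra or the FreeBSD port.

```python
import socket


def listen_ipv4_loopback(port=0, backlog=5):
    """Listen on 127.0.0.1 only; return (listening_socket, actual_port)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", port))
    srv.listen(backlog)
    # With port=0 the kernel picks a free port, so read back the real one;
    # a client cannot connect to port 0 itself.
    return srv, srv.getsockname()[1]


def connect_ipv4_loopback(port):
    """Connect over IPv4 explicitly, matching the listener above."""
    return socket.create_connection(("127.0.0.1", port), timeout=5.0)
```

Because both ends name the IPv4 loopback address explicitly, the result does not depend on how a given system configures v6-to-v4 mapping.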

kpeeters commented 6 years ago

I know you reported this ages ago, but I have just patched the websocketpp copy inside cadabra (the two 'listen' functions). It's in the feature/kwindows branch for now and will soon go into master. I have tried it on OpenBSD and the fix works there.

yurivict commented 6 years ago

I've patched this in the FreeBSD port, so it works now. I still have to try the latest version to see if the other problem is gone.