ogata0916 / mozc

Automatically exported from code.google.com/p/mozc
0 stars 0 forks source link

lack of mozc_server termination treatment #40

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
version: mozc-0.13.464.102
target file: server/mozc_server.cc
category: lack of feature
OS: FreeBSD 9-CURRENT / 8.1-RELEASE
CC: gcc 4.2.1 (FreeBSD default compiler)

server/mozc_server.cc needs termination treatment for FreeBSD and other
OSes. The 'abstract namespace' is Linux specific feature. So for any 
other OSes like FreeBSD, termination treatment is needed to delete 
unix socket file and remove user lock files.

I guess to add signals (SIGINT, SIGHUP and SIGTERM) and call 
g_session_server->Terminate() is good way.

Original issue reported on code.google.com by daichig...@gmail.com on 20 Sep 2010 at 3:52

GoogleCodeExporter commented 9 years ago
We've fixed this issue. Basically, Mozc client library automatically detects 
unused IPC path and remove it before launching new server. Unfortunately, we've 
not verified it on FreeBSD environment. I would appreciate it if you test the 
new package on FreeBSD. Thanks.

Original comment by t...@google.com on 22 Sep 2010 at 12:20

GoogleCodeExporter commented 9 years ago
It's not fixed yet on FreeBSD (and maybe not fixed yet for other OSes). Follow 
files are kept remaining after mozc_server killed and mozc client library does 
not remove it before launching new server.

  ~/.mozc/.server.lock
  ~/.mozc/.session.ipc
  /tmp/.mozc.6faadb9dde298a1b6eb50e982d8c8be2.session   (name will change every time)

Radically, server/mozc_server.cc needs the code to call 
g_session_server->Terminate() when closing for termination treatment for POSIX 
compatible OS. For example, add a function like follow to server/mozc_server.cc
 file.

static void sig_func(int num)
{
  switch (num) {
  case SIGINT:
  case SIGHUP:
  case SIGTERM:
    if (g_session_server)
      g_session_server->Terminate();
    break;
  default:
    break;
  }
}

And insert code like follow to main function of server/mozc_server.cc.

    ::signal(SIGINT, sig_func);
    ::signal(SIGHUP, sig_func);
    ::signal(SIGTERM, sig_func);

It works well and it looks very reasonable treatment for POSIX base OS. The 
'abstract namespace' is Linux specific feature. For POSIX base OS, signal 
treatment for closing socket is needed.

Original comment by daichig...@gmail.com on 25 Sep 2010 at 3:36

GoogleCodeExporter commented 9 years ago
It is expected behaviour that these files (~/.mosc/.server.lock, 
~/.mozc/.session.ipc, /tmp/.mozc...) remain.  Could you check that mozc_server 
is launched even when these files remain?

We basically don't like to apply your proposal because of the following reasons.
1. Mozc converter is not POSIX signal safe and calling 
g_session_server->Terminate() asynchronously may cause a critical problem. We 
cannot check that all functions in Mozc are using async-signal-safe functions. 
Strictly speaking, we cannot call malloc,printf,fopen in signal handler, since 
they are not async-signal-safe.
2. On Linux/Mac, SessionServer::Terminate() executes pthread_cancel, which 
kills running thread. It is not guaranteed all resources SessionServer is 
managing are gracefully closed. On Windows, SessionServer::Terminate() raises a 
control event so SessionServer can finish the select loop gracefully. Calling 
SessionServer::Terminate() is not well tested except for Windows.
3. We think that the cause of this problem is a bug of error detection/handling 
of SessionHandler.

The current code tries to unlink /tmp/.mozc.*.session, when ::connect() returns 
non 0 (error).  As long as the current IPC file is removed, client library 
attempts to restart new server. Please check that errorno is either ENOTSOCK or 
ECONNREFUSED.

ipc/unix_ipc.cc: line 326
    if (::connect(socket_,
                  reinterpret_cast<const sockaddr*>(&address),
                  sun_len) != 0 ||
        !IsPeerValid(socket_, &pid)) {
      if ((errno == ENOTSOCK || errno == ECONNREFUSED) &&
          !IsAbstractSocket(server_address)) {
        // If abstract namepace is not enabled, recreate server_addresss path.
        ::unlink(server_address.c_str());
      }

Original comment by t...@google.com on 27 Sep 2010 at 6:50

GoogleCodeExporter commented 9 years ago
mozc_server cannot get started with those files (~/.mosc/.server.lock, 
~/.mozc/.session.ipc, /tmp/.mozc...). 
And errorno looks like ECONNREFUSED.

Attached log files show you what happened. 
situation:
  run mozc_server
  kill -9 mozc_server
  remain files:
    ~/.mosc/.server.lock
    ~/.mozc/.session.ipc
    /tmp/.mozc...

work:
  rm ~/.mozc/*log
  ibus-daemon -r --daemonize --xim
  Ctrl-space
    ibus-mozc says that failed of starting up mozc_server
  Click OK
  mv ~/.mozc/*log ~
  attached those files here :)

Is there anything I can do for your investigation??

Original comment by daichig...@gmail.com on 27 Sep 2010 at 9:19

Attachments:

GoogleCodeExporter commented 9 years ago
Thank you for the debug log.

It seems that the original cause of this issue is not in IPC (abstract 
namespace) but in the different behavior of fcntl().
Mozc  server first creates ~/.mozc/.session.ipc and stores IPC path in it. Mozc 
also locks the file via fcntl, meaning that this file acts as a kind of lock 
file to prevent multiple server instantiations. 

On Linux, (and maybe on Mac as well), file lock is gracefully cleaned if the 
process is killed. However, it seems that file lock still remains even on 
FreeBSD. This is not an expected behavior for Mozc.  I also think this 
situation won't be ideal, as fcntl() returns wrong result. Anyway, please take 
a look at the following part. What the errno looks like when the error happens?

base/process_mutex.cc

   const int result = ::fcntl(*fd, F_SETLK, &command);
    if (-1 == result) {  // failed
      ::close(*fd);
      LOG(WARNING) << "already locked";
      return false;   // another server is already running
    }

Original comment by t...@google.com on 1 Oct 2010 at 4:29

GoogleCodeExporter commented 9 years ago
fcntl() returns 35 [EAGAIN, Resource temporarily unavailable] when the error 
happens. Check code with as follow:

    const int result = ::fcntl(*fd, F_SETLK, &command);
    if (-1 == result) {  // failed
      ::close(*fd);
      LOG(WARNING) << "already locked";
      LOG(WARNING) << "already locked: " << errno;
      LOG(WARNING) << "already locked: " << ::strerror(errno);
      return false;   // another server is already running
    }

Log file is attached.

Original comment by daichig...@gmail.com on 3 Oct 2010 at 5:04

Attachments:

GoogleCodeExporter commented 9 years ago
Hmm, EAGAIN means that the file is locked by another process. I'm not sure why 
FreeBSD returns EAGAIN in this situation.

please check the status of file locking with F_GETLK flag. What the l_pid looks 
like?    If the pid is the same as the process id of previous mozc_server, the 
behavior of fcnt should be strange.

fcntl(*fd, F_GETLK, &command);
LOG(WARNING) << command.l_pid;

According to 
http://www.jp.freebsd.org/cgi/mroff.cgi?sect=2&cmd=&lc=1&subdir=man&dir=jpman-5.
2.0/man&subdir=man&man=fcntl
l_pid may be set -1.

Original comment by t...@google.com on 3 Oct 2010 at 10:11

GoogleCodeExporter commented 9 years ago
Yeah.... it looks very strange.

% cat mozc_server.log 
Log file created at: 2010-10-04 09:27:31 45266 34405905152
Program name: mozc_server
2010-10-04 09:27:31 45266 34405905152 LOG(WARNING) base/process_mutex.cc(181) 
-6464
2010-10-04 09:27:31 45266 34405905152 LOG(WARNING) base/process_mutex.cc(184) 
already locked
2010-10-04 09:27:31 45266 34405905152 LOG(INFO) server/mozc_server.cc(106) Mozc 
Server is already running
% 

It's always -6464.

Original comment by daichig...@gmail.com on 4 Oct 2010 at 3:28

GoogleCodeExporter commented 9 years ago
I am doubting that fcntl(2) removes lock file after mozc_server process killed. 
According to FreeBSD fcntl(2) manual:

     All locks associated with a file for a given process are removed when the
     process terminates.

All locks will be removed when process killed. But there is no explanation 
around lock file removing itself.  If fcntl(2) does not remove lock file 
itself, mozc_server will need another way around lock treatment for another OS 
like FreeBSD.

Original comment by daichig...@gmail.com on 5 Oct 2010 at 12:47

GoogleCodeExporter commented 9 years ago
I got it!  FreeBSD developer, Garrett Cooper helped me to find the difference 
of fcntl(2) behavior.  FreeBSD fcntl(2) will not return correct l_pid called 
with F_SETLK. When called with F_GETLK, l_pid has correct process id.  And 
F_GETLK way works for Linux as well.

Please read attached file, I guess that way is great for solution. (attached 
file is created by Garrett Cooper. Thanks Garrett!)

Original comment by daichig...@gmail.com on 5 Oct 2010 at 6:45

Attachments:

GoogleCodeExporter commented 9 years ago
Thanks for the investigation, but it is not clear for me how to fix the 
original problem.

Original comment by t...@google.com on 13 Oct 2010 at 10:46

GoogleCodeExporter commented 9 years ago
I am going to find another good solution for mozc. Wait some weeks :)

Original comment by daichig...@gmail.com on 13 Oct 2010 at 2:55

GoogleCodeExporter commented 9 years ago

Original comment by yukawa@google.com on 1 Apr 2012 at 2:16