Open GoogleCodeExporter opened 9 years ago
Which emacs are you using--the one that's bundled with Mac OS X
(/usr/bin/emacs)?
What's the "ls -l" output for the remote file (that is, "ls -l
/tmp/volname/file")?
What's the "ls -l" output for /local/data on the remote machine?
What are your umask settings on the local (Mac OS X) and the remote (Linux)
machines?
Original comment by si...@gmail.com
on 11 Jun 2007 at 3:35
Emacs: fink emacs 21.2.1 (X Windows)
umask: 002 (on both sides)
After mounting, file listing running from Mac OS X side:
> ls -la /tmp/volume/dir1/dir2/dir3/
total 9568
drwxrwsr-x 1 user 700 4096 Jun 10 23:21 .
drwxrwsr-x 1 user 700 4096 Apr 22 02:50 ..
-rw-rw-r-- 1 user 700 7967 Apr 22 05:01 file1
-rw-rw-r-- 1 user 700 2585 Jan 22 14:25 file2
... and so on ...
File listing executed on Linux (remote) side:
> cd /local/data/dir1/dir2/dir3/
> ls -la
total 4892
drwxrwsr-x 14 user group 4096 Jun 10 23:21 .
drwxrwsr-x 13 user group 4096 Apr 22 02:50 ..
-rw-rw-r-- 1 user group 7967 Apr 22 05:01 file1
-rw-rw-r-- 1 user group 2585 Jan 22 14:25 file2
... and so on ...
Original comment by cbmarkwa...@gmail.com
on 11 Jun 2007 at 5:07
I should say that while emacs is editing the file (say it is called "newfile"),
it correctly creates the symlink lock-file ".#newfile -> user@hostname.local.4527".
When attempting to save the file, it *does* save the file (or the auto-save file,
depending on how long I pause). Right after that, the mount becomes
"disconnected."
The sshfs and sftp processes continue to run despite all this.
Original comment by cbmarkwa...@gmail.com
on 11 Jun 2007 at 5:14
Can you try /usr/bin/emacs and see if that has the same behavior?
Original comment by si...@gmail.com
on 11 Jun 2007 at 5:26
/usr/bin/emacs (21.2.1) functions OK.
It doesn't have X-windows though, which is why I use the Fink version.
Original comment by cbmarkwa...@gmail.com
on 11 Jun 2007 at 5:38
If I try Fink emacs-X11 with sshfs connected to a Mac OS X remote machine, it
works fine. If the remote system is Linux (2.6.18.5 kernel, OpenSSH_4.5p1,
OpenSSL 0.9.8a), then Fink emacs-X11 saves the file but claims that an error
occurred--as you reported. However, in my case, the mount doesn't "disconnect"
or hang otherwise--things on the volume are accessible as before.
Anyway, I will look at this more when I get time--can't make this a priority
issue right now. Meanwhile, if you could narrow this down within emacs (what
exactly emacs is doing that causes this, and what exactly is the error that
emacs thinks has occurred), that will be helpful.
It is quite likely that this issue/behavior has the same cause as issue 114.
Original comment by si...@gmail.com
on 11 Jun 2007 at 6:09
I just did a "ktrace" of the offending emacs process (fink emacs-X11).
It looks like emacs sets an "alarm" signal which has a very short duration,
shorter than the time it takes for the remote sshfs to respond. This
results in an endless cycle of restarted system calls, each one being
interrupted by an alarm.
I can't tell if this is emacs behavior or libc, but I have some evidence that
it is related to how emacs interacts with X-windows. Reference:
http://osdir.com/ml/emacs.bugs/2002-10/msg00070.html
If it's a network timing difference, that might explain why it works
differently for you and me.
From what I can see, each syscall gets interrupted by an alarm before it can
complete, which causes a set of rapid-fire requests to the remote sshfs process.
Original comment by cbmarkwa...@gmail.com
on 11 Jun 2007 at 7:33
Thanks--that's likely to be useful information.
Original comment by si...@gmail.com
on 11 Jun 2007 at 7:58
Try this:
Take the following code, and compile it as a dynamic shared library.
== cut here ==
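// libsetitimer.c
// Interposes setitimer() on Mac OS X so that every timer armed by the host
// program first fires one second later than requested.
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

typedef struct interpose_s {
    void *new_func;
    void *orig_func;
} interpose_t;

int my_setitimer(int which, const struct itimerval *value,
                 struct itimerval *ovalue);

// dyld reads this table from the __DATA,__interpose section and redirects
// calls to orig_func made from other images to new_func. "used" keeps the
// otherwise unreferenced table from being discarded by the compiler.
static const interpose_t interposers[] \
    __attribute__ ((used, section("__DATA, __interpose"))) = {
    { (void *)my_setitimer, (void *)setitimer },
};

int my_setitimer(int which, const struct itimerval *value,
                 struct itimerval *ovalue)
{
    // A zero it_value means "disarm the timer"; pass that through untouched
    // so that cancelling a timer does not arm a one-second timer instead.
    if (value &&
        (value->it_value.tv_sec != 0 || value->it_value.tv_usec != 0)) {
        struct itimerval new_value;
        memcpy(&new_value, value, sizeof(struct itimerval));
        new_value.it_value.tv_sec += 1;  // delay the first expiry by 1 second
        // Calls made from within this library are not interposed, so this
        // reaches the real setitimer().
        return setitimer(which, &new_value, ovalue);
    }
    return setitimer(which, value, ovalue);
}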
== cut here ==
This intercepts and reimplements setitimer(), adding 1 second to the value
specified in the incoming setitimer() call from emacs.
To compile, do something like:
$ gcc -Wall -dynamiclib -o /tmp/libsetitimer.dylib libsetitimer.c
To cause it to be used in a precompiled version of emacs, run your emacs
command something like:
$ DYLD_INSERT_LIBRARIES=/tmp/libsetitimer.dylib /sw/bin/emacs ...
See if it changes the behavior. Experiment with different tweakings of it_value
if necessary.
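One way to check that the library is actually being picked up is a small probe that arms a timer and reads it back. This is only a sketch: `probe_timer_usec` is a made-up helper, and it assumes that the interposer also applies to setitimer() calls made from the main executable of whatever program it is loaded into.

```c
/* timer_probe.c: hypothetical helper for checking the setitimer() shim. */
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/*
 * Arms ITIMER_REAL for `usec` microseconds (0 < usec < 1000000), reads the
 * remaining time back with getitimer(), then disarms the timer.
 * Returns the remaining time in microseconds, or -1 on error.
 */
long probe_timer_usec(long usec)
{
    struct itimerval req, cur, off;

    /* Ignore SIGALRM so a stray expiry cannot terminate the process. */
    signal(SIGALRM, SIG_IGN);

    memset(&req, 0, sizeof(req));
    memset(&off, 0, sizeof(off));
    req.it_value.tv_usec = usec;

    if (setitimer(ITIMER_REAL, &req, NULL) != 0)
        return -1;
    if (getitimer(ITIMER_REAL, &cur) != 0)
        return -1;

    setitimer(ITIMER_REAL, &off, NULL);  /* disarm */
    return (long)cur.it_value.tv_sec * 1000000L + (long)cur.it_value.tv_usec;
}
```

Run normally, a program that prints `probe_timer_usec(500000)` should report a value close to 500000. Run with DYLD_INSERT_LIBRARIES pointing at the library above, it should report roughly one second more if the interposition is taking effect.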
Original comment by si...@gmail.com
on 11 Jun 2007 at 8:32
This workaround does indeed work. Thanks! I've shimmed it into my emacs for
now.
Question: sshfs does have an "-o intr" option which allows operations to be
interrupted. I did not set this option, so why were the emacs open() file
system calls being interrupted by a SIGALRM?
Another question: I see that around Jun-Nov 2006, the original FUSE was
enhanced to handle interruptions of the user process more robustly. Did these
changes make it into MacFUSE? (version 2.6.0; release notes below)
http://sourceforge.net/project/shownotes.php?release_id=457591&group_id=121684
Thanks for your help!
Original comment by cbmarkwa...@gmail.com
on 11 Jun 2007 at 4:20
> Another question: I see that around Jun-Nov 2006, the original FUSE was
> enhanced to handle interruptions of the user process more robustly. Did these
> changes make it into MacFUSE? (version 2.6.0; release notes below)
The changes can't just "make it into" MacFUSE because MacFUSE (the kernel
portion) is an OS X specific implementation that shares nothing with the Linux
implementation. In our context here, FUSE is an API--a specification. Linux
FUSE is one implementation, MacFUSE is another. MacFUSE will have to have its
own implementation of the FUSE_INTERRUPT message, which it doesn't support yet.
Original comment by si...@gmail.com
on 11 Jun 2007 at 7:53
Thanks, I understand.
So what happens now in the case that the user-space daemon is busy and a signal
arrives at the client program? If the signal is ignored, then the daemon would
not be interrupted, and data should be returned ok. So that suggests that
signals *are* being intercepted before the daemon can return its reply to the
client. Seems like that could disturb the semantics of open(...,O_CREAT) and
write(), which is in fact what is happening here. The perversity of the
emacs-X11 signal model has only magnified the issue.
I tried browsing the MacFUSE kernel code, but I couldn't orient myself enough to
figure out what happens.
Thanks again for your timely help.
Original comment by cbmarkwa...@gmail.com
on 11 Jun 2007 at 9:38
In real life, not having FUSE_INTERRUPT shouldn't bite applications much. I do
intend to implement it--just need to find more time.
Original comment by si...@gmail.com
on 12 Jun 2007 at 5:32
Thanks, but my question was, if MacFUSE doesn't support "interrupts", then why
is the syscall interrupted at all?
Original comment by cbmarkwa...@gmail.com
on 12 Jun 2007 at 9:21
What I said was that MacFUSE doesn't support the FUSE_INTERRUPT *message* of
the FUSE API, which means it doesn't support passing interruption notifications
up to the user-space daemon. You do want system call interruption to still be
possible (to avoid nasty hangs), and it works out fine in typical cases.
Original comment by si...@gmail.com
on 12 Jun 2007 at 10:51
For reference: another report in issue 236.
Original comment by si...@gmail.com
on 8 Jul 2007 at 8:45
I got this from emacs developer YAMAMOTO Mitsuharu:
"In the Carbon port, the SIGALRM duration is 2
seconds by default and it's not too frequent. If some file operation
takes much more time than that period, then it is desirable that it
works with signals so users/applications can interrupt the long
operation, IMO.
You can set [the duration] via `polling-period' if you need."
He asked if there was anything specific that emacs could do differently to make
it work better with MacFUSE. If anybody has ideas, post here and I'll route them
back.
-Bill
Original comment by flowe...@gmail.com
on 27 Jul 2007 at 2:33
I believe that making the duration of the SIGALRM timeout longer than 100 msec
is definitely going to help this issue. While the 100 msec duration is rather
short, Emacs is doing the "right thing" by retrying file operations when they
are interrupted by a SIGALRM. MacFUSE appears to be getting confused because it
allows stateful file operations to be interrupted after the state change has
already occurred.
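For readers unfamiliar with the retry idiom, here is a rough sketch of what "retrying when interrupted" looks like at the system-call level. It is not taken from the Emacs source; `open_retrying` is a made-up name used only for illustration.

```c
/* open_retry.c: hypothetical sketch of the retry-on-EINTR idiom. */
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Calls open() again for as long as it fails with EINTR.
 * Returns the file descriptor, or -1 with errno set for any other error.
 * If `retries` is non-NULL, it receives the number of interrupted attempts.
 */
int open_retrying(const char *path, int flags, mode_t mode, int *retries)
{
    int fd;
    int count = 0;

    while ((fd = open(path, flags, mode)) < 0 && errno == EINTR)
        count++;    /* interrupted by a signal: simply try again */

    if (retries)
        *retries = count;
    return fd;
}
```

The loop is only safe if an interrupted call leaves no side effects behind. If the file system has already created the file when the first attempt reports EINTR, as suspected above, a retry with O_CREAT | O_EXCL could fail with EEXIST even though the caller never saw a successful open.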
Original comment by cbmarkwa...@gmail.com
on 28 Jul 2007 at 9:00
i have pretty much the same problem, so i tried the libsetitimer fix... but i
got stuck because it says:
tcsh: DYLD_INSERT_LIBRARIES=/tmp/libsetitimer.dylib: Command not found.
am i missing something? thanks, and sorry i'm so ignorant.
Original comment by michael....@gmail.com
on 8 May 2008 at 8:11
Original issue reported on code.google.com by
cbmarkwa...@gmail.com
on 11 Jun 2007 at 3:16