clintonjade / macfuse

Automatically exported from code.google.com/p/macfuse

Poor performance opening files #252

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
We have everything set up and working functionally... we connect with:

sshfs -ocache=no -onolocalcaches -o 'IdentityFile=XXXX' markf@XXXX:/Users/alanpinstein/dev/sandbox /Volumes/phocoa -oreconnect,volname=PHOCOA

Even on a local network, there is significant lag in opening trivially small 
files (< 4k). Opening 
such files results in a 65kB/s read/write network IO for several seconds.

Interestingly, the deeper the path to the file, the more data is exchanged. See 
attached graphs of 
network IO load recorded from the SERVER.

Although the functionality of MacFUSE is working great now, the performance is 
so poor that my 
developers essentially refuse to use it.

I don't understand why:
1) There is several seconds of additional overhead (time and network IO) for 
each directory down 
in which the file resides.
2) Even though there is this network IO overhead, this is on a local LAN and 
only hitting 
~70kB/sec. We have tested with both AFP and scp and easily hit 1MB/s+ network 
IO between 
these 2 machines.
3) Why is there ANY significant additional overhead to open a 4k file? I can 
see a little IO 
overhead of course, but having to transfer 100k+ of data to open a 4k file 
doesn't seem right.

We tried looking for debug information, log files, etc, but couldn't find 
anything significant. I'd be 
happy to help debug in any way. This is an important tool for my team to be 
able to work 
remotely.

FWIW, browsing the MacFUSE mount in the finder is very acceptable performance 
and doesn't 
seem to be incurring any unreasonable network IO.

Please advise.

Thanks in advance,
Alan

Original issue reported on code.google.com by apinst...@mac.com on 6 Aug 2007 at 3:38


GoogleCodeExporter commented 8 years ago
> sshfs -ocache=no -onolocalcaches -o 'IdentityFile=XXXX' markf@XXXX:/Users/alanpinstein/dev/sandbox /Volumes/phocoa -oreconnect,volname=PHOCOA

I won't rehash (yet again) why "-ocache=no -onolocalcaches" is a *bad* thing, even though you might think
you absolutely need it. Through these options, you're telling both sshfs and MacFUSE to never remember
anything--neither metadata nor data--about the remote files.

> Interestingly, the deeper the path to the file, the more data is exchanged.

Yes, that's how file system lookups work--the more the path components, the 
more the number of lookups. 
Because you've turned off lookup caching, anytime a directory lookup is 
performed, things have to go all the 
way from your application to the kernel to the sshfs daemon to the server and 
back all the way.

> Even on a local network, there is significant lag in opening trivially small files (< 4k). Opening
> such files results in a 65kB/s read/write network IO for several seconds.

"Opening" a file shouldn't depend on the size of the data within the file. As 
for 65KB/s write I/O, well, 
depending on which application is opening the file, it could be just listing 
directory contents, etc., which, 
given the cacheless operation, will go over the network every time. It also 
depends upon the layout of your 
remote file system (how many files in a directory, etc.). I don't know why 
there is write I/O. Maybe the 
application is creating a scratch or backup file. If there's write I/O, well, 
somebody is writing--MacFUSE itself 
doesn't do reads or writes by itself.

Try removing the "-ocache=no -onolocalcaches" arguments as a first step to see 
what kind of performance 
you get. The cacheless operation thing is a heavy hammer. The correct way to 
solve the "but I really do want to 
pick up all remote file changes that happened on some other machine" is not 
through these options, but 
through a file system protocol that is aware of such changes. That's why people 
created NFS, Coda, and such. 
People expect sshfs to be something that it's not, hence the issues.

All said, something still sounds weird about your environment. I use sshfs, both on an 802.11g wireless
network and on 100 megabit/Gigabit LANs, and I've not experienced what you're describing--even *with*
cacheless operation.

Another experiment you could do is to connect two Macs--say, two MacBook Pros or something--directly
through an Ethernet cable (the so-called "crossover" connection), then mount one on the other through sshfs.
See how the performance is with and without cacheless operation.
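
For the with/without comparison, a small dry-run helper can keep the two command lines straight. This is only a sketch: the `sshfs_cmd` name is hypothetical, the options and paths are copied from the report above (placeholders such as XXXX are left as they are), and nothing is actually mounted.

```shell
# Print the sshfs command line for a "cached" or "cacheless" mount.
# The options and paths are the ones quoted in the report; nothing is executed.
sshfs_cmd() {
    target="-o 'IdentityFile=XXXX' markf@XXXX:/Users/alanpinstein/dev/sandbox /Volumes/phocoa"
    case $1 in
        cacheless) echo "sshfs -ocache=no -onolocalcaches $target -oreconnect,volname=PHOCOA" ;;
        cached)    echo "sshfs $target -oreconnect,volname=PHOCOA" ;;
        *)         echo "usage: sshfs_cmd cached|cacheless" >&2; return 1 ;;
    esac
}

sshfs_cmd cached
sshfs_cmd cacheless
```

Mount with one variant, time the same operation on the same small file (for example, opening it in the editor that feels slow), unmount, and repeat with the other variant.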

Original comment by si...@gmail.com on 6 Aug 2007 at 4:36

GoogleCodeExporter commented 8 years ago
First off, thanks for responding so quickly! That is very nice of you for the 
community.

So, we reproduced our experiments based on some of your ideas.

1) It is the applications... using "cat" on the remote mount was instantaneous. 
TextEdit.app was a little slower, with BBEdit and TextMate being horrible.
2) It is the caching; we turned the cacheless options OFF (so caching is back on) and things were instant again as well.

Also, I think it is finally getting through my head what you mean by "MacFUSE 
is not a filesystem". However, I must push back on you a little bit, 
because the expectation from the name and marketing of the project doesn't 
really reinforce this.

At this point, I think I understand how I should view MacFUSE; as a 
Finder-compatible FTP interface, not a filesystem. Right?

Also, I'd point out that you say that:

> I won't rehash (yet again) why "-ocache=no -onolocalcaches" is a *bad* thing, even though you might think
> you absolutely need it. Through these operations, you're telling both sshfs and MacFUSE to never remember
> anything--neither metadata nor data, about the remote files.

However, I just looked at the FAQ entry again about this and there is nothing 
in there to indicate that you think it's a bad idea or that performance 
gets horrible when using apps that talk a lot over the FS. It's just not clear, 
but at least now I understand the issue.

Also, I'd say that while now I understand how it can be so slow, I am still 
curious as to WTF these apps are DOING to thrash the FS so much. I would 
have been able to "help myself" if there existed a debug mode of some kind that 
would list the file operations as they were requested as a debug tool.

Other than that, I see your point(s) and thank you for elucidating! We will
just work with caching turned on again and keep out of each other's way (i.e.,
pretend it's just a convenient SFTP interface).

Thanks,
Alan

Original comment by apinst...@mac.com on 6 Aug 2007 at 5:58

GoogleCodeExporter commented 8 years ago
> Also, I think it is finally getting through my head what you mean by "MacFUSE is not a filesystem".
> However, I must push back on you a little bit, because the expectation from the name and marketing of the
> project doesn't really reinforce this.

MacFUSE is a file system in the technical sense. It's a file system from the 
kernel/system's standpoint. 
However, whereas conventional file systems would either store things on disk or 
across the network, MacFUSE 
depends on a normal user program providing file system "data". In turn, the 
user program can get the data 
from wherever it wants: local disk, across the network, or just cook it up in 
memory.

> At this point, I think I understand how I should view MacFUSE; as a Finder-compatible FTP interface, not a
> filesystem. Right?

No, not at all--this is incorrect. The problem is that people's understanding 
of the term "file system" is all 
over the place based on their computing backgrounds. Maybe you don't need to 
view MacFUSE as anything at 
all. There's MacFUSE, and then there are specific instances of file systems 
written *on top of* MacFUSE. sshfs 
and ftpfs are two examples. MacFUSE doesn't know where the data is ultimately 
coming from--it's the user 
program (sshfs in this case) that gets the data. Perhaps the following will 
help:

* From the operating system's standpoint, MacFUSE is the file system.
* From MacFUSE's standpoint, the user space program (like sshfs) is the file 
system.
* From many Mac end users' standpoint, what they see in the Finder is the file 
system.

sshfs uses MacFUSE to make things look like a file system, and uses SFTP to 
actually get/put those things. 
Another program could use MacFUSE for the same purpose, but use a different 
protocol (other than SFTP) for 
across-the-network communication. That protocol could actually be specifically 
geared for concurrent file 
sharing, attribute caching, distributed locking and such--SFTP is not. *Such 
things* are examples of what I 
mean when I say that MacFUSE doesn't know about the storage-facing specifics of 
the file system. Its job is to 
take what the user program gives it and make it look like a file system. The 
Finder is just a GUI atop all this. I 
hope this helps clarify.

> However, I just looked at the FAQ entry again about this and there is nothing in there to indicate that you
> think it's a bad idea or that performance gets horrible when using apps that talk a lot over the FS. It's just
> not clear, but at least now I understand the issue.

I've rehashed it in the past issues, and in the macfuse-devel forum. I don't 
have the resources to put 
everything in all sorts of documentation, but I try.

Plus, performance doesn't get horrible for everybody (I gave you my counterexample). Given things like
resource forks, custom icons, Finder flags, and whatnot, file system traffic is much higher on Mac OS X (than,
say, on Linux) in *apparently similar* circumstances. If you enable debugging (the -d option while
mounting), you'll see many calls for "._" files. You amplified the cost of this traffic by turning off all caches.
And I still think there's something else going on in your environment, because I don't see similar issues.

> Also, I'd say that while now I understand how it can be so slow, I am still curious as to WTF these apps
> are DOING to thrash the FS so much. I would have been able to "help myself" if there existed a debug
> mode of some kind that would list the file operations as they were requested as a debug tool

When you run sshfs, run it from the command line and add the '-d' option.
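
A small sketch of how that debug output could be sifted for the "._" (AppleDouble) requests. The exact format of the -d output is not shown in this thread, so this only assumes that each logged operation is on its own line and includes the file path, and that the debug output goes to stderr. The log file name and the `count_appledouble` helper are hypothetical.

```shell
# Capture the debug output first (sshfs stays in the foreground with -d), e.g.:
#   sshfs -d <same arguments as before> 2> sshfs-debug.log
#
# Count the logged lines whose path has a component starting with "._".
# Reads the log on stdin and prints the number of matching lines.
count_appledouble() {
    # grep -c still prints 0 when nothing matches, but exits non-zero;
    # "|| true" keeps the helper's exit status clean in that case.
    grep -c '/\._' || true
}
```

Running `count_appledouble < sshfs-debug.log` after opening a single file in the slow application gives a rough idea of how much of the traffic is for "._" files.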

Original comment by si...@gmail.com on 6 Aug 2007 at 6:53

GoogleCodeExporter commented 8 years ago
Thanks for the clarifications. It doesn't pay to go back through all of the 
miscommunications we're having 
about how I am mentally building a model of how the system works.

Suffice it to say that we got things working well enough again, and I 
definitely understand the stack a little 
more now.

We also ran a debug session (this time with -d rather than -o sshfs_debug, which didn't do much) and we
saw all of the traffic you mention from these apps. The way these apps hit the FS all the time definitely
causes a lot of problems.

I can tell you though, that I do try to read and learn before going to issue 
trackers and such, and I think it 
would have helped out a lot if there were readily available info explaining the 
pieces of the stack (FUSE, sshfs, 
etc etc). Even if they were links to other tutorials. I think the part that 
really hung me up is that the home page 
says:

"MacFUSE implements a mechanism that makes it possible to implement a fully 
functional file system in a 
user-space program on Mac OS X..."

While I think I see now that the extent to which "fully functional" holds in reality depends on the *fs
implementation, that's not immediately clear from the home page. This sentence (which was my first intro to
FUSE) drilled into my head that everything MacFUSE implements should work like a "normal" filesystem
(on disk or through AppleShare/SMB, etc.).

Anyway, just a thought.

Thanks again for your quick responses. It's much appreciated!

Original comment by apinstei...@gtempaccount.com on 7 Aug 2007 at 12:55

GoogleCodeExporter commented 8 years ago

Original comment by si...@gmail.com on 7 Aug 2007 at 7:37