nfs-ganesha / nfs-ganesha

NFS-Ganesha is an NFSv3, v4, and v4.1 fileserver that runs in user mode on most UNIX/Linux systems.

How to reduce the number of backend open calls in NFS-Ganesha to improve performance? #979

Closed zhengfeihu51 closed 6 months ago

zhengfeihu51 commented 1 year ago

We use vdbench to test the read and write performance of NFS-Ganesha. When the client mounts using the v4 protocol, the nfs4_op_open interface is called before each read or write operation, which in turn calls the backend's open interface. The likely reason is that the client calls the close interface after each read or write operation.

We conducted a comparison test using the v3 protocol and found that the performance was much better than the v4 protocol. With the v3 protocol, the file handle is only opened when the file is created, and then the handle can be reused for subsequent read and write operations. I would like to know if it is possible for the v4 protocol to not open or close the file handle like the v3 protocol does, or if it is possible to reduce the frequent calls to the backend's open interface in order to improve Ganesha's performance (our test scenario does not involve multiple clients accessing the same file).
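The per-operation pattern the reporter describes can be sketched as follows. This is purely illustrative (not nfs-ganesha or vdbench code): it counts application-level opens against a local temp file, on the assumption that over an NFSv4 mount each such open()/close() pair would be proxied by the kernel client as a protocol OPEN/CLOSE.

```python
# Illustrative sketch only: why an open/close-per-record workload
# generates one NFSv4 OPEN and CLOSE per read on an NFSv4 mount.
import os
import tempfile

def read_records_open_per_record(path, nrecords, recsize):
    """The pattern described in the report: open, read one record, close."""
    opens = 0
    for i in range(nrecords):
        fd = os.open(path, os.O_RDONLY)   # would be an NFSv4 OPEN over NFS
        opens += 1
        os.pread(fd, recsize, i * recsize)
        os.close(fd)                      # would be an NFSv4 CLOSE over NFS
    return opens

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 4096)
    path = f.name
print(read_records_open_per_record(path, 8, 512))  # prints 8: one open per read
os.unlink(path)
```

Under NFSv3 the same loop would reuse the filehandle, which is why the comparison test above shows better numbers.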

mattbenjamin commented 1 year ago

NFSv4 is designed to statefully track opens. Ganesha opened and closed file handles because the client did. The client need not do that.

Matt

zhengfeihu51 commented 1 year ago

NFSv4 is designed to statefully track opens. Ganesha opened and closed file handles because the client did. The client need not do that.

Matt

The client is the Linux kernel, and we cannot control its behavior. Can we optimize it from the server side?

ffilz commented 1 year ago

There really isn't anything we can or should do here. As Matt mentioned, NFSv4 is a stateful protocol and each open represents state that must be maintained. Ganesha doesn't know what the back end filesystem will do with the open state. For example, some projects that utilize Ganesha do some integration with SMB. Minimally, the open file handle assures that the file will not be deleted externally to Ganesha, or that the filesystem will not be unmounted.

I don't see how we could really do anything without reducing the integrity of the protocol implementation.

mattbenjamin commented 1 year ago

Also, while the client is the Linux kernel, what I meant to say was, "the application did." When the Linux client does an open, it's because some application is doing an open. The client doesn't do stateless opens, but it doesn't invent opens either; it just proxies them for whatever the application is doing.

zhengfeihu51 commented 1 year ago

If we only want to optimize the CephFS backend, do you have any suggestions? For example, when Ganesha handles the close operation, it could avoid calling the backend's close interface, and when it handles the open operation, it could skip the backend's open interface. It seems unnecessary to perform these operations so frequently, as they simply close and reopen file handles for the same file. @ffilz @mattbenjamin

dang commented 1 year ago

It would be a big spec violation if we didn't close a handle when the client closed the handle, and it could cause many issues, including data loss. It's not uncommon for data and/or metadata to not be committed to disk until the FD is closed. If we artificially keep it open, then crash, data and/or metadata could be lost that the client expects to be persistent.

The proper solution is to fix the workload. (Note, this is not the Linux kernel; it is the actual application doing the work.) The workload knows when files need to be opened and closed, and should do it properly.
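The "fix the workload" advice amounts to restructuring the application's I/O loop. A minimal sketch (illustrative only, not Ganesha code): open once, do all the I/O on the same fd, close once, so an NFSv4 mount would see a single OPEN/CLOSE pair for the whole run instead of one per record.

```python
# Sketch of the corrected workload pattern: one open, many reads, one close.
import os
import tempfile

def read_records_open_once(path, nrecords, recsize):
    opens = 0
    fd = os.open(path, os.O_RDONLY)       # one NFSv4 OPEN for the whole loop
    opens += 1
    try:
        for i in range(nrecords):
            os.pread(fd, recsize, i * recsize)
    finally:
        os.close(fd)                      # one NFSv4 CLOSE at the end
    return opens

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 4096)
    path = f.name
print(read_records_open_once(path, 8, 512))  # prints 1: one open regardless of record count
os.unlink(path)
```

The close still flushes data to the server, so close-to-open consistency is preserved; the workload just stops paying the OPEN/CLOSE cost per record.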

mattbenjamin commented 1 year ago

Right, that's what I wrote above. I believe Linux clients all send open and close ops to the NFS server/MDS in response to application opens and closes; NFS close-to-open semantics relies on this. If this is a distributed NFS workload, then yes, the endpoints are in fact going to need to close and re-open files to reliably see changes made at other clients in the Linux client implementation, and that is what the NFS spec directs.

Meanwhile, while the native CephFS wire protocol doesn't actually track opens per se, libcephfs uses opens and closes to manage cap lifetime. This probably is slow. Going with the hypothetical, yes, you could probably get away with caching Ceph handle opens in FSAL_CEPH in some circumstances. It would maybe work so long as there was no contention with other CephFS clients (which could be other nfs-ganesha instances); a read-only workload would be safe. I'd be very surprised if this didn't eventually break down badly, however, as soon as some client starts writing data.
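The hypothetical handle cache being discussed can be sketched as below. This is NOT nfs-ganesha code (FSAL_CEPH is written in C, and the maintainers reject this approach as unsafe under write contention); the class name, fields, and counters are all invented for illustration. It only shows the idea: keep the backend handle open across protocol-level CLOSE/OPEN of the same file, so repeated opens hit the cache instead of the backend.

```python
# Purely hypothetical sketch of caching backend handle opens.
# This is exactly the spec-violating shortcut warned against above:
# close() deliberately does not release the backend handle.

class HandleCache:
    def __init__(self):
        self._handles = {}      # inode -> cached backend handle
        self.backend_opens = 0  # how many real backend opens happened

    def open(self, inode):
        if inode not in self._handles:
            self.backend_opens += 1        # a real backend open would go here
            self._handles[inode] = object()
        return self._handles[inode]

    def close(self, inode):
        # Keep the backend handle open; nothing is released or flushed,
        # which is why this breaks down once another client writes.
        pass

cache = HandleCache()
for _ in range(1000):          # 1000 protocol-level opens of the same file...
    cache.open(42)
    cache.close(42)
print(cache.backend_opens)     # prints 1: only one backend open happened
```

The sketch makes the performance appeal obvious, and also why it is dangerous: nothing in `close()` releases caps or commits data, so contention and crash consistency are both unhandled.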

zhengfeihu51 commented 1 year ago

I understand that there is significant risk in this approach. Assuming the application cannot be modified, and assuming there is no contention between clients accessing the same file, would it theoretically be feasible? I made some modifications locally; with a single client and a single Ganesha server, the operations sometimes complete successfully. However, when multiple clients are reading and writing data through the same Ganesha server, the server may crash (perhaps there is an issue with the modifications I made to the Ganesha code). @dang @mattbenjamin @ffilz

mattbenjamin commented 1 year ago

@zhengfeihu51 I think changing the NFS implementation is a dead end. I don't understand the motivation to run a stateful workload but then try to fake its state. I suspect there may be significant opportunity to optimize the performance of libcephfs; that seems like a direction that could get community traction.

dang commented 1 year ago

There may be a way to make the client use stateless NFSv4. I don't know if the Linux client supports this or not.

dang commented 1 year ago

Oh, and for the record, I think it would take a lot of convincing to accept a patchset upstream that does this. It's too dangerous, and too much of a violation of the spec.

ffilz commented 1 year ago

Oh, and for the record, I think it would take a lot of convincing to accept a patchset upstream that does this. It's too dangerous, and too much of a violation of the spec.

I agree. There are no server-side changes acceptable to address this issue.

Ganesha would definitely support the use of the anonymous stateid, which basically makes I/O work like NFSv3 (except it interacts a bit better with deny modes). That means opening the global fd, which is then eventually closed by the FD LRU reaper thread.

That WILL implicate some of the issues Matt raised about files being left open.

Getting the client to use anonymous stateid though is tricky....

Might in this case be better to just use NFSv3...
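The global-fd-plus-reaper behavior mentioned above can be sketched conceptually. This is not Ganesha's actual mdcache/FD-LRU implementation; the class, limit, and fake fds are invented for illustration of the idea: anonymous-stateid I/O reuses a cached per-file fd, and a reaper closes the least recently used fds once too many accumulate.

```python
# Conceptual sketch of a global-fd cache with an LRU reaper.
from collections import OrderedDict

class FdLru:
    def __init__(self, limit):
        self.limit = limit
        self._fds = OrderedDict()   # path -> fake fd, least recently used first
        self.closed = []            # paths whose fds the reaper has closed

    def get_fd(self, path):
        if path in self._fds:
            self._fds.move_to_end(path)            # mark most recently used
        else:
            self._fds[path] = len(self.closed) + len(self._fds)  # fake fd
            self._reap()
        return self._fds[path]

    def _reap(self):
        # Reaper: close least-recently-used fds beyond the limit.
        while len(self._fds) > self.limit:
            path, _fd = self._fds.popitem(last=False)
            self.closed.append(path)               # a real close(2) would go here

lru = FdLru(limit=2)
for p in ["a", "b", "c"]:
    lru.get_fd(p)
print(lru.closed)   # prints ['a']: the oldest fd is reaped once the limit is exceeded
```

This is why the anonymous-stateid path "implicates some of the issues Matt raised": files stay open past the protocol-level operation, until the reaper gets to them.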

ffilz commented 6 months ago

Closing