gluster / glusterfs

Gluster Filesystem : Build your distributed storage in minutes
https://www.gluster.org
GNU General Public License v2.0

prevent gnfs IO Errors on smaller files #4319

Closed erikja closed 1 month ago

erikja commented 1 month ago

In certain situations, smaller files will report I/O errors when accessed from NFS using Gluster NFS. With our settings, files up to 170M could report this in some cases. It was not a consistent failure.

Disabling the NFS performance I/O cache seemed to work around the instances of the problem observed for non-sharded volumes.

Research showed that Gluster NFS relies on an errno return value of EINVAL to detect EOF and set is_eof. However, in some code paths this value was not retained or was reset to zero.

This change passes the errno along so it can be used by gluster NFS. We found the issue in the shard xlator and the io-cache xlator.
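To illustrate the failure mode described above, here is a minimal sketch in C. It is not the actual glusterfs code: `struct read_reply`, `bottom_read`, `middle_drops_errno`, `middle_keeps_errno`, and `consumer_is_eof` are hypothetical names, and the rule that a short read carries EINVAL is an assumption made only for this sketch. It shows how an intermediate layer that resets the errno hides EOF from a consumer that keys on EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for a read reply travelling up a translator stack. */
struct read_reply {
    ssize_t op_ret;   /* number of bytes read */
    int     op_errno; /* errno-style hint that accompanies the reply */
};

/* Bottom layer (assumption for this sketch): a read that comes back short
 * because it hit the end of the file carries EINVAL as the EOF hint. */
static struct read_reply bottom_read(size_t file_size, size_t offset, size_t len)
{
    struct read_reply r = {0, 0};
    size_t avail;

    assert(len > 0);
    avail = offset < file_size ? file_size - offset : 0;
    r.op_ret = (ssize_t)(avail < len ? avail : len);
    if ((size_t)r.op_ret < len)
        r.op_errno = EINVAL;
    return r;
}

/* Problematic middle layer: forwards the byte count but resets the errno
 * to zero, so the EOF hint is lost. */
static struct read_reply middle_drops_errno(struct read_reply in)
{
    struct read_reply out = {in.op_ret, 0};
    return out;
}

/* Corrected middle layer: passes the errno along unchanged. */
static struct read_reply middle_keeps_errno(struct read_reply in)
{
    return in;
}

/* Consumer: decides EOF purely from the errno, as the description above
 * says Gluster NFS does when it sets is_eof. */
static bool consumer_is_eof(struct read_reply r)
{
    return r.op_errno == EINVAL;
}
```

In this sketch, a 20-byte read at offset 90 of a 100-byte file returns 10 bytes with EINVAL. Routed through `middle_keeps_errno`, the consumer sees EOF; routed through `middle_drops_errno`, it does not, which corresponds to the class of bug the change addresses in the shard and io-cache xlators.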

gluster-ant commented 1 month ago

Can one of the admins verify this patch?

erikja commented 1 month ago

An update. I was able to dedicate some time and come up with a more repeatable test case isolated from our more complicated usage environment. I have documented each step of the way.

I was able to reproduce the problem with the simplified setup requested on the mailing list, in both the sharded and non-sharded cases. The two cases go down different code paths, but both paths have a similar problem in one spot.

I am getting ready to get my patch going, modified as suggested in this PR, and I will send an update with results.

If things look good, I will mail the test case to the community, do an update here, force-push an improved version, mark this as ready for review, etc. But that's getting ahead of myself. More soon.

erikja commented 1 month ago

I am not used to how GitHub works and made a new PR by accident. But the new PR does have the new change, so I will close this one.

erikja commented 1 month ago

https://github.com/gluster/glusterfs/pull/4322