sahlberg / libnfs

NFS client library

nfs_pwrite() call hangs if there is insufficient disk space on NFS server #493

Open bishnu1184 opened 3 months ago

bishnu1184 commented 3 months ago

Hi Ronnie,

We are using libnfs version 5.0.1.0. During testing we observed an issue with the nfs_pwrite() API: when the NFS share has insufficient disk space and we still attempt to write, nfs_pwrite() hangs. For example, assume only 2 MB of space is available and we upload a 7 MB file in 1 MB chunks. The first two write calls succeed, then the third write call hangs: the API neither returns nor reports an error.

Is this a known issue, or is there anything we can do differently to avoid the hang?

Thanks, Bishnu

sahlberg commented 3 months ago

That sounds like a bug.

Do you have more details about the setup and how you are doing the writes so I can try to reproduce?

bishnu1184 commented 3 months ago

I can certainly try to explain more details if that will help you in reproducing the issue.

  1. The NFS server is hosted on a RHEL server.
  2. Assume the NFS share is hosted on the /home partition of that machine. We fill the partition with test data so that no space remains available. If we then try to create a file, the file creation fails and the API returns.
  3. Now suppose we have not filled the partition completely and only 2 MB of space is available, but we try to upload a 7 MB file. Because we write the file in chunks, every write operation succeeds while space is still available (i.e. the first 2 MB). Once the space is exhausted, the next write operation gets stuck and nfs_pwrite() never returns from the call.

So in the above case, the expectation is that the API returns a failure because the disk space is exhausted, rather than hanging. Instead, it hangs.
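A minimal sketch of the chunked write pattern described above, showing the expected behaviour (an error return once space runs out). It is not the reporter's actual code: `fake_pwrite()` is a hypothetical stand-in for nfs_pwrite() that simulates a share with 2 MB free, and `upload()` and `CHUNK_SIZE` are illustrative names.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE (1024 * 1024)   /* 1 MB chunks, as in the report */

/* Hypothetical stand-in for nfs_pwrite(): succeeds until a simulated
 * 2 MB of free space is used up, then returns -ENOSPC. */
static uint64_t space_left = 2 * CHUNK_SIZE;

static int fake_pwrite(const void *buf, size_t count, uint64_t offset)
{
    (void)buf;
    (void)offset;
    if (space_left < count)
        return -ENOSPC;
    space_left -= count;
    return (int)count;
}

/* Write `total` bytes in CHUNK_SIZE pieces and stop at the first error.
 * Returns 0 on success or the negative errno from the failed write;
 * *written holds the number of bytes that were stored either way. */
static int upload(const char *data, uint64_t total, uint64_t *written)
{
    *written = 0;
    while (*written < total) {
        uint64_t remaining = total - *written;
        size_t count = remaining < CHUNK_SIZE ? (size_t)remaining : CHUNK_SIZE;
        int ret = fake_pwrite(data + *written, count, *written);

        if (ret < 0)
            return ret;        /* e.g. -ENOSPC: report it, do not retry */
        if (ret == 0)
            return -EIO;       /* no progress: avoid looping forever */
        *written += (uint64_t)ret;
    }
    return 0;
}
```

With a 7 MB buffer, `upload()` in this sketch stores the first two chunks and then returns -ENOSPC on the third write instead of blocking.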

Please let me know if you need any other information.

Thanks, Bishnu

sahlberg commented 2 months ago

I cannot reproduce this on current master. If the write RPC call fails with -ENOSPC then this is correctly returned to the application as an error from the nfs_pwrite() call.

I updated nfs-cp.c to print the RPC layer error as well when nfs_pwrite() fails:

$ rm -f /mnt2/fail; sudo ./utils/nfs-cp 10M nfs://127.0.0.1/mnt2/fail
Failed to write to dest file pwrite call failed with "NFS: Write failed with NFS3ERR_NOSPC(-28)"
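On the application side, a negative return value from the synchronous write call can be treated as a negated errno (the -28 above is -ENOSPC on Linux). A rough sketch of decoding it follows; `describe_write_result()` is a hypothetical helper and not part of libnfs. In a real program the library's own error string is also available via nfs_get_error().

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a message for the result of a write call
 * that follows the "bytes written, or negative errno" convention. */
static const char *describe_write_result(int ret, char *msg, size_t len)
{
    if (ret >= 0)
        snprintf(msg, len, "wrote %d bytes", ret);
    else
        snprintf(msg, len, "write failed: %s (%d)", strerror(-ret), ret);
    return msg;
}
```

Passing -ENOSPC yields a "write failed: ..." message carrying the platform's text for that errno, which the caller can log before aborting the upload.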

Please switch to current master, it contains zero-copy read support which will have a big impact on read-intensive applications. There is an API change in current master compared to earlier versions but this is documented in README.

#ifdef LIBNFS_API_V2

can be used to check whether the new API is available or whether it is the old API.
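A small sketch of how an application might branch on that macro at compile time. `libnfs_api_version()` is a hypothetical helper, and the snippet assumes a real build would include the libnfs header (typically `<nfsc/libnfs.h>`) first, since that header is what would define the macro. The exact argument changes between the two APIs are described in the README and are not reproduced here.

```c
/* A real build would #include <nfsc/libnfs.h> before this check;
 * without it the macro is undefined and the old-API branch is taken. */

/* Hypothetical helper: report which libnfs API the code was built for. */
static int libnfs_api_version(void)
{
#ifdef LIBNFS_API_V2
    return 2;   /* new API (current master) */
#else
    return 1;   /* old API (e.g. the 5.x releases) */
#endif
}
```

The same `#ifdef` can wrap the actual read and write calls so that one source tree builds against either version of the library.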