nfs-ganesha / nfs-ganesha

NFS-Ganesha is an NFSv3, v4, v4.1 file server that runs in user mode on most UNIX/Linux systems

NFS-Ganesha + GlusterFS: deleting a file does not release the capacity; the deleted file is kept in the .glusterfs/unlink/ folder #1057

Open feikiss opened 8 months ago

feikiss commented 8 months ago

I use NFS-Ganesha to export a GlusterFS volume. If files are copied on the NFS client and I then delete them in the NFS server path, they are moved to the .glusterfs/unlink folder instead of being removed.

Versions: nfs-ganesha 4.4-0.1, glusterfs 10.0

steps to reproduce:

  1. Create GlusterFS volume gv0 on Server A and export gv0 as /share via nfs-ganesha
  2. On Client B, mount the export: `mount -t nfs A:/share /mnt`
  3. On Client B, do some I/O: `cp /mnt/text.zip /home/temp/`
  4. On Server A, delete the file: `rm /share/text.zip`
  5. The file is then moved to /share/.glusterfs/unlink and the disk capacity is still not released.

Is there any configuration in NFS-Ganesha that can fix this? Thanks.

kalebskeithley commented 8 months ago

Until the file is deleted from .glusterfs/unlink the space is not going to be released — that's normal Unix/Linux file system semantics. It's not a bug in ganesha, there's nothing ganesha can do about it.

It's probably a bug in glusterfs if the files are not quickly removed from .glusterfs/unlink.

ffilz commented 8 months ago

This may be happening because some other process has the file open. Ganesha should close any opens it has for files that are deleted.

cksincere commented 8 months ago

Continuing to verify this issue, I found that after restarting the nfs-ganesha service, the unlink files can be cleaned up immediately. Is this a known issue?

ffilz commented 8 months ago

What NFS versions are the clients using?

Are you using FSAL_GLUSTER?

Is there a single Ganesha node, or are there two? It's not quite clear what your setup is.

This sounds like normal behavior if the clients are using NFSv3, since in that case Ganesha keeps open file descriptors for some time after I/O (to avoid an open/read/close or open/write/close cycle on every request). If an fd is kept open, Gluster removes the name from the directory (to accomplish the unlink) but has to retain the file somewhere so processes with an open fd can still access it.

Oh, it looks like FSAL_GLUSTER holds a reference to the handle as long as Ganesha's mdcache has an entry for the file. Ganesha will only uncache the entry when the unlink comes through that Ganesha instance. So if you have multiple Ganesha instances, the other instances may hold an entry, same if the unlink (as in your case) is done external to Ganesha.

Kaleb, what level of invalidate upcalls does FSAL_GLUSTER support?

Ganesha 5.7 does provide a bit more cache management that might flush out the mdcache quicker and thus release space quicker.
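For the upcall question: in Ganesha/Gluster deployments, cache-invalidation upcalls are typically switched on from the Gluster side, so that changes made outside a given Ganesha instance (such as the external `rm` here) can invalidate its mdcache. A sketch, assuming the gv0 volume from this issue; verify the option names and timeout value against your Gluster version:

```
# Enable upcall-based cache invalidation on the volume
gluster volume set gv0 features.cache-invalidation on
gluster volume set gv0 features.cache-invalidation-timeout 600
```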

feikiss commented 8 months ago

> What NFS versions are the clients using?
>
> Are you using FSAL_GLUSTER?
>
> Is there a single Ganesha node, or are there two? It's not quite clear what your setup is.
>
> This sounds like normal behavior if the clients are using NFSv3 since in that case, Ganesha keeps open file descriptors for some time after I/O (to prevent every I/O from being open/read or write/close). If a fd is kept open, then Gluster needs to remove the name from the directory (to accomplish the unlink) but has to retain the file somewhere so the processes with open fd can still access it.
>
> Oh, it looks like FSAL_GLUSTER holds a reference to the handle as long as Ganesha's mdcache has an entry for the file. Ganesha will only uncache the entry when the unlink comes through that Ganesha instance. So if you have multiple Ganesha instances, the other instances may hold an entry, same if the unlink (as in your case) is done external to Ganesha.
>
> Kaleb, what level of invalidate upcalls does FSAL_GLUSTER support?
>
> Ganesha 5.7 does provide a bit more cache management that might flush out the mdcache quicker and thus release space quicker.

Hi @ffilz, environment details: nfs-ganesha 4.4-0.1, glusterfs 10.0. Yes, we are using multiple Ganesha instances (3) on the same Gluster volume, exporting the NFS service via an nginx proxy.

We eventually found a likely cause: the Path and the Pseudo were not the same. When we set them to the same value, the problem no longer occurred. We will keep watching the change and update here.

Detailed config:

```
EXPORT {
    Export_Id = 12323;           # Export ID, unique per export
    Path = "/var/share/data";    # Path of the volume to be exported. Eg: "/test_volume"

    FSAL {
        name = GLUSTER;
        hostname = "127.0.0.1";
        volume = "gv0";          # Volume name. Eg: "test_volume"
    }

    Disable_ACL = TRUE;          # To enable/disable ACL
    Pseudo = "/pseudo_nv0";      # If not the same as Path, the problem occurs.
    Protocols = "3","4";         # NFS protocols supported
    Transports = "UDP","TCP";    # Transport protocols supported
    SecType = "sys";             # Security flavors supported
}
```

Must Pseudo and Path be kept the same? Is this a bug or expected behavior?
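The workaround described above, expressed as a config fragment (values taken from the posted export; whether this should be necessary is exactly the open question):

```
EXPORT {
    ...
    Path   = "/var/share/data";
    Pseudo = "/var/share/data";   # workaround: kept identical to Path
    ...
}
```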

ffilz commented 8 months ago

I'm not sure why having Path and Pseudo the same impacts this, though you might consider NFS_CORE_PARAM { Mount_Path_Pseudo = TRUE; }, which makes NFSv3 clients specify the same path on mount as NFSv4 clients.
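The suggested setting goes at the top level of ganesha.conf; a minimal fragment (the option name comes from the comment above, the block layout is standard Ganesha config):

```
NFS_CORE_PARAM {
    # Make NFSv3 clients mount using the Pseudo path, so v3 and v4
    # clients use the same path string at mount time.
    Mount_Path_Pseudo = TRUE;
}
```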

Could you please share more details of the mount commands and exactly where the rm of the file is done?

feikiss commented 8 months ago

@ffilz, more details on the mount commands: we only mount in NFSv3 mode; the command on the client is `mount -t nfs VIP:/var/share/data/ /mnt`.

Where is the rm of the file done? The rm is done on the server side where Ganesha is installed; the NFS client has no write/delete permission.

full configuration of nfs-ganesha :

```
EXPORT_DEFAULTS {
    Access_Type = RO;
    Anonymous_uid = 12345;
    Anonymous_gid = 12345;
    Squash = root_squash;
    Transports = "UDP","TCP";
}

EXPORT {
    Export_Id = 12323;
    Path = "/var/share/data";

    FSAL {
        name = GLUSTER;
        hostname = "127.0.0.1";
        volume = "gv0";
    }

    Disable_ACL = TRUE;
    Pseudo = "/pseudo_nv0";
    Protocols = "3","4";
    Transports = "UDP","TCP";
    SecType = "sys";
}

LOG {
    Default_Log_Level = INFO;
    Facility {
        name = FILE;
        destination = "/var/log/glusterfs/ganesha.log";
        enable = active;
    }
}
```

feikiss commented 7 months ago

Hi @ffilz, is there any other info I should provide?