Sometimes the available disk space doesn't increase after we remove a file with the rm command on a Linux system, even though the file can no longer be found anywhere on the system. I ran the following test on a Linode Linux machine:
Before creating a file:
root@localhost:~# df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/root 20G 3.1G 16G 17% /
Create a file test_rm.log:
root@localhost:~# for i in `seq 1 100`
> do
> cat /usr/bin/python2.7 >> test_rm.log
> done
root@localhost:~# ls -lsh test_rm.log
361M -rw-r--r-- 1 root root 361M Aug 21 06:43 test_rm.log
root@localhost:~# df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/root 20G 3.4G 16G 19% /
The used disk space has increased. But after I removed the file with rm, the space was not released:
root@localhost:~# rm test_rm.log
root@localhost:~# df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/root 20G 3.4G 16G 19% /
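Why? Is another process still holding the deleted file? Running lsof | grep '(deleted)' finds it: an ipython process had opened the file before it was removed, so the disk space had not been reclaimed. After the ipython process exited, the used space decreased as expected.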
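The same check can be sketched in Python by walking /proc directly. This is only a rough approximation of what lsof | grep '(deleted)' reports, assuming a Linux procfs is mounted at /proc. The function name find_deleted_open_files is hypothetical, and without root it only sees processes whose fd directory the current user may read.

```python
import os


def find_deleted_open_files(proc_root="/proc"):
    """Return (pid, fd, target) for open descriptors whose file was unlinked."""
    found = []
    if not os.path.isdir(proc_root):
        return found  # no procfs here (not Linux): nothing to report
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed while we were scanning
            # The kernel appends " (deleted)" to the link target of an unlinked file.
            if target.endswith(" (deleted)"):
                found.append((int(entry), int(fd), target))
    return found


if __name__ == "__main__":
    for pid, fd, target in find_deleted_open_files():
        print(pid, fd, target)
```

Each reported pid is a process that still keeps a removed file alive; the space comes back once that descriptor is closed or the process exits.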
But how does rm really work? The strace output of rm shows that the most important syscall is unlinkat (which operates in the same way as unlink). From the unlink(2) man page:
unlink() deletes a name from the filesystem. If that name was the last link to a file and no processes have the file open, the file is deleted and the space it was using is made available for reuse.
If the name was the last link to a file but any processes still have the file open, the file will remain in existence until the last file descriptor referring to it is closed.
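A minimal sketch of this behaviour, assuming a local Linux filesystem for the temporary directory. The helper name unlink_while_open is made up for illustration. It removes the only name of a file while a descriptor is still open and then looks at what is left.

```python
import os
import tempfile


def unlink_while_open():
    """Unlink a file that is still open and report what remains observable."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"still here")
        os.unlink(path)                      # remove the last name of the file
        name_exists = os.path.exists(path)   # the directory entry is gone
        links = os.fstat(fd).st_nlink        # link count seen through the open fd
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 100)              # contents are still readable via the fd
    finally:
        os.close(fd)                         # last reference to the file is dropped here
    return name_exists, links, data


if __name__ == "__main__":
    print(unlink_while_open())
```

The name disappears immediately, but the data stays reachable through the descriptor until it is closed, which matches the man page text above.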
So the picture is clear: test_rm.log was the last link, but the file was still referred to by a file descriptor in another process. There would be no advantage in having the kernel reclaim the disk space as soon as unlink is called on the last link. While other file descriptors still refer to the deleted file, those processes may continue to write data into it, each at its own offset. The kernel would then have to allocate space again, and the larger the offset, the more space it needs.
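To illustrate that a deleted file can keep growing, here is a small sketch; grow_after_unlink and its parameters are hypothetical names. It keeps writing through a descriptor after the name has been removed and compares the file size before and after.

```python
import os
import tempfile


def grow_after_unlink(chunk=b"x" * 4096, writes=4):
    """Write to a file after it was unlinked; return its size before and after."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, chunk)
        os.unlink(path)                       # the file has no name from here on
        size_before = os.fstat(fd).st_size
        for _ in range(writes):
            os.write(fd, chunk)               # continues at this descriptor's offset
        size_after = os.fstat(fd).st_size
    finally:
        os.close(fd)
    return size_before, size_after


if __name__ == "__main__":
    print(grow_after_unlink())
```

The size keeps increasing even though no path refers to the file any more, so the filesystem still has to provide space for it.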
If you want to free disk space by removing files with rm, you can avoid a lot of strange cases by first checking whether any processes still have the file open. The answers to find-and-remove-large-files-that-are-open-but-have-been-deleted are also a good reference.
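When the holding process cannot simply be stopped, one commonly used workaround is to truncate the deleted file through its /proc/<pid>/fd/<fd> entry. The sketch below assumes Linux procfs and write permission on the file. The names truncate_deleted and demo are hypothetical, and this is not necessarily the approach the answer mentioned above recommends.

```python
import os
import tempfile


def truncate_deleted(pid, fd):
    """Truncate the file behind /proc/<pid>/fd/<fd> to zero bytes (data is lost)."""
    os.truncate("/proc/%d/fd/%d" % (pid, fd), 0)


def demo():
    """Hold a deleted file open in this process, then truncate it via /proc."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"x" * 8192)
        os.unlink(path)                       # file is now open but deleted
        before = os.fstat(fd).st_size
        truncate_deleted(os.getpid(), fd)     # release its data blocks
        after = os.fstat(fd).st_size
    finally:
        os.close(fd)
    return before, after


if __name__ == "__main__":
    print(demo())
```

Truncating throws the contents away while the process keeps its descriptor. If that process later writes at its old offset without O_APPEND, the file becomes sparse again rather than staying empty, so this is best reserved for files such as logs whose contents are no longer needed.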