laurent22 / rsync-time-backup

Time Machine style backup with rsync.

hardlinks not working #190

Open williwoods opened 4 years ago

williwoods commented 4 years ago

Super newb here so I hope I am not missing something super obvious.

When I run your script each file is being written again despite there being an existing copy of the file from a previous backup. I also ran out of space on my destination so it appears that auto deletion is also not functioning.

I tested this with a simple 27 MB test source directory on the desktop, and the destination grew by exactly 27 MB each time I ran it. So after three runs the destination folder was ~81 MB.

I haven't customized the default behavior in any way. When I check the flags the script runs with, I get the following:

-D --numeric-ids --links --hard-links --one-file-system --itemize-changes --times --recursive --perms --owner --group --stats --human-readable

What am I doing wrong?

prontog commented 4 years ago

While running some tests on Windows 10 with WSL (Ubuntu 16.04) I noticed the same thing. Since I mainly use Linux, I didn't troubleshoot it further. A couple of guesses:

  1. The destination file system does not support hard links.
  2. Hard links are actually created, but the program that calculates the size on disk does not take them into account.
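The first guess is easy to test directly. A minimal sketch (the `DEST` variable is a placeholder for your backup destination path; it defaults to the current directory here purely for illustration):

```shell
# Probe a destination filesystem for hard-link support by actually creating
# a link. DEST is a placeholder: point it at your backup destination
# (defaults to the current directory for illustration).
DEST="${DEST:-.}"
if echo 'hlink test' > "$DEST/.hl_src" && ln -f "$DEST/.hl_src" "$DEST/.hl_dst" 2>/dev/null; then
  echo "hard links supported on $DEST"
else
  echo "hard links NOT supported on $DEST"
fi
rm -f "$DEST/.hl_src" "$DEST/.hl_dst"
```

If `ln` fails (as it would on a filesystem without hard-link support), the probe reports it without touching anything else on the volume.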

For the auto-deletion problem it would be better to create a separate issue.

williwoods commented 4 years ago

Thanks so much for the reply and effort to duplicate my issues.

I have since confirmed that auto-deletion is in fact working so we can eliminate that issue from this discussion for now.

The destination filesystem is something I suspected. I am trying to do this on Avid Nexis, which is a custom filesystem. I can check with the manufacturer to see whether hard links are supported and what their FS is based on.

With this in mind, I did a test on my desktop, which is macOS Sierra 10.12.6 with the drive formatted Mac OS Extended (Journaled). This is the simple test I mentioned in my original post. See the attached screenshot of the results. I'm just doing a "Get Info" on each destination backup directory and comparing the total size to see whether they are all the same size as the source directory, or smaller (hard-linked).

Is there a better way to check that these backups are hardlinked beyond my method of checking the directory with finder (get info)?

[Screenshot: Get Info size comparison, 2019-12-18]

mylos commented 4 years ago

Use "du -h folder" to check the effective size of the given folder. To check whether a file is a hard link, see: https://unix.stackexchange.com/questions/170444/how-to-find-out-a-file-is-hard-link-or-symlink
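To spell out the inode check from that link, here's a minimal sketch (the filenames are just examples): two names are hard links to each other exactly when they share an inode number.

```shell
# Create a file and a hard link to it.
echo 'data' > original.txt
ln original.txt link.txt

# 'ls -li' prints the inode number in the first column; both names show the same one.
ls -li original.txt link.txt

# Scripted check: compare inode numbers.
# GNU stat uses '-c %i'; BSD/macOS stat uses '-f %i', hence the fallback.
ino1=$(stat -c %i original.txt 2>/dev/null || stat -f %i original.txt)
ino2=$(stat -c %i link.txt 2>/dev/null || stat -f %i link.txt)
[ "$ino1" = "$ino2" ] && echo 'hard linked' || echo 'not hard linked'

rm -f original.txt link.txt
```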

williwoods commented 4 years ago

Great suggestion on "du -h folder".

I used "du -sh folder" instead so it would print a single total rather than listing every subfolder. Using this method I was able to confirm that hard links are working on the Mac OS Extended destination but not on the Avid Nexis destination backups.
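For reference, here is a small sketch of why `du` is the right tool here while per-folder totals (like Finder's Get Info) double-count (the directory and file names are made up):

```shell
# Build two fake 'snapshots' that share one 5 MB file via a hard link.
mkdir -p hltest/backup.1 hltest/backup.2
dd if=/dev/zero of=hltest/backup.1/file bs=1048576 count=5 2>/dev/null
ln hltest/backup.1/file hltest/backup.2/file

# Summed folder by folder, each snapshot reports ~5M (10M in total),
# which is what a per-folder Get Info comparison shows:
du -sh hltest/backup.1 hltest/backup.2

# Measured together, du counts the shared blocks only once: ~5M, not 10M.
du -sh hltest

rm -rf hltest
```

So with working hard links, the combined `du -sh` of all snapshots stays close to the size of a single snapshot.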

Dang, so close.

From what I am seeing, this script works so well that I will have to see whether we can identify another storage destination that supports hard links.

reactive-firewall commented 4 years ago

so a quick look at the code shows that part (if not most ... or even all) of this hardlink issue is here: https://github.com/laurent22/rsync-time-backup/blob/88db869fe7a52864e18afc7e16a971499f79e830/rsync_tmbackup.sh#L391 and in the hardcoded values: https://github.com/laurent22/rsync-time-backup/blob/88db869fe7a52864e18afc7e16a971499f79e830/rsync_tmbackup.sh#L282
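For context, the snapshot deduplication relies on rsync's `--link-dest` option. A minimal standalone sketch of the mechanism, independent of the script (paths and snapshot names here are made up):

```shell
mkdir -p src dest
echo 'payload' > src/file.txt

# First snapshot: a full copy.
rsync -a src/ dest/2019-12-01/

# Second snapshot: files unchanged since the previous snapshot are
# hard-linked against it instead of being copied again. A relative
# --link-dest path is resolved relative to the destination directory.
rsync -a --link-dest=../2019-12-01 src/ dest/2019-12-02/

# Both snapshots now share a single inode for the unchanged file.
ls -li dest/2019-12-01/file.txt dest/2019-12-02/file.txt
```

On a destination filesystem without hard-link support, that second rsync falls back to writing full copies, which matches the growth the original report describes.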

This could be an easy fix (caveat: the following may not be the best way to fix it): add a check that tests the destination for hard-link support (copying the same idea used for testing the source):

fn_check_dest_hlink_support() {
	TEST_RES=0
	# create the file that will serve as the link source
	fn_run_cmd "echo 'hlink test' > '$DEST_FOLDER/.hardlink_test_root'" || TEST_RES=1
	# attempt to create a hard link to it
	fn_run_cmd "ln -f '$DEST_FOLDER/.hardlink_test_root' '$DEST_FOLDER/.hardlink_test_check'" || TEST_RES=1
	# verify both names resolve to the same inode
	INODE_ROOT=$(fn_run_cmd "ls -i '$DEST_FOLDER/.hardlink_test_root'" | awk '{print $1}')
	INODE_CHECK=$(fn_run_cmd "ls -i '$DEST_FOLDER/.hardlink_test_check'" | awk '{print $1}')
	if [ "$INODE_ROOT" = "$INODE_CHECK" ] ; then TEST_RES=0 ; else TEST_RES=1 ; fi
	# clean up the test files
	fn_run_cmd "rm -f '$DEST_FOLDER/.hardlink_test_check'"
	fn_run_cmd "rm -f '$DEST_FOLDER/.hardlink_test_root'"
	return $TEST_RES
}

caveat: the above code is in dire need of portability testing and should be cleaned up, but that's something for a PR

with such a check available, one need only test for hard-link support somewhere around: https://github.com/laurent22/rsync-time-backup/blob/88db869fe7a52864e18afc7e16a971499f79e830/rsync_tmbackup.sh#L391

perhaps something like ...

# line 391 insert
# note: test the function's exit status directly; wrapping it in $( ... )
# would capture stdout, not the return code
if fn_check_dest_hlink_support ; then
	# keep the args that enable hard links
else
	# change args to NOT use hard links
fi

... well you get the idea

I'm going to look at cleaning up some of the code to handle basics like all POSIX filenames (spaces, single quotes, etc.) before I create an actual patch for this, but it should provide more of a fix than "just [wastefully] replace a drive you have with another drive"

🤔 ... oh and unit testing will be my next stop ... #193

HOPE THIS HELPS! (Feel free to assign this to me if the above is an acceptable approach and I'll circle back with a clean patch for review.)