borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.
https://www.borgbackup.org/

unable to create hardlink on NFS, HP StoreOnce NAS #4336

Closed Dizzy3339 closed 5 years ago

Dizzy3339 commented 5 years ago

Have you checked borgbackup docs, FAQ, and open Github issues?

Yes, extensively, have researched for over 3 weeks with no luck, including other GitHub issues with a similar error message.

Is this a BUG / ISSUE report or a QUESTION?

Unsure, could be either, or maybe it's a compatibility problem

System information. For client/server mode post info for both machines.

Server: HPE StoreOnce NAS appliance (NFS3)
Client: SUSE Linux Enterprise Server 11 (x86_64), VERSION = 11, PATCHLEVEL = 4

Your borg version (borg -V).

borg-linux64 1.1.7

Operating system (distribution) and version.

SLES 11.4

Hardware / network configuration, and filesystems used.

HPE CS500, 10GB ethernet, 1.5TB RAM, 72 cores

How much data is handled by borg?

Anywhere from 1GB to 1.7TB per run

Full borg commandline that led to the problem (leave away excludes and passwords)

client:/storeonce # borg-linux64 init --encryption=repokey /storeonce/TEST
Enter new passphrase:
Enter same passphrase again:
Do you want your passphrase to be displayed for verification? [yN]:
Local Exception
Traceback (most recent call last):
  File "borg/archiver.py", line 4434, in main
  File "borg/archiver.py", line 4366, in run
  File "borg/archiver.py", line 152, in wrapper
  File "borg/archiver.py", line 266, in do_init
  File "borg/crypto/key.py", line 111, in key_creator
  File "borg/crypto/key.py", line 678, in create
  File "borg/crypto/key.py", line 788, in save
  File "borg/repository.py", line 299, in save_key
  File "borg/repository.py", line 281, in save_config
OSError: [Errno 5] Input/output error: '/storeonce/TEST/config' -> '/storeonce/TEST/config.old'

Describe the problem you're observing.

I have been able to run the init successfully on other targets, for example local disk and another NFS appliance (Ibrix brand). For some reason borg and this appliance do not get along with the init command. When I had our team reshare a mount from the appliance as CIFS, the init worked. (But other apps cannot use CIFS, and it doesn't make sense to use CIFS with Linux.)

I can run the command above (cp or mv /storeonce/TEST/config /storeonce/TEST/config.old) with no issues, but it looks like borg stops there.

I've tried mounting the client share with many different options (umask, gid, noac, lookupcache=none, etc.), but still no luck. I keep thinking it is a permissions issue, but I have not been able to find the magic combination that works.

I have also tried mounting this to an Ubuntu workstation and it has the same error message, and other targets (local, other NFS appliance) work with no problem.

Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

Yes, check out the output above.

Include any warning/errors/backtraces from the system logs

Nothing I have been able to see on the actual OS. I don't think we are able to access /etc/exports to set options like no_root_squash, since the appliance exposes the administration of the shares only through a pre-bundled GUI.

ThomasWaldmann commented 5 years ago

        if os.path.isfile(config_path):
            try:
                os.link(config_path, old_config_path)  # line 281
            except OSError as e:
                if e.errno in (errno.EMLINK, errno.ENOSYS, errno.EPERM, errno.ENOTSUP):
                    logger.warning("Failed to securely erase old repository config file (hardlinks not supported). "
                                   "Old repokey data, if any, might persist on physical storage.")
                else:
                    raise

ThomasWaldmann commented 5 years ago

So, we expect an OSError there (with errno like seen in line 283).

It looks like your filesystem has some issue with creating that hardlink, but gives errno.EIO ("Input/Output Error") and not one of the specific errnos we catch there.

Not sure whether this is a bug in your filesystem or whether we should also catch errno.EIO there.

In any case, if the hardlink does not work, we cannot securely erase the old config file.
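
To illustrate the point about errnos, here is a minimal Python sketch of the check in the quoted snippet. The helper name `is_expected_link_failure` and the constant `EXPECTED_LINK_ERRNOS` are made up for this example and are not part of borg; only the tuple of errnos is taken from the snippet above.

```python
import errno

# Errnos the quoted snippet treats as "hardlinks not supported here".
EXPECTED_LINK_ERRNOS = (errno.EMLINK, errno.ENOSYS, errno.EPERM, errno.ENOTSUP)


def is_expected_link_failure(exc: OSError) -> bool:
    """True if the failure would only trigger the warning, False if it would be re-raised."""
    return exc.errno in EXPECTED_LINK_ERRNOS


# The error from the traceback is errno 5 (EIO), which is not in the tuple,
# so it propagates and "borg init" aborts.
eio = OSError(errno.EIO, "Input/output error")
eperm = OSError(errno.EPERM, "Operation not permitted")
print(is_expected_link_failure(eio))    # False
print(is_expected_link_failure(eperm))  # True
```

A filesystem that rejected hardlinks with, say, EPERM would therefore only produce the warning, while EIO is treated as a real failure.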

ThomasWaldmann commented 5 years ago

https://github.com/pgbackrest/pgbackrest/issues/592 here someone gets EIO for a symlink (also he seems to run out of file handles).

ThomasWaldmann commented 5 years ago

BTW, the command you want to try is ln file1 file2 (not mv).

ThomasWaldmann commented 5 years ago

https://tools.ietf.org/html/rfc1813

2.6 Defined Error Numbers

   NFS3ERR_IO (5)
       I/O error. A hard error (for example, a disk error)
       occurred while processing the requested operation.

That sounds like borg should rather not catch it, and you need to search for the root cause.
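
As a small cross-check, the value 5 of NFS3ERR_IO in the RFC matches the "[Errno 5]" in the traceback, which Python knows as EIO. The sketch below only compares the numbers; that the NFS client passes NFS3ERR_IO through to the application as EIO is an assumption here, not something confirmed in this thread.

```python
import errno
import os

NFS3ERR_IO = 5  # value quoted from RFC 1813, section 2.6

# Compare the protocol-level error number with the local errno constant.
print(errno.EIO == NFS3ERR_IO)
# Symbolic name and human-readable message for that errno on this system.
print(errno.errorcode[errno.EIO])
print(os.strerror(errno.EIO))
```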

Dizzy3339 commented 5 years ago

Wow, thank you Thomas for the super quick response and all the digging. The hard link doesn't work, and neither does a soft link!

Let me check some of your other links and see if we can find out how to allow links on that filesystem.

client:/storeonce/TEST> ln config config.old
ln: failed to create hard link `config.old' => `config': Input/output error
client:/storeonce/TEST> ln -s config config.old
ln: failed to create symbolic link `config.old': Input/output error
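
The same manual test can be scripted. Below is a rough Python sketch that tries both link types in a given directory and reports the errno name for each failure. The function name `probe_links` and the scratch file names are invented for this example; point it at a directory on the mount in question (for instance /storeonce/TEST) instead of the temporary directory used here.

```python
import errno
import os
import tempfile


def probe_links(directory):
    """Try a hard link and a symlink in `directory`.

    Returns a dict mapping "hard" / "symbolic" to None on success,
    or to the errno name (e.g. "EIO") on failure.
    """
    results = {}
    src = os.path.join(directory, "link_probe.src")
    with open(src, "w") as f:
        f.write("probe")
    try:
        for kind, make_link in (("hard", os.link), ("symbolic", os.symlink)):
            dst = os.path.join(directory, "link_probe." + kind)
            try:
                make_link(src, dst)
            except OSError as e:
                results[kind] = errno.errorcode.get(e.errno, str(e.errno))
            else:
                results[kind] = None
                os.unlink(dst)  # clean up the link we just created
    finally:
        os.unlink(src)  # always remove the scratch file
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(probe_links(d))
```

On a share behaving like the one in this report, both entries would presumably come back as "EIO".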

Dizzy3339 commented 5 years ago

Just read the Postgres link and looks like our exact problem. I am going to dig around and see if I can find anyone that has successfully allowed symbolic links on this appliance and if I can't find anything, will ask our team to open a case up with HPE.

Dizzy3339 commented 5 years ago

I haven't been able to find anyone who has gotten symlinks to work on the StoreOnce appliance. The guy in the Postgres thread said that he would open a case but hasn't updated since then.

I've asked our team to open a case up with HPE to see if they can recommend a solution. Will let you know what they come back with.

ThomasWaldmann commented 5 years ago

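Just to make it clear:

- To work optimally and be able to secure erase the old config, borg needs to make a hardlink.
- It still works sub-optimally (no secure erase, but also no crash) if creating the hardlink fails with one of the expected errnos (see #4336 (comment)).

The problem on that NAS:

- It looks like there is no support for hardlinks.
- It fails with the wrong errno (I/O error).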

Dizzy3339 commented 5 years ago

> Just to make it clear:
>
> - to work optimally and be able to secure erase the old config, borg needs to make a hardlink
> - it still works sub-optimally (no secure erase, but also no crash), if creating the hardlink fails with one of the expected errnos (see #4336 (comment)).
>
> The problem on that NAS:
>
> - looks like no support for hardlinks
> - fails with the wrong errno (I/O error)

We got an answer back from HPE that they had no documentation supporting symbolic links. This confirms the same answer the other Postgres user had.

We will stick with CIFS for now for BORG on this appliance as it seems to work on there with no issues.

This is very bizarre for an enterprise appliance to not support decades old functionality but I'm sure there is a reason behind it.

Thanks again Thomas for your awesome help. I will close this one.

ThomasWaldmann commented 3 years ago

this was "fixed" recently by #5463 (master) and #5523 (1.1-maint).