geraldhofer opened this issue 7 years ago
The first inconsistency happens because Robinhood first calls llapi_hsm_import() and then rebinds the old fid to the new one. If the Robinhood process is killed before the rebind happens (or if the rebind fails), the file is still created, in state "released", but it cannot be restored because its fid does not point to the archived data (hence the "md5sum: fc: No data available" error).
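To make that failure window concrete, here is a minimal C sketch of the sequence, not Robinhood's actual code: the stripe arguments passed to llapi_hsm_import() are arbitrary defaults, the rebind command is the site-specific one that appears in the logs below, and DFID_NOBRACE/PFID are the fid formatting helpers from the Lustre headers.

```c
/*
 * Sketch of the rbh-undelete sequence described above (NOT Robinhood's
 * actual code). Build with: gcc undelete_sketch.c -o undelete_sketch -llustreapi
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <lustre/lustreapi.h>

int main(void)
{
	const char *path = "/gdatat1/siteAB/fc";          /* file to re-create */
	const char *old_fid = "0x200001b7e:0x182f5:0x0";  /* fid stored in SOFT_RM */
	struct stat st = { .st_mode = S_IFREG | 0644, .st_size = 89 };
	struct lu_fid newfid;
	char cmd[512];
	int rc;

	/* Step 1: create a released stub; Lustre assigns it a brand new fid. */
	rc = llapi_hsm_import(path, 3 /* archive_id */, &st,
			      0, -1, 0, 0, NULL, &newfid);
	if (rc < 0) {
		fprintf(stderr, "llapi_hsm_import failed: %d\n", rc);
		return 1;
	}

	/*
	 * If the process is killed HERE, the stub already exists in state
	 * "released", but its new fid points at nothing in the archive.
	 */

	/* Step 2: rebind the archived data to the new fid (command taken
	 * from the logs in this issue; it is site-specific). */
	snprintf(cmd, sizeof(cmd),
		 "ssh tdmfha /root/lhsmtool_dmf_rebase %s " DFID_NOBRACE
		 " /mnt/hsm_gdatat1_3", old_fid, PFID(&newfid));
	if (system(cmd) != 0) {
		fprintf(stderr, "rebind failed: %s\n", cmd);
		return 1;
	}

	/* Step 3 (not shown): update the SOFT_RM table. */
	return 0;
}
```

The point is only the ordering: a kill (or a failing rebind) between step 1 and step 2 leaves exactly the released-but-unrestorable stub described above.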
The "recovery becomes a bit messy" because when you use:
[root@tclin01 siteAB]# rm fc
rm: remove regular file ‘fc’? y
It adds another entry in the SOFT_RM table (the table Robinhood uses to keep track of files that can be undeleted). So when you run rbh-undelete once again, the first entry is successfully undeleted and the next one fails because it has the same name. Note that the fid that rbh-undelete reports is not the same between the first call and the second one.
Now things can get even worse (meaning you cannot undelete anything at all) if the Robinhood process is killed after the rebind command completes but before it updates its SOFT_RM table. If this happens, the first entry in the SOFT_RM table cannot be restored because the rebind will fail, and llapi_hsm_import() will fail for the following entries with "File exists".
I do understand the implications of what happens; the bug, from my perspective, is that some operations in rbh-undelete need to be atomic, and a ^C needs to be held off until all operations have either finished or been rolled back.
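One way to approximate that, at least for ^C, would be to mask SIGINT/SIGTERM around the critical section. Here is a minimal, generic sketch of the idea (my suggestion, not what rbh-undelete does today, and it obviously cannot protect against SIGKILL or a crash):

```c
/* Minimal sketch: defer ^C / SIGTERM until an undelete step has either
 * completed or been rolled back. This is a suggestion for rbh-undelete,
 * not its current behaviour. */
#include <signal.h>
#include <stdio.h>

/* Placeholder for one full step: llapi_hsm_import() + rebind + DB update. */
static int undelete_one_entry(void)
{
	return 0;
}

int main(void)
{
	sigset_t crit, saved;
	int rc;

	sigemptyset(&crit);
	sigaddset(&crit, SIGINT);
	sigaddset(&crit, SIGTERM);

	/* Block the signals while the entry is in an inconsistent state... */
	sigprocmask(SIG_BLOCK, &crit, &saved);
	rc = undelete_one_entry();
	/* ...and deliver any pending ^C only once the state is consistent. */
	sigprocmask(SIG_SETMASK, &saved, NULL);

	return rc ? 1 : 0;
}
```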
Another related scenario is when the rebind command returns an error, for example because the backend is down:
[root@tclin01 bin]# rm red
rm: remove regular file ‘red’? y
[root@tclin01 bin]# rbh-undelete -L ./red
Using config file '/etc/robinhood.d/gdatat1.conf'.
2017/02/28 09:00:35 [23859/1] CheckFS | '/gdatat1' matches mount point '/gdatat1', type=lustre, fs=192.168.1.11@o2ib:/gdatat1
rm_time, id, type, user, group, size, last_mod, lhsm.status, path
2017/02/28 09:00:30, [0x200001b7e:0x182f5:0x0], file, root, root, 89, 2014/06/10 13:54:12, synchro, /gdatat1/siteAB/root5/root/usr/bin/red
[root@tclin01 bin]# rbh-undelete -R ./red
Using config file '/etc/robinhood.d/gdatat1.conf'.
2017/02/28 09:00:45 [23872/1] CheckFS | '/gdatat1' matches mount point '/gdatat1', type=lustre, fs=192.168.1.11@o2ib:/gdatat1
2017/02/28 09:00:45 [23872/1] lhsm | Rebinding [0x200001b7e:0x182f5:0x0] to [0x200002340:0x98bd:0x0] in archive
2017/02/28 09:00:45 [23872/1] lhsm | Executing rebind command: ssh tdmfha /root/lhsmtool_dmf_rebase 0x200001b7e:0x182f5:0x0 0x200002340:0x98bd:0x0 /mnt/hsm_gdatat1_3
2017/02/28 09:00:46 [23872/1] lhsm | Failed to rebind entry in backend: command failed
Restoring '/gdatat1/siteAB/root5/root/usr/bin/red'... ERROR
undelete summary:
0 files
0 old version
0 empty files
0 non-files
0 no backup
1 errors
0 DB errors
I then end up with the new stub that is not rebound on the backend:
[root@tclin01 bin]# ls -la red
-rwxr-xr-x. 1 root root 89 Jun 10 2014 red
[root@tclin01 bin]# lfs hsm_state red
red: (0x0000000d) released exists archived, archive_id:3
[root@tclin01 bin]# lfs path2fid red
[0x200002340:0x98bd:0x0]
[root@tclin01 bin]# rbh-find -ls ./red
Using config file '/etc/robinhood.d/gdatat1.conf'.
[0x200002340:0x98bd:0x0] file rwxr-xr-x 1 root root 89 2014/06/10 13:54:12 ./red
[root@tclin01 bin]# rbh-undelete -L ./red
Using config file '/etc/robinhood.d/gdatat1.conf'.
2017/02/28 09:08:57 [25329/1] CheckFS | '/gdatat1' matches mount point '/gdatat1', type=lustre, fs=192.168.1.11@o2ib:/gdatat1
rm_time, id, type, user, group, size, last_mod, lhsm.status, path
2017/02/28 09:00:30, [0x200001b7e:0x182f5:0x0], file, root, root, 89, 2014/06/10 13:54:12, synchro, /gdatat1/siteAB/root5/root/usr/bin/red
[root@tclin01 bin]# md5sum red
md5sum: red: No data available
But removing that stub creates another entry in the SOFT_RM table and again we might not be able to restore the file:
[root@tclin01 bin]# rm red
rm: remove regular file ‘red’? y
[root@tclin01 bin]# rbh-undelete -L ./red
Using config file '/etc/robinhood.d/gdatat1.conf'.
2017/02/28 09:13:21 [26000/1] CheckFS | '/gdatat1' matches mount point '/gdatat1', type=lustre, fs=192.168.1.11@o2ib:/gdatat1
rm_time, id, type, user, group, size, last_mod, lhsm.status, path
2017/02/28 09:00:30, [0x200001b7e:0x182f5:0x0], file, root, root, 89, 2014/06/10 13:54:12, synchro, /gdatat1/siteAB/root5/root/usr/bin/red
2017/02/28 09:13:19, [0x200002340:0x98bd:0x0], file, root, root, 89, 2014/06/10 13:54:12, released, /gdatat1/siteAB/root5/root/usr/bin/red
[root@tclin01 bin]# rbh-undelete -R ./red
Using config file '/etc/robinhood.d/gdatat1.conf'.
2017/02/28 09:13:48 [26030/1] CheckFS | '/gdatat1' matches mount point '/gdatat1', type=lustre, fs=192.168.1.11@o2ib:/gdatat1
2017/02/28 09:13:48 [26030/1] lhsm | Rebinding [0x200001b7e:0x182f5:0x0] to [0x200002340:0x98be:0x0] in archive
2017/02/28 09:13:48 [26030/1] lhsm | Executing rebind command: ssh tdmfha /root/lhsmtool_dmf_rebase 0x200001b7e:0x182f5:0x0 0x200002340:0x98be:0x0 /mnt/hsm_gdatat1_3
2017/02/28 09:13:48 [26030/1] lhsm | Failed to rebind entry in backend: command failed
Restoring '/gdatat1/siteAB/root5/root/usr/bin/red'... ERROR
2017/02/28 09:13:48 [26030/1] llapi | error on ioctl 0x4008669a for '/gdatat1/siteAB/root5/root/usr/bin/red' (4): stripe already set
2017/02/28 09:13:48 [26030/1] llapi | cannot create '/gdatat1/siteAB/root5/root/usr/bin/red' for import
2017/02/28 09:13:48 [26030/1] lhsm | Failed to import file '/gdatat1/siteAB/root5/root/usr/bin/red': File exists
Restoring '/gdatat1/siteAB/root5/root/usr/bin/red'... ERROR
undelete summary:
0 files
0 old version
0 empty files
0 non-files
0 no backup
2 errors
0 DB errors
So these are all scenarios where something breaks in the middle of the operations. If the rebind command fails, rolling back the creation of the stub is probably the easiest solution.
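A minimal sketch of that rollback, with hypothetical helpers import_stub() and run_rebind_cmd() standing in for llapi_hsm_import() and the configured rebind command: if the rebind fails, the freshly created stub is removed again, so the filesystem is left unchanged and the original SOFT_RM entry can simply be retried later.

```c
/* Sketch of the rollback suggested above. import_stub() and
 * run_rebind_cmd() are hypothetical stand-ins for llapi_hsm_import()
 * and the configured rebind command. */
#include <stdio.h>
#include <unistd.h>

static int import_stub(const char *path)     { (void)path; return 0; }
static int run_rebind_cmd(const char *path)  { (void)path; return -1; }

static int undelete_with_rollback(const char *path)
{
	if (import_stub(path) < 0)
		return -1;

	if (run_rebind_cmd(path) != 0) {
		/* Roll back the stub instead of leaving a released file
		 * whose fid points at nothing in the archive. */
		unlink(path);
		return -1;
	}
	return 0;
}

int main(void)
{
	return undelete_with_rollback("/gdatat1/siteAB/root5/root/usr/bin/red") ? 1 : 0;
}
```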
@qb-cea Regarding your comment on the command
[root@tclin01 siteAB]# rm fc
rm: remove regular file ‘fc’? y
and what happens in Robinhood when the file has already been archived: I'm a little confused. Are you saying that it is expected behavior for Robinhood to create a new file in the backend hsm without updating the link that points to it?
I'm noticing that if I run rbh-undelete on an archived file that has been deleted from lustre, a new hsm copy is made, but this breaks the symbolic link that was originally created in the hsm/shadow directory. So from that point on, the backend can't restore the file if it is deleted yet again. Is this expected? This feels like a critical issue to me, unless I'm simply misunderstanding the purpose of the command.
[Is it] expected behavior for Robinhood to create a new file in the backend hsm without updating the link that points to it?
No, this is Lustre-specific. Actually, it is lhsmtool_posix-specific (lhsmtool_posix is the copytool packaged with Lustre's sources). Robinhood is only expected to create a new file in the filesystem it monitors.
In Lustre, each file is assigned a unique fid (something that looks like this: 0x200001b7e:0x182f5:0x0). When an archived file is deleted from Lustre and Robinhood undeletes it, a new file has to be created (in Lustre). Consequently, the new file's fid will be different from the original file's. Robinhood then has to use a rebind command to link the archived data (in the backend) to the newly created file.
If I remember correctly, lhsmtool_posix uses Lustre's fid to compute the path and name of the archived data, so the rebind command essentially consists of a mv (or a cp, I don't know exactly) from the old path to the new one.
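For illustration only, here is a sketch of that idea under the assumption that the backend names each archived copy after the Lustre fid in a flat directory; the actual layout used by lhsmtool_posix may differ.

```c
/* Sketch: if the archive stores each copy under a fid-derived path,
 * rebinding amounts to renaming the old-fid path to the new-fid path
 * (the "mv" mentioned above). The flat layout is an assumption made
 * for illustration. */
#include <stdio.h>

static int rebind_in_backend(const char *backend_root,
			     const char *old_fid, const char *new_fid)
{
	char oldpath[4096], newpath[4096];

	snprintf(oldpath, sizeof(oldpath), "%s/%s", backend_root, old_fid);
	snprintf(newpath, sizeof(newpath), "%s/%s", backend_root, new_fid);

	/* Equivalent of: mv <backend>/<old_fid> <backend>/<new_fid> */
	return rename(oldpath, newpath);
}

int main(void)
{
	return rebind_in_backend("/mnt/hsm_gdatat1_3",
				 "0x200001b7e:0x182f5:0x0",
				 "0x200002340:0x98bd:0x0") ? 1 : 0;
}
```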
I'm noticing that if I run rbh-undelete on an archived file that has been deleted from lustre, a new hsm copy is made, but this breaks the symbolic link that was originally created in the hsm/shadow directory.
I am not familiar enough with how lhsmtool_posix handles file archiving, nor with how exactly it uses hsm/shadow, to draw any conclusion. Yet, I can tell you that the way data is organized in the backend is highly correlated with which copytool you use. I hope my previous answer also addresses this issue, but if you think you found an issue with lhsmtool_posix, feel free to open a ticket about it.
Restoring several test files and interrupting that with ^C: the expectation would be that I can rerun the command to restore the remaining files, but that file is still in the database. The file does exist, but the rebinding was interrupted. And the recovery becomes a bit messy and still leaves a database entry behind.