phobos-storage / phobos

This repository holds the source code for Phobos, a Parallel Heterogeneous Object Store.
GNU Lesser General Public License v2.1

phobos drive del and scsi reservation #2

Open thiell opened 7 months ago

thiell commented 7 months ago

Really minor, but reporting it so we don't forget: our LTO-9 drives are accessible from multiple hosts. When a drive is deleted from one host with phobos drive del ... and added to another with phobos drive add ..., the drive won't work there and LTFS complains about an existing SCSI reservation.

When phobosd tries to use the drive from the other server, we see errors like these:

Nov 09 21:41:35 elm-ent-dm02 phobosd[64958]: 2023-11-09 21:41:35.853830000 <VERBOSE> fdcb LTFS30250I Opened the SCSI tape device 1.0.2.0 (/dev/sg4).
Nov 09 21:41:35 elm-ent-dm02 phobosd[64958]: 2023-11-09 21:41:35.853838000 <VERBOSE> fdcb LTFS30207I Vendor ID is IBM     .
Nov 09 21:41:35 elm-ent-dm02 phobosd[64958]: 2023-11-09 21:41:35.853843000 <VERBOSE> fdcb LTFS30208I Product ID is ULTRIUM-TD9     .
Nov 09 21:41:35 elm-ent-dm02 phobosd[64958]: 2023-11-09 21:41:35.853849000 <VERBOSE> fdcb LTFS30214I Firmware revision is Q3F4.
Nov 09 21:41:35 elm-ent-dm02 ltfs[64971]: fdcb LTFS30250I Opened the SCSI tape device 1.0.2.0 (/dev/sg4).
Nov 09 21:41:35 elm-ent-dm02 phobosd[64958]: 2023-11-09 21:41:35.853855000 <VERBOSE> fdcb LTFS30215I Drive serial is 10210057FB.
Nov 09 21:41:35 elm-ent-dm02 ltfs[64971]: fdcb LTFS30207I Vendor ID is IBM     .
Nov 09 21:41:35 elm-ent-dm02 ltfs[64971]: fdcb LTFS30208I Product ID is ULTRIUM-TD9     .
Nov 09 21:41:35 elm-ent-dm02 phobosd[64958]: 2023-11-09 21:41:35.853915000 <VERBOSE> fdcb LTFS30285I The reserved buffer size of /dev/sg4 is 1048576.
Nov 09 21:41:35 elm-ent-dm02 ltfs[64971]: fdcb LTFS30214I Firmware revision is Q3F4.
Nov 09 21:41:35 elm-ent-dm02 ltfs[64971]: fdcb LTFS30215I Drive serial is 10210057FB.
Nov 09 21:41:35 elm-ent-dm02 ltfs[64971]: fdcb LTFS30285I The reserved buffer size of /dev/sg4 is 1048576.
Nov 09 21:41:35 elm-ent-dm02 ltfs[64971]: fdcb LTFS30205I RSOC (0xa3) returns -20601.
Nov 09 21:41:35 elm-ent-dm02 phobosd[64958]: 2023-11-09 21:41:35.854879000 <VERBOSE> fdcb LTFS30205I RSOC (0xa3) returns -20601.
Nov 09 21:41:35 elm-ent-dm02 phobosd[64958]: 2023-11-09 21:41:35.854892000 <VERBOSE> fdcb LTFS30263I RSOC returns Not Ready to Ready Transition, Medium May Have Changed (-20601) /dev/sg4.
Nov 09 21:41:35 elm-ent-dm02 phobosd[64958]: 2023-11-09 21:41:35.854901000 <VERBOSE> fdcb LTFS30262I Forcing drive dump.
Nov 09 21:41:35 elm-ent-dm02 phobosd[64958]: 2023-11-09 21:41:35.854906000 <VERBOSE> fdcb LTFS39802W Unknown SCSI OP code 0x1d, use default timeout.
Nov 09 21:41:35 elm-ent-dm02 ltfs[64971]: fdcb LTFS30263I RSOC returns Not Ready to Ready Transition, Medium May Have Changed (-20601) /dev/sg4.
Nov 09 21:41:35 elm-ent-dm02 ltfs[64971]: fdcb LTFS30262I Forcing drive dump.
Nov 09 21:41:35 elm-ent-dm02 ltfs[64971]: fdcb LTFS39802W Unknown SCSI OP code 0x1d, use default timeout.
Nov 09 21:41:35 elm-ent-dm02 kernel: st 1:0:2:0: Mode parameters changed
Nov 09 21:41:35 elm-ent-dm02 ltfs[64971]: fdcb LTFS30205I FORCE_DUMP (0x1d) returns -20604.
Nov 09 21:41:35 elm-ent-dm02 phobosd[64958]: 2023-11-09 21:41:35.860636000 <VERBOSE> fdcb LTFS30205I FORCE_DUMP (0x1d) returns -20604.
Nov 09 21:41:35 elm-ent-dm02 phobosd[64958]: 2023-11-09 21:41:35.860655000 <VERBOSE> fdcb LTFS30263I FORCE_DUMP returns Mode Parameters Changed (-20604) /dev/sg4.
Nov 09 21:41:35 elm-ent-dm02 phobosd[64958]: 2023-11-09 21:41:35.860662000 <VERBOSE> fdcb LTFS30262I Forcing drive dump.
Nov 09 21:41:35 elm-ent-dm02 phobosd[64958]: 2023-11-09 21:41:35.860668000 <VERBOSE> fdcb LTFS39802W Unknown SCSI OP code 0x1d, use default timeout.
Nov 09 21:41:35 elm-ent-dm02 ltfs[64971]: fdcb LTFS30263I FORCE_DUMP returns Mode Parameters Changed (-20604) /dev/sg4.
Nov 09 21:41:35 elm-ent-dm02 ltfs[64971]: fdcb LTFS30262I Forcing drive dump.
Nov 09 21:41:35 elm-ent-dm02 ltfs[64971]: fdcb LTFS39802W Unknown SCSI OP code 0x1d, use default timeout.
Nov 09 21:41:35 elm-ent-dm02 kernel: st 1:0:2:0: reservation conflict
Nov 09 21:41:35 elm-ent-dm02 ltfs[64971]: fdcb LTFS30205I FORCE_DUMP (0x1d) returns -21719.

Especially this one I guess:

Nov 09 21:41:35 elm-ent-dm02 kernel: st 1:0:2:0: reservation conflict
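
To spot this condition in a larger journal dump, here is a minimal sketch that scans log lines for the kernel's reservation conflict message and extracts the SCSI address. The function name and the regular expression are illustrative and written only against the lines quoted above; other kernels or drivers may word the message differently.

```python
import re

# Pattern written against the kernel line quoted in this issue:
#   "... kernel: st 1:0:2:0: reservation conflict"
CONFLICT_RE = re.compile(r"kernel: st (\d+:\d+:\d+:\d+): reservation conflict")


def find_reservation_conflicts(lines):
    """Return the SCSI addresses (host:channel:id:lun) that reported a conflict."""
    found = []
    for line in lines:
        match = CONFLICT_RE.search(line)
        # Keep each address once, in order of first appearance.
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found


sample = [
    "Nov 09 21:41:35 elm-ent-dm02 kernel: st 1:0:2:0: Mode parameters changed",
    "Nov 09 21:41:35 elm-ent-dm02 kernel: st 1:0:2:0: reservation conflict",
]
print(find_reservation_conflicts(sample))
```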

A solution is to release the SCSI reservation on the original server with the following command:

# ltfs -o release_device -o devname=/dev/sg4 
126e LTFS14000I LTFS starting, LTFS version 2.4.5.1 (Prelim), log level 2.
126e LTFS14058I LTFS Format Specification version 2.4.0.
126e LTFS14104I Launched by "ltfs -o release_device -o devname=/dev/sg4".
126e LTFS14105I This binary is built for Linux (x86_64).
126e LTFS14106I GCC version is 11.3.1 20221121 (Red Hat 11.3.1-4).
126e LTFS17087I Kernel version: Linux version 5.14.0-284.25.1.el9_2.x86_64 (mockbuild@iad1-prod-build001.bld.equ.rockylinux.org) (gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), GNU ld version 2.35.2-37.el9) #1 SMP PREEMPT_DYNAMIC Wed Aug 2 14:53:30 UTC 2023 i386.
126e LTFS17089I Distribution: Rocky Linux release 9.2 (Blue Onyx).
126e LTFS17089I Distribution: NAME="Rocky Linux".
126e LTFS17089I Distribution: Rocky Linux release 9.2 (Blue Onyx).
126e LTFS17089I Distribution: Rocky Linux release 9.2 (Blue Onyx).
126e LTFS14063I Sync type is "time", Sync time is 300 sec.
126e LTFS17085I Plugin: Loading "sg" tape backend.
126e LTFS17085I Plugin: Loading "unified" iosched backend.
126e LTFS14095I Set the tape device write-anywhere mode to avoid cartridge ejection.
126e LTFS30209I Opening a device through sg-ibmtape driver (/dev/sg4).
126e LTFS30250I Opened the SCSI tape device 1.0.2.0 (/dev/sg4).
126e LTFS30207I Vendor ID is IBM     .
126e LTFS30208I Product ID is ULTRIUM-TD9     .
126e LTFS30214I Firmware revision is Q3F4.
126e LTFS30215I Drive serial is 10210057FB.
126e LTFS30285I The reserved buffer size of /dev/sg4 is 1048576.
126e LTFS30294I Setting up timeout values from RSOC.
126e LTFS17160I Maximum device block size is 1048576.
126e LTFS12022I Unloading medium.
126e LTFS30252I Logical block protection is disabled.

After that, the drive can be used from the other server by phobos.
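
For several drives, the same workaround can be scripted. The sketch below only wraps the ltfs invocation shown above; the helper names and the dry-run default are my own, and it assumes ltfs is on the PATH and that the script runs on the host that still holds the reservation.

```python
import subprocess


def release_command(device):
    """Build the ltfs invocation quoted above for one sg device."""
    return ["ltfs", "-o", "release_device", "-o", f"devname={device}"]


def release_devices(devices, dry_run=True):
    """Release the reservation on each device.

    With dry_run=True the commands are only printed, so the script can be
    checked before it touches a drive. Returns {device: return code or None}.
    """
    results = {}
    for device in devices:
        cmd = release_command(device)
        if dry_run:
            print(" ".join(cmd))
            results[device] = None
            continue
        # check=False: keep going with the other drives if one release fails.
        results[device] = subprocess.run(cmd, check=False).returncode
    return results


# Dry run for the drive from this report; pass dry_run=False to really run it.
release_devices(["/dev/sg4"])
```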

Perhaps phobos drive del could do that automatically? Otherwise, a note about this in the documentation would make it less confusing.

SebaGougeaud commented 7 months ago

Hi @thiell, for now, phobos drive del only deals with the database. We may add this information to the documentation. We are currently thinking of adding a drive_release-like feature in version 2.1, which is planned for June 2024.

thiell commented 2 months ago

@SebaGougeaud We now think that the daemon should release its drives when phobosd stops; otherwise multiple phobos instances cannot recover properly without a sysadmin stepping in to release the drives. Imagine a first data mover, dm01, running phobosd, which we stop for maintenance with tapes still mounted in its drives. If the daemon does not release the drives when stopping, the other data movers (for example dm[02-03]) will fail when trying to grab the tapes previously mounted by dm01, and both those tapes and the drives on dm[02-03] will be marked failed. Please let me know if there is a case where the daemon should not release its own drives when stopping... thanks!

patlucas commented 2 months ago

@thiell What do you mean by the daemon should "release" its drives? Do you mean removing any phobos DSS lock? Or do you mean unmounting and unloading any tape from any of its drives? Or anything else?

thiell commented 2 months ago

@patlucas: Good question indeed. I mean both the phobos DSS lock (the lock remaining in the lock table after phobosd has been stopped) and the LTFS device reservation, which can be released with ltfs -o release_device. That way, after phobosd has been stopped, the cartridge (still in the drive) can be taken over by another data mover / phobosd instance. Otherwise, this leads to a deadlock situation. I will try to provide relevant logs with the new phobos version (based on current master), but I have some compatibility issues with lhsmtool_phobos / coordinatool right now and can't make it work yet.

patlucas commented 2 months ago

As already said, we plan to add an admin command "phobos drive release" to manage the ltfs device reservation. This feature is planned for the phobos 3.0 milestone. We are currently finishing phobos 2.0.

Migrating a drive needs an admin command because drives are currently dedicated to a node, and this is registered in the DSS.

Migration of a drive from one node to another will be redesigned and handled through admin commands in phobos 3.0.

thiell commented 1 month ago

@patlucas OK, no problem for the drives and phobos 3.0, but would you also release the ltfs device reservation when the phobosd daemon stops? For now, we can add an ExecStopPost that always releases the ltfs device reservation (otherwise, the tape in the drive cannot be reclaimed by another phobosd).
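
As a rough sketch of that interim workaround, the helper below renders a systemd drop-in with one ExecStopPost line per drive. Everything site-specific is an assumption here: the ltfs path (/usr/bin/ltfs), the device list, and the idea of installing the result as a drop-in for a unit named phobosd.service. Only the ltfs options come from the command shown earlier in this issue.

```python
def render_dropin(devices, ltfs_path="/usr/bin/ltfs"):
    """Render a systemd drop-in that releases each device after the service stops.

    ltfs_path and the device names are assumptions; adjust them to the host.
    """
    lines = ["[Service]"]
    for device in devices:
        # The leading "-" makes systemd ignore a failing exit status, so a
        # failed release does not mark the unit itself as failed.
        lines.append(
            f"ExecStopPost=-{ltfs_path} -o release_device -o devname={device}"
        )
    return "\n".join(lines) + "\n"


print(render_dropin(["/dev/sg4"]), end="")
```

The output could be saved under the unit's drop-in directory and picked up with systemctl daemon-reload. Note that /dev/sgN names are not guaranteed to be stable across reboots, so a fixed list like this may need a more persistent device path.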

@patlucas What about the DSS lock release when phobosd is stopped?

For example here we stopped phobosd on elm-ent-dm01 (this is with 1.95.1 not master):

May  1 11:24:54 elm-ent-dm02 phobosd[5081]: 2024-05-01 11:24:54.995937000 <ERROR> Media '054840L9' is locked by (hostname: elm-ent-dm01, owner: 3688211): Operation already in progress (114)
May  1 11:24:54 elm-ent-dm02 phobosd[5081]: 2024-05-01 11:24:54.995954000 <ERROR> Device '/dev/sg5' (S/N '10230057FB') is owned by host elm-ent-dm02 but contains medium '054840L9' which is locked by an other hostname elm-ent-dm01: Operation already in progress (114)
May  1 11:24:54 elm-ent-dm02 phobosd[5081]: 2024-05-01 11:24:54.995961000 <ERROR> Fail to init device '/dev/sg5', stopping corresponding device thread: Operation already in progress (114)
May  1 11:24:54 elm-ent-dm02 phobosd[5081]: 2024-05-01 11:24:54.995980000 <ERROR> setting medium '054840L9' to failed
May  1 11:24:54 elm-ent-dm02 phobosd[5081]: 2024-05-01 11:24:54.998588000 <ERROR> Request failed: PHLK2: Permission denied (13)
May  1 11:24:54 elm-ent-dm02 phobosd[5081]: 2024-05-01 11:24:54.998594000 <ERROR> Error when releasing medium '054840L9' with current lock (hostname elm-ent-dm01, owner 3688211): Permission denied (13)
May  1 11:24:54 elm-ent-dm02 phobosd[5081]: 2024-05-01 11:24:54.998597000 <ERROR> Error when releasing medium 054840L9 after setting it to status failed: Permission denied (13)
May  1 11:24:54 elm-ent-dm02 phobosd[5081]: 2024-05-01 11:24:54.998599000 <ERROR> setting device '10230057FB' to failed

patlucas commented 1 month ago

We will indeed try to release the ltfs reservation through a phobos admin command and clean DSS locks.

thiell commented 1 month ago

Awesome, thanks @patlucas, I appreciate your quick answers!

courrierg commented 1 month ago

phobosd should not leave DSS locks on the media it uses unless an error occurred. It would be interesting to see the logs of phobosd when it stops. Either you have an error message that indicates that phobosd did not release the lock or there is a bug.
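
When going through those logs, a small filter can list which media are reported as locked by another host. This is only a sketch: the pattern is written against the phobosd 1.95.1 error line quoted earlier in this issue and may not match other versions, and the function name is illustrative.

```python
import re

# Written against this phobosd 1.95.1 message:
#   "Media '054840L9' is locked by (hostname: elm-ent-dm01, owner: 3688211)"
LOCK_RE = re.compile(
    r"Media '(?P<medium>[^']+)' is locked by "
    r"\(hostname: (?P<host>[^,]+), owner: (?P<owner>\d+)\)"
)


def foreign_locks(lines, local_host):
    """Return (medium, host, owner) for media locked by a host other than local_host."""
    locks = []
    for line in lines:
        match = LOCK_RE.search(line)
        if match and match.group("host") != local_host:
            locks.append(
                (match.group("medium"), match.group("host"), int(match.group("owner")))
            )
    return locks


sample = [
    "May  1 11:24:54 elm-ent-dm02 phobosd[5081]: 2024-05-01 11:24:54.995937000 "
    "<ERROR> Media '054840L9' is locked by (hostname: elm-ent-dm01, owner: 3688211): "
    "Operation already in progress (114)",
]
print(foreign_locks(sample, local_host="elm-ent-dm02"))
```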

That part of the code was recently refactored, and master is in a relatively unstable state right now. The patches that should fix the bugs are partially integrated, and the rest will follow soon. Hopefully everything will be pushed to master by the end of the day. (A new health feature, configurable through the max_health parameter, is coming with it.)