Open thiell opened 2 weeks ago
Hello @thiell,
Phobosd is complaining here because each active drive should have a lock in the lock table of the DSS. This lock is different from the adm_status `locked`. It is a concurrency lock that is used to avoid concurrent access to resources. Phobosd is complaining that the drive has no lock, which is not normal. This is most likely a bug. In your case, since you used `phobos drive lock`, phobosd was trying to remove this drive from its list, and one of the steps to do this is to remove the DSS lock, which it failed to do. This is not an issue for you since you locked the drive, so everything should be fine. But there is definitely a bug that we need to investigate. Was this on the latest master branch?
One thing that you could do is check that all the tapes and drives that are in use do have a lock in the lock table. This is especially important for tapes, otherwise several phobosd instances might want to use the same tape. Unfortunately, you have to do an SQL query to list the locks. There is no `lock list` command for now. `select * from lock;` should do the trick.
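Once you have the rows, the comparison itself is simple. Here is a minimal sketch of it in Python, only as an illustration: the `type` and `id` keys, the `find_missing_locks` helper and all the sample values are assumptions made for the example, not the actual DSS schema, so adapt them to whatever the query returns on your system.

```python
def find_missing_locks(in_use, lock_rows):
    """Return the (type, id) pairs that are in use but have no lock row."""
    # Assumed row layout: each lock row exposes a resource type and an id.
    locked = {(row["type"], row["id"]) for row in lock_rows}
    return sorted(res for res in set(in_use) if res not in locked)


# Hypothetical sample data, not real output.
in_use = [("device", "drive0"), ("media", "TAPE01"), ("media", "TAPE02")]
lock_rows = [
    {"type": "device", "id": "drive0"},
    {"type": "media", "id": "TAPE01"},
]

print(find_missing_locks(in_use, lock_rows))
```

Anything it prints is a resource that is in use without a concurrency lock, which is what you want to look at first for tapes.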
One possible cause for the issue might be a double unlock, which we have seen in the past. To check this, you can run phobosd at the debug level and you should see the logs for all the lock and unlock operations. They are prefixed with `lock:` or `unlock:` for easier grep. If you see this each time you run `phobos drive unlock`, this might be an indication that this is in fact a double unlock.
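If the debug log gets long, a small script can do the matching for you. This is a rough sketch, not something shipped with phobos: it assumes that the resource name is the first token after the `lock:` or `unlock:` prefix, which may not match the real log format, and the `find_double_unlocks` helper and the sample lines are made up for the example.

```python
import re
from collections import Counter

# Assumed line shape: "... lock: <resource> ..." or "... unlock: <resource> ...".
LOCK_RE = re.compile(r"\b(lock|unlock):\s*(\S+)")


def find_double_unlocks(lines):
    """Return resources that were unlocked while no lock was recorded."""
    held = Counter()
    suspects = []
    for line in lines:
        match = LOCK_RE.search(line)
        if match is None:
            continue  # not a lock or unlock log line
        operation, resource = match.groups()
        if operation == "lock":
            held[resource] += 1
        elif held[resource] > 0:
            held[resource] -= 1
        else:
            # Unlock with no matching lock seen in this log.
            suspects.append(resource)
    return suspects


# Hypothetical log lines, not real phobosd output.
sample = [
    "DEBUG lock: drive0",
    "DEBUG unlock: drive0",
    "DEBUG unlock: drive0",
]
print(find_double_unlocks(sample))
```

Note that a lock taken before the log started would also show up as a suspect, so it is best to run this on a log that covers the whole phobosd session.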
Hello @courrierg!
Thanks! Yes, it was with the latest master branch. I don't lock drives very often but I will report back if it happens every time.
With the latest master branch, I wanted to run ltfsck on a tape that had failed:
But when I locked the corresponding drive:
I saw these phobosd errors:
However, locking seems to have worked:
It's a bit unclear to me why phobosd complained.