Closed lemora closed 1 month ago
What does "pool with restore" mean? Does it mean a pool that has a restore in its queue? If that pool goes down, all its restores are gone and no replica of the file is there. The issue, I thought, was just a pool going down with all the replicas on it unavailable. Users try to access the files and trigger staging. Meanwhile the pool comes back up, but none of the files being staged can be read because there are staging requests in PM (Issue #7587).
Well, the phrasing is perhaps a bit convoluted. We also have the issue that we are not able to cancel ongoing restores on the tape side, but that might be hard to address.
Nobody had opened the agreed-upon issue when I checked yesterday, so I created this one. The second part (with the reference to RequestContainerV5) is now redundant with https://github.com/dCache/dcache/issues/7587, of course.
I'll close this issue and we can follow up in the other one.
Similarly, when a request in dCache is cancelled after it has already triggered a restore on the tape system side, this can result in a second restore because dCache is no longer aware of, or tracking, the first one.
Additionally, when trying to access an existing disk copy after a pool with the file comes up, PoolManager refuses to serve it because it sees that a restore is ongoing and assumes that there is no replica on disk.
The second issue probably needs to be fixed in https://github.com/dCache/dcache/blob/052d56b970cadbba2ac3e212870eaf50cb13883b/modules/dcache/src/main/java/diskCacheV111/poolManager/RequestContainerV5.java#L833
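
To make the second issue concrete, here is a rough sketch of the decision order that would avoid it: look for an online disk replica first, and only fall back to the ongoing restore if there is none. This is not the actual `RequestContainerV5` code. The class `ReadRequestRouter`, the enum `Decision`, and the method `decide` are hypothetical names invented for illustration, and the real state machine is considerably more involved.

```java
import java.util.Set;

// Hypothetical illustration only; none of these names exist in dCache.
final class ReadRequestRouter {

    enum Decision { SERVE_FROM_DISK, WAIT_FOR_RESTORE, START_RESTORE }

    /**
     * Decides how to handle a read request for a single file.
     *
     * @param onlinePoolsWithReplica pools that are currently up and hold a disk copy
     * @param restoreInProgress      whether a restore for this file is already tracked
     */
    static Decision decide(Set<String> onlinePoolsWithReplica, boolean restoreInProgress) {
        // Check for a readable disk copy first, even if a restore is already
        // ongoing: a pool holding the file may have come back up in the meantime.
        if (!onlinePoolsWithReplica.isEmpty()) {
            return Decision.SERVE_FROM_DISK;
        }
        // No disk copy is reachable: join the existing restore rather than
        // triggering a second one on the tape system.
        if (restoreInProgress) {
            return Decision.WAIT_FOR_RESTORE;
        }
        return Decision.START_RESTORE;
    }
}
```

With this ordering, a read that arrives after the pool has come back is served from disk even though a restore for the same file is still being tracked. What should then happen to the restore that is no longer needed is a separate question and is not covered here.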