Open amanrique1 opened 3 months ago
You may look at DNs and timestamps of these datasets and raise your concern with the data-ops team. But I doubt this issue is relevant for this repository.
In data-ops, we have noticed many cases where PnR invalidates datasets[1] while their files stay valid. We can update these files on our side, but it would be more straightforward to do it directly in DBS when the dataset is invalidated, at least for future cases.
Hi @amanrique1, so far it has always been the responsibility of the people who invalidate the files in the data management system to take the relevant actions in the bookkeeping system as well. In the past this was mostly done by PnR. I agree we may benefit a lot from a system where we can automate this process and keep the two systems better in sync. But if this is not done in parallel in the tools the relevant team uses for the invalidation, and the requirement gets pushed upstream to the server side instead, then the project is much bigger than just automating an action. It may even involve cross-database checks between Rucio and DBS ... This kind of project has been discussed in the past and is on our radar for sure.
Hi @todor-ivanov, I think I didn't explain myself well. The idea of this ticket is to get internal DBS consistency: if PnR invalidates a dataset, all its files should become invalid as well. In Data Ops, we are working on a separate project for the Rucio-DBS sync.
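To make the requested rule concrete, here is a minimal in-memory sketch, not DBS code. The `Dataset` and `File` classes, the `INACTIVE_ACCESS_TYPES` set, and both function names are hypothetical and only illustrate the idea: detect valid files under an invalid or deleted dataset, and cascade the invalidation to the files when the dataset is invalidated.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical model for illustration only; this is not the DBS schema or client API.
INACTIVE_ACCESS_TYPES = {"INVALID", "DELETED"}


@dataclass
class File:
    lfn: str
    is_valid: bool = True


@dataclass
class Dataset:
    name: str
    access_type: str = "VALID"
    files: List[File] = field(default_factory=list)


def find_inconsistent_files(datasets: List[Dataset]) -> List[Tuple[str, str]]:
    """Return (dataset name, lfn) for every valid file in an inactive dataset."""
    return [
        (ds.name, f.lfn)
        for ds in datasets
        if ds.access_type in INACTIVE_ACCESS_TYPES
        for f in ds.files
        if f.is_valid
    ]


def invalidate_dataset(ds: Dataset) -> None:
    """Mark the dataset invalid and cascade the change to all of its files."""
    ds.access_type = "INVALID"
    for f in ds.files:
        f.is_valid = False
```

With this cascade in place, `find_inconsistent_files` would return an empty list for any dataset invalidated through `invalidate_dataset`; existing inconsistent cases would still need a one-off cleanup.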
While analyzing Rucio and DBS inconsistencies, the DM team discovered many valid files whose datasets were marked as invalid or had access type deleted. Is there any reason for this behavior? Some examples are