dmwm / CMSRucio

7 stars 31 forks source link

Site admins cannot declare bad replicas #587

Closed dynamic-entropy closed 9 months ago

dynamic-entropy commented 1 year ago

The rucio declare_bad_file_replicas api does not pass rse_id as a kwarg.

https://github.com/rucio/rucio/blob/7f0d229ac0b3bc7dec12c6e158bea2b82d414a3b/lib/rucio/api/replica.py#L88-L90

but it has been erroneously used in the policy package to determine site_admin. https://github.com/dmwm/CMSRucio/blob/a85debbedc951d42b214564338630582c5699e58/src/policy/CMSRucioPolicy/permission.py#L732


Modify the rucio API to pass the relevant info or Create a new API endpoint to satisfy the requirements.

ivmfnal commented 1 year ago

never mind

ivmfnal commented 1 year ago

It looks like under some circumstances, declare_bad_file_replicas() has to call the policy module without rse_id. Is there a way to handle such cases ?

ivmfnal commented 1 year ago

Rucio issue is created: https://github.com/rucio/rucio/issues/6339

dynamic-entropy commented 1 year ago

It looks like under some circumstances, declare_bad_file_replicas() has to call the policy module without rse_id. Is there a way to handle such cases ?

Hi @ivmfnal

Which part of the policy module? Also, the rse_ids become part of kwargs, so this should not affect anything else. I'm sorry if I didn't understand your question. Can you give more context?

ivmfnal commented 1 year ago

never mind

dynamic-entropy commented 1 year ago

The pfn strategy does not work for CMS at the moment. We cannot do pfn -> lfn + rse. https://github.com/dmwm/CMSRucio/issues/530#issuecomment-1594712563

dynamic-entropy commented 1 year ago

It would be nice to have a fix for that along with this if you have time.

ivmfnal commented 1 year ago

so I can assume the replicas will be given by dictionaries with rse in them?

ivmfnal commented 1 year ago

Here is the PR in the main rucio repository: https://github.com/rucio/rucio/pull/6340

dynamic-entropy commented 1 year ago

We cannot assume that, if the api endpoint accepts both formats. But I see you have used a pfn_to_rse function. Does it work for cms rses?

In this case, this would solve the other problem of the wrong translation algorithm in the issue I mentioned above. #530

ivmfnal commented 1 year ago

you probably mean get_pfn_to_rse ? I am not sure if it will work for CMS, but we will see. I use it only to get list of RSEs for the list of replicas to check permissions.

dynamic-entropy commented 1 year ago

But you also, generate the replica_lst which then takes care of converting pfns -> {lfn, rse}, which then removes its ( cases where pfns are specified instead of replica list ) need from functions down the chain.

ivmfnal commented 1 year ago

No, in case them the replicas are given as PFNs, the replica_lst is not generated

dynamic-entropy commented 1 year ago

Ah, maybe I was not through while checking this. But can this not be done?

ivmfnal commented 1 year ago

I am sorry, what are you suggesting not to do ?

If you mean using get_pfn_to_rse, then it has to be done in case the replicas were given by PFNs merely to get the list of RSEs for the replicas to check permission for the user to declare bad replicas there. The code, as you can see, does not use any other information received from get_pfn_to_rse.

dynamic-entropy commented 1 year ago

Hello. I meant, can we generate, replica_lst even when pfns are provided?
So that the pfn to lfn occurs at the API level, and need not be resolved in core.

While doing this we can ensure that this pfn to lfn conversion works for all experiments. If it is not direct and simple, perhaps it should be made configurable.

This comment is not about solving "sending rse_id" to the permissions module. Your changes enable that.

ivmfnal commented 1 year ago

For the purposes of the original request - to make sure that RSE ids are passed to the permissions function correctly - there is no need to do that and I would rather not do that.

dynamic-entropy commented 1 year ago

Yes, exactly. The comment is to ask/suggest if doing so would help with the other issue.

ivmfnal commented 1 year ago

I had to start new PR: https://github.com/rucio/rucio/pull/6350

ericvaandering commented 1 year ago

I linked the Rucio PR to this.

dynamic-entropy commented 9 months ago

Thanks to the PR, a site admin can now declare a file replica bad using the python client. The declaration using CLI is still pending and shall be solved in https://github.com/dmwm/CMSRucio/issues/530