Open haozturk opened 6 months ago
Actually, the expiry date of the rule is set properly, but the rule has not been cleaned up even though it expired 3 days ago:
$ rucio rule-info 39a6afe366144e0db800f7bb67ba2cd8
Id: 39a6afe366144e0db800f7bb67ba2cd8
Account: rvaladao
Scope: user.rvaladao
Name: /UERJ/Datasets/USER
RSE Expression: T2_BR_UERJ
Copies: 1
State: STUCK
Locks OK/REPLICATING/STUCK: 3768/568949/685
Grouping: DATASET
Expires at: 2024-04-26 19:21:33
Locked: False
Weight: None
Created at: 2024-04-24 16:54:46
Updated at: 2024-04-29 14:11:58
Error: RequestErrMsg.TRANSFER_FAILED:TRANSFER ERROR: Copy failed (3rd pull). Last attempt: Request cancellation was requested.
Subscription Id: None
Source replica expression: None
Activity: User Subscriptions
Comment: None
Ignore Quota: False
Ignore Availability: False
Purge replicas: False
Notification: NO
End of life: None
Child Rule Id: None
This is what I see in judge-cleaner:
{"process": {"pid": 9}, "@timestamp": "2024-04-29T00:00:14.233Z", "message": "[7/8]: Deleting rule 39a6afe366144e0db800f7bb67ba2cd8 with expression T2_BR_UERJ", "error": {"type": null, "message": null, "stack_trace": null}, "log": {"level": "INFO", "logger": "root"}}
{"process": {"pid": 9}, "@timestamp": "2024-04-29T00:00:14.242Z", "message": "[7/8]: Locks detected for 39a6afe366144e0db800f7bb67ba2cd8", "error": {"type": null, "message": null, "stack_trace": null}, "log": {"level": "WARNING", "logger": "root"}}
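To see how often the cleaner hits this for a given rule, the log can be filtered by rule ID. The following is only a minimal sketch: it assumes the log is available as one JSON object per line with the `message`, `@timestamp` and `log.level` fields seen in the excerpt above, and `rule_log_entries` is a hypothetical helper, not part of Rucio.

```python
import json


def rule_log_entries(lines, rule_id, level=None):
    """Return (timestamp, level, message) tuples for log lines mentioning rule_id.

    Assumes one JSON object per line, shaped like the judge-cleaner excerpt.
    Lines that are empty or not valid JSON objects are skipped.
    """
    matches = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        message = record.get("message") or ""
        record_level = (record.get("log") or {}).get("level")
        if rule_id not in message:
            continue
        # Optional filter on the log level, e.g. "WARNING".
        if level is not None and record_level != level:
            continue
        matches.append((record.get("@timestamp"), record_level, message))
    return matches
```

Calling it with `level="WARNING"` keeps only the "Locks detected" entries, which makes it easy to check whether the same rule is skipped on every cleaner cycle.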
Checking further to understand what "Locks detected" means.
This is an ORA-00054 DB error, whose message is "resource busy and acquire with nowait specified or timeout expired". I suspect this is related to the size of the rule (~0.5M files). It might be that the DB call is timing out, or that an active query is locking the table. Will investigate further.
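For illustration, one generic way to cope with ORA-00054 is to retry the operation with a growing delay, in case the competing lock is short-lived. The sketch below is not how Rucio or judge-cleaner handles it; `retry_on_resource_busy` and `is_resource_busy` are hypothetical names, and detecting the error by searching the exception text for "ORA-00054" is an assumption made to keep the example free of any database driver.

```python
import time

ORA_RESOURCE_BUSY = "ORA-00054"


def is_resource_busy(exc):
    """Heuristic (assumption): the exception text carries the Oracle error code."""
    return ORA_RESOURCE_BUSY in str(exc)


def retry_on_resource_busy(operation, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation(), retrying with exponential backoff on ORA-00054.

    Any other exception, or ORA-00054 on the final attempt, is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as exc:
            if not is_resource_busy(exc) or attempt == attempts - 1:
                raise
            # Wait 1x, 2x, 4x, ... the base delay before trying again.
            sleep(base_delay * 2 ** attempt)
```

If the lock is held by a long-running query on a rule of this size, retrying alone would not help and the blocking session would still need to be identified.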
Bug Description
The UERJ site admin reported this problem, and I was able to reproduce it with the transfer_ops account. See the details below. Not all rule deletions fail:
Reproduction Steps
Expected Behavior
The rule deletion above should succeed.
Possible Solution
No response
Related Issues
No response