Closed by ericvaandering 1 year ago
Having implemented the functionality described in the proposal, I added the "disabled" CE run state to the monitor.
I used T2_AT_Vienna once as a test site. I disabled the RSE using the boolean RSE attribute CE_config.ce_disabled, started the run for the RSE manually, and it is now shown as disabled in the monitor:
Summary page: https://cmsweb.cern.ch/rucioconmon/ce/index?view=-ce_run
RSE page: https://cmsweb.cern.ch/rucioconmon/ce/show_rse?rse=T2_AT_Vienna
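For anyone repeating this test, the following is a minimal sketch of setting a single RSE attribute while checking that no other attribute disappears as a side effect. It assumes a client object exposing `list_rse_attributes(rse)` and `add_rse_attribute(rse, key, value)`, as the Rucio Python client does. `InMemoryClient` and `set_attribute_checked` are hypothetical names used only for illustration, and the attribute key is passed in explicitly because this thread shows both `CE_config.ce_disabled` and `CE_cfg.ce_disabled`.

```python
class InMemoryClient:
    """Hypothetical stand-in for a Rucio client, for demonstration only."""

    def __init__(self, attributes):
        # Copy the input so the caller's dictionaries are never mutated.
        self._attrs = {rse: dict(kv) for rse, kv in attributes.items()}

    def list_rse_attributes(self, rse):
        return dict(self._attrs[rse])

    def add_rse_attribute(self, rse, key, value):
        self._attrs[rse][key] = value
        return True


def set_attribute_checked(client, rse, key, value):
    """Set one RSE attribute and return the names of attributes that vanished."""
    before = client.list_rse_attributes(rse)
    client.add_rse_attribute(rse, key, value)
    after = client.list_rse_attributes(rse)
    return sorted(k for k in before if k not in after)


# Attribute values taken from the dump quoted later in this thread.
client = InMemoryClient({
    "T2_AT_Vienna": {
        "rule_approvers": "useren,nhoerman,ebirngru",
        "site_admins": "useren,nhoerman,ebirngru",
    }
})

# The key name is an assumption; use whichever spelling the site actually reads.
lost = set_attribute_checked(client, "T2_AT_Vienna", "CE_config.ce_disabled", True)
print("attributes lost:", lost)
```

Against a real Rucio instance, the stub would be replaced by an authenticated client; a non-empty `lost` list would indicate that the change removed more than intended.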
Hello @ivmfnal, can you please review the script/command that you used to set/delete the RSE attributes for Vienna? Apparently, it deleted all RSE attributes that were set for the site. The site has been down since the change (see https://cmssst.web.cern.ch/siteStatus/detail.html?site=T2_AT_Vienna ). This was the RSE config at 1030 hrs:
❯ cat vienna_site_config_0ct9_2023_1030hrs
Settings:
=========
availability: 7
availability_delete: True
availability_read: True
availability_write: True
credentials: None
delete_protocol: 1
deterministic: True
domain: ['lan', 'wan']
id: 0465a108757a4acb8f2f18085b8e25f7
lfn2pfn_algorithm: cmstfc
qos_class: None
read_protocol: 1
rse: T2_AT_Vienna
rse_type: DISK
sign_url: None
staging_area: False
third_party_copy_read_protocol: 1
third_party_copy_write_protocol: 1
verify_checksum: True
volatile: False
write_protocol: 1
Attributes:
===========
CE_cfg.ce_disabled: True
rule_approvers: useren,nhoerman,ebirngru
site_admins: useren,nhoerman,ebirngru
Protocols:
==========
  davs
    domains: '{"lan": {"read": 0, "write": 0, "delete": 0}, "wan": {"read": 1, "write": 1, "delete": 1, "third_party_copy_read": 1, "third_party_copy_write": 1}}'
    extended_attributes: None
    hostname: eos.grid.vbc.ac.at
    impl: rucio.rse.protocols.gfal.Default
    port: 8443
    prefix: /eos/vbc/experiments/cms/
    scheme: davs
  root
    domains: '{"lan": {"read": 0, "write": 0, "delete": 0}, "wan": {"read": 3, "write": 3, "delete": 3, "third_party_copy_read": 3, "third_party_copy_write": 3}}'
    extended_attributes: None
    hostname: eos.grid.vbc.ac.at
    impl: rucio.rse.protocols.gfal.Default
    port: 1094
    prefix: //eos/vbc/experiments/cms/
    scheme: root
Usage:
======
  expired
    files: 30403
    free: None
    rse: T2_AT_Vienna
    rse_id: 0465a108757a4acb8f2f18085b8e25f7
    source: expired
    total: 178245012608213
    updated_at: 2023-10-09 08:20:28
    used: 178245012608213
  obsolete
    files: 4
    free: None
    rse: T2_AT_Vienna
    rse_id: 0465a108757a4acb8f2f18085b8e25f7
    source: obsolete
    total: 3069665831
    updated_at: 2023-10-09 08:20:19
    used: 3069665831
  rucio
    files: 196263
    free: None
    rse: T2_AT_Vienna
    rse_id: 0465a108757a4acb8f2f18085b8e25f7
    source: rucio
    total: 446405914085328
    updated_at: 2023-10-08 20:16:51
    used: 446405914085328
  static
    files: None
    free: None
    rse: T2_AT_Vienna
    rse_id: 0465a108757a4acb8f2f18085b8e25f7
    source: static
    total: 500000000000000
    updated_at: 2022-09-19 09:46:47
    used: 500000000000000
  unavailable
    files: 9
    free: None
    rse: T2_AT_Vienna
    rse_id: 0465a108757a4acb8f2f18085b8e25f7
    source: unavailable
    total: 4846303858
    updated_at: 2023-10-09 10:30:02
    used: 4846303858
RSE limits:
===========
MinFreeSpace: 50000000000000 B
Are you saying you used the suggested command and it removed all the attributes? Which command are you talking about?
I see some attributes in the output you included:
Attributes:
===========
CE_cfg.ce_disabled: True
rule_approvers: useren,nhoerman,ebirngru
site_admins: useren,nhoerman,ebirngru
Hello, no, I did not.
But your testing of the RSE attribute seems to coincide with the disappearance of other attributes and thus errors on the site capacity page.
The rule_approvers and site_admins attributes are added by the sync scripts. That is why we see them.
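To make this kind of comparison less error-prone, the following sketch extracts the `Attributes:` section from a saved RSE dump like the one above and reports which expected keys are missing. It assumes the dump layout shown in this thread (a section title followed by a line of `=` characters); `parse_attributes` and the `EXPECTED` set are hypothetical and would need to be adapted to the attributes a site actually relies on.

```python
def parse_attributes(dump_text):
    """Return the key/value pairs found in the 'Attributes:' section of a dump."""
    lines = [ln.strip() for ln in dump_text.splitlines()]
    attrs = {}
    in_section = False
    for i, line in enumerate(lines):
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        # A section header is any line followed by a rule made only of '='.
        if nxt and set(nxt) == {"="}:
            in_section = (line == "Attributes:")
            continue
        # Skip everything outside the section, blank lines, and the rule itself.
        if not in_section or not line or set(line) == {"="}:
            continue
        key, sep, value = line.partition(":")
        if sep:
            attrs[key.strip()] = value.strip()
    return attrs


# Excerpt copied from the dump quoted in this thread.
SAMPLE = """\
Attributes:
===========
CE_cfg.ce_disabled: True
rule_approvers: useren,nhoerman,ebirngru
site_admins: useren,nhoerman,ebirngru
Protocols:
==========
davs
"""

# Hypothetical list of keys to check for; extend it with site-specific ones.
EXPECTED = {"rule_approvers", "site_admins"}

found = parse_attributes(SAMPLE)
missing = sorted(EXPECTED - set(found))
print("found:", sorted(found))
print("missing:", missing)
```

Running the same check on dumps taken before and after a change would show which attributes disappeared in between.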
@dynamic-entropy Will the sync scripts or any other automated procedures override CE_config.* attributes?
No, the sync scripts set some attributes like rule_approvers and site_admins.
@dynamic-entropy I probably wiped those attributes accidentally sometime this weekend while debugging. The tool I currently use does not do blanket removals.
Alright, no worries. I just wanted to make sure we do not have side effects.
It seems this is not a bug.
Here is a proposal for how to push some CE configuration into the Rucio configuration while keeping the configuration file: https://docs.google.com/document/d/1ynvM-qZlRMPL3DZvnpAjfykBjrsTWuSm0DlnpAjTM2g/edit?usp=sharing
It is documented here: https://github.com/ivmfnal/cms_consistency/blob/master/site_cmp3/README.rst#parameters-controlled-by-site-admin
This is actually already implemented and in production.