XENONnT / outsource

Job submission of reprocessing

Failure when trying to download a rucio-corrupted file #137

Open FaroutYLq opened 7 months ago

FaroutYLq commented 7 months ago

See this for example: /scratch/yuanlq/workflows/runs/straxen_v2.2.1/xenonnt_offline/daniel_20240409/00/03/events_ID0000291.out.002. The failure happened here:

        strax.storage.common.DataCorrupted: Cannot open metadata for xnt_051694:peak_positions_cnn-iongm54rho

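The issue is that the rucio dataset xnt_051694:peak_positions_cnn-iongm54rho is incomplete even though its rule reports OK[3/0/0]: only chunks 000000, 000002, and 000004 are listed, and no metadata file appears among them: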

(XENONnT_development) yuanlq@ap23:/scratch/yuanlq/workflows/runs/straxen_v2.2.1/xenonnt_offline/daniel_20240409/00/03$ rucio list-rules xnt_051694:peak_positions_cnn-iongm54rho
/cvmfs/xenon.opensciencegrid.org/releases/nT/development/anaconda/envs/XENONnT_development/bin/rucio:4: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  __import__('pkg_resources').run_script('rucio-clients==32.8.0', 'rucio')
ID                                ACCOUNT     SCOPE:NAME                                STATE[OK/REPL/STUCK]    RSE_EXPRESSION      COPIES    SIZE    EXPIRES (UTC)    CREATED (UTC)
--------------------------------  ----------  ----------------------------------------  ----------------------  ------------------  --------  ------  ---------------  -------------------
e3667ddd171242b3a46121928bda4f09  production  xnt_051694:peak_positions_cnn-iongm54rho  OK[3/0/0]               UC_MIDWAY_USERDISK  1         N/A                      2024-04-07 06:54:55
(XENONnT_development) yuanlq@ap23:/scratch/yuanlq/workflows/runs/straxen_v2.2.1/xenonnt_offline/daniel_20240409/00/03$ rucio list-files xnt_051694:peak_positions_cnn-iongm54rho
/cvmfs/xenon.opensciencegrid.org/releases/nT/development/anaconda/envs/XENONnT_development/bin/rucio:4: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  __import__('pkg_resources').run_script('rucio-clients==32.8.0', 'rucio')
+-------------------------------------------------+--------------------------------------+-------------+------------+----------+
| SCOPE:NAME                                      | GUID                                 | ADLER32     | FILESIZE   | EVENTS   |
|-------------------------------------------------+--------------------------------------+-------------+------------+----------|
| xnt_051694:peak_positions_cnn-iongm54rho-000000 | 7422C99D-721C-43EC-BE4D-1DB2E932C4D1 | ad:3eb8895b | 66.119 MB  |          |
| xnt_051694:peak_positions_cnn-iongm54rho-000002 | C3B3BC57-4F75-464F-BFF4-20D6DA9F119D | ad:cc34ae4d | 66.412 MB  |          |
| xnt_051694:peak_positions_cnn-iongm54rho-000004 | 04106F2F-674D-4042-B5F0-210582EDA52E | ad:ad6b2416 | 65.421 MB  |          |
+-------------------------------------------------+--------------------------------------+-------------+------------+----------+
Total files : 3
Total size : 197.952 MB
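
A listing like the one above could be screened before a job is submitted. The following is a minimal sketch, not part of outsource, strax, or rucio: `check_dataset` is a hypothetical helper, and it assumes (based only on the names shown in this thread) that chunk files end in a six-digit index starting at 000000 and that a complete dataset would contain a file whose name includes `metadata`.

```python
import re

# Assumption: chunk DIDs end in "-NNNNNN" (six digits), as in the listing above.
CHUNK_RE = re.compile(r"-(\d{6})$")


def check_dataset(file_names):
    """Return (missing_chunk_indices, has_metadata) for a list of DID names."""
    present = set()
    for name in file_names:
        match = CHUNK_RE.search(name)
        if match:
            present.add(int(match.group(1)))

    # Assumption: a metadata file is recognizable by "metadata" in its name.
    has_metadata = any("metadata" in name for name in file_names)

    if not present:
        return [], has_metadata

    # Only gaps below the highest listed index can be detected this way.
    missing = [i for i in range(max(present) + 1) if i not in present]
    return missing, has_metadata


names = [
    "xnt_051694:peak_positions_cnn-iongm54rho-000000",
    "xnt_051694:peak_positions_cnn-iongm54rho-000002",
    "xnt_051694:peak_positions_cnn-iongm54rho-000004",
]
missing, has_metadata = check_dataset(names)
print(missing, has_metadata)  # [1, 3] False
```

Trailing chunks that were never uploaded cannot be detected from the names alone, so this check can flag a dataset as bad but cannot prove it is complete.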

We want a mechanism that allows a download to fail without crashing the job, and instead just modifies the to-process list accordingly.
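
One possible shape for such a mechanism is sketched below. It is only an illustration under stated assumptions: `split_downloadable` and the `download` callable are hypothetical names, and the local `DataCorrupted` class is a stand-in so the example runs without strax. In a real job one would presumably pass `errors=(strax.DataCorrupted,)` instead.

```python
class DataCorrupted(Exception):
    """Stand-in for strax.DataCorrupted so this sketch runs without strax."""


def split_downloadable(to_process, download, errors=(DataCorrupted,)):
    """Try to download each DID; return (ok_list, {failed_did: reason}).

    `download` is any callable that takes a DID string and raises one of
    `errors` when the data cannot be opened. Other exceptions propagate.
    """
    ok, failed = [], {}
    for did in to_process:
        try:
            download(did)
        except errors as exc:
            # Record the failure and drop the DID instead of aborting the job.
            failed[did] = str(exc)
        else:
            ok.append(did)
    return ok, failed


def fake_download(did):
    # Hypothetical downloader used only to exercise the sketch.
    if did == "xnt_051694:peak_positions_cnn-iongm54rho":
        raise DataCorrupted(f"Cannot open metadata for {did}")


to_process = [
    "xnt_051694:peak_positions_cnn-iongm54rho",
    "xnt_000000:example_ok-aaaaaaaaaa",  # made-up DID for illustration
]
to_process, failed = split_downloadable(to_process, fake_download)
print(to_process)
print(failed)
```

Returning the failures alongside the trimmed list keeps a record of what was skipped, so the corrupted datasets can be reported or reprocessed later rather than silently dropped.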

FaroutYLq commented 7 months ago

Similarly:

          File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/storage/common.py", line 585, in get_metadata
            raise strax.DataCorrupted(f"Cannot open metadata for {str(backend_key)}") from e
        strax.storage.common.DataCorrupted: Cannot open metadata for xnt_047451:merged_s2s-wpzg6lsm2m

In rucio, the file list again has gaps in the chunk numbering and no metadata file:

(XENONnT_development) yuanlq@ap23:/scratch/yuanlq/workflows/runs/straxen_v2.2.1/xenonnt_offline/daniel_20240409/00/01$ rucio list-files xnt_047451:merged_s2s-wpzg6lsm2m
/cvmfs/xenon.opensciencegrid.org/releases/nT/development/anaconda/envs/XENONnT_development/bin/rucio:4: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  __import__('pkg_resources').run_script('rucio-clients==32.8.0', 'rucio')
+-----------------------------------------+--------------------------------------+-------------+------------+----------+
| SCOPE:NAME                              | GUID                                 | ADLER32     | FILESIZE   | EVENTS   |
|-----------------------------------------+--------------------------------------+-------------+------------+----------|
| xnt_047451:merged_s2s-wpzg6lsm2m-000001 | A349D480-919A-4293-9389-94AAC9A418BD | ad:4dad0d51 | 104.664 MB |          |
| xnt_047451:merged_s2s-wpzg6lsm2m-000006 | 13B10ECB-89F6-447E-9811-B3F0B3D0B145 | ad:98efbf2f | 90.294 MB  |          |
| xnt_047451:merged_s2s-wpzg6lsm2m-000008 | 89C2A4FD-17AB-49EB-9FD9-F505CFD3007F | ad:f29a3675 | 103.123 MB |          |
| xnt_047451:merged_s2s-wpzg6lsm2m-000012 | BB6FFFBC-8870-4675-A1DE-FDB8AC173EC3 | ad:08b97e80 | 96.878 MB  |          |
| xnt_047451:merged_s2s-wpzg6lsm2m-000018 | B634B88A-E1F4-4EDF-98B1-B1E62EE5B472 | ad:f8469b09 | 84.926 MB  |          |
| xnt_047451:merged_s2s-wpzg6lsm2m-000022 | 277D901F-9B39-4255-B0EC-3110A0138171 | ad:7f5ef7f4 | 104.558 MB |          |
| xnt_047451:merged_s2s-wpzg6lsm2m-000023 | DEEF8EA0-88D2-4268-883E-E1C007A10E0D | ad:b94bf2c0 | 88.169 MB  |          |
| xnt_047451:merged_s2s-wpzg6lsm2m-000024 | 4602EB6F-B1EA-4F7E-AB89-FB4E05B5876E | ad:45622064 | 99.556 MB  |          |
| xnt_047451:merged_s2s-wpzg6lsm2m-000025 | F1A0C0CA-E796-4D01-838A-20BD3F97B7A9 | ad:3bde12d7 | 101.293 MB |          |
| xnt_047451:merged_s2s-wpzg6lsm2m-000030 | 97D5B5A2-F04A-4E84-8298-B81BA405A37E | ad:c895cb79 | 101.452 MB |          |
| xnt_047451:merged_s2s-wpzg6lsm2m-000031 | CE22442E-44A7-4BFC-9EEA-34BFE6C2E043 | ad:68a601d8 | 97.134 MB  |          |
| xnt_047451:merged_s2s-wpzg6lsm2m-000032 | B73EB4C6-C412-4016-B52F-FBE6B319AB2A | ad:08dcfaf2 | 91.443 MB  |          |
| xnt_047451:merged_s2s-wpzg6lsm2m-000040 | 30496F1A-B1BF-4DBB-8093-17F61BD2C2E5 | ad:cfc9ad19 | 102.889 MB |          |
| xnt_047451:merged_s2s-wpzg6lsm2m-000046 | 73CA8952-87BF-4BD7-B114-BA9A0A8BECBD | ad:61639038 | 97.875 MB  |          |
| xnt_047451:merged_s2s-wpzg6lsm2m-000047 | 3D8F97B8-5256-4401-8244-48DDD5944527 | ad:b33ef42e | 89.475 MB  |          |
+-----------------------------------------+--------------------------------------+-------------+------------+----------+
Total files : 15
Total size : 1.454 GB

In runDB:

[image: runDB screenshot]

FaroutYLq commented 7 months ago

Similarly, in /scratch/yuanlq/workflows/runs/straxen_v2.2.1/xenonnt_offline/daniel_20240409/00/00/events_ID0000083.out.002:

          File "/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.9/site-packages/strax/storage/common.py", line 585, in get_metadata
            raise strax.DataCorrupted(f"Cannot open metadata for {str(backend_key)}") from e
        strax.storage.common.DataCorrupted: Cannot open metadata for xnt_045342:peaklet_classification-p3m6pr2fhz

In rucio, the rule exists but holds no files (OK[0/0/0]):

(XENONnT_development) yuanlq@ap23:/scratch/yuanlq/workflows/runs/straxen_v2.2.1/xenonnt_offline/daniel_20240409/00/00$ rucio list-rules xnt_045342:peaklet_classification-p3m6pr2fhz
/cvmfs/xenon.opensciencegrid.org/releases/nT/development/anaconda/envs/XENONnT_development/bin/rucio:4: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  __import__('pkg_resources').run_script('rucio-clients==32.8.0', 'rucio')
ID                                ACCOUNT     SCOPE:NAME                                    STATE[OK/REPL/STUCK]    RSE_EXPRESSION      COPIES    SIZE    EXPIRES (UTC)    CREATED (UTC)
--------------------------------  ----------  --------------------------------------------  ----------------------  ------------------  --------  ------  ---------------  -------------------
c9620ea708a24d60a0b4e5780573101b  production  xnt_045342:peaklet_classification-p3m6pr2fhz  OK[0/0/0]               UC_MIDWAY_USERDISK  1         N/A                      2024-04-07 03:04:07

but there is no corresponding entry in runDB:

[image: runDB screenshot]
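
The `OK[0/0/0]` state in the `rucio list-rules` output above could also be caught with a simple text check. The sketch below is a hedged illustration: `rule_is_usable` is a hypothetical helper that only parses the `STATE[OK/REPL/STUCK]` column as printed in this thread, and it does not call rucio.

```python
import re

# Matches the STATE[OK/REPL/STUCK] column as printed above, e.g. "OK[3/0/0]".
STATE_RE = re.compile(r"^(\w+)\[(\d+)/(\d+)/(\d+)\]$")


def rule_is_usable(state):
    """Return True if the rule is OK, has at least one OK lock, and none pending or stuck."""
    match = STATE_RE.match(state.strip())
    if match is None:
        raise ValueError(f"Unrecognized rule state: {state!r}")
    status = match.group(1)
    n_ok = int(match.group(2))
    n_repl = int(match.group(3))
    n_stuck = int(match.group(4))
    return status == "OK" and n_ok > 0 and n_repl == 0 and n_stuck == 0


print(rule_is_usable("OK[0/0/0]"))  # False
print(rule_is_usable("OK[3/0/0]"))  # True
```

This is a necessary check rather than a sufficient one: the first example in this thread reports `OK[3/0/0]` and is still missing files.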