cernopendata / opendata.cern.ch

Source code for the CERN Open Data portal
http://opendata.cern.ch/
GNU General Public License v2.0

CMS: check /HIAllPhysics/HIRun2010-ZS-v2/RECO file content #3683

Open tiborsimko opened 2 months ago

tiborsimko commented 2 months ago

The record https://opendata.cern.ch/record/14011 says:

    "number_events": 75482193,
    "number_files": 28954,
    "size": 166030973333340

CMS DAS for /HIAllPhysics/HIRun2010-ZS-v2/RECO says:

        "nevents": 79329398,
        "nfiles": 32715,
        "size": 190908704707416
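The gap between the two sets of figures can be worked out directly. The following is a minimal sketch that uses only the numbers quoted above; the variable names and the field mapping are illustrative and not part of any portal or DAS tooling.

```python
# Figures copied verbatim from the record and from CMS DAS, as quoted above.
record = {"number_events": 75482193, "number_files": 28954, "size": 166030973333340}
das = {"nevents": 79329398, "nfiles": 32715, "size": 190908704707416}

# Map each record field to the DAS field that holds the same quantity
# (this mapping is an assumption made for the illustration).
field_map = {"number_events": "nevents", "number_files": "nfiles", "size": "size"}

# Positive values mean that DAS reports more than the record does.
deltas = {
    rec_key: das[das_key] - record[rec_key]
    for rec_key, das_key in field_map.items()
}

for key, delta in deltas.items():
    print(f"{key}: DAS reports {delta} more than the record")
```

With these inputs, DAS reports 3847205 more events, 3761 more files, and 24877731374076 more bytes than the record.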

The record file index says:

This number of files, the total size, and the listing of the 32715 individual file paths in the record index file are fully consistent with what CMS DAS reports.

However, the actual directory content on EOSPUBLIC under /eos/opendata/cms/hidata/HIRun2010/HIAllPhysics/RECO/ZS-v2/ shows:

1) We should check and update the total number of events and files as appropriate.

2) There appear to be 1115 missing files; the full list is at https://simko.web.cern.ch/tmp/recid-14011-missing-files.txt
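A list of missing files like this one can be produced by comparing the paths in the record index file against a listing of the EOS directory. The sketch below is only an illustration of that set difference, not the script actually used here. It assumes that index entries may carry an XRootD-style prefix in front of the `/eos/` path, and the example paths are invented.

```python
def normalise(path):
    """Strip whitespace and anything before '/eos/' (e.g. a URL scheme and host)."""
    path = path.strip()
    start = path.find("/eos/")
    return path[start:] if start >= 0 else path


def missing_files(expected, present):
    """Return the sorted expected paths that do not occur in the present listing."""
    expected_set = {normalise(p) for p in expected if p.strip()}
    present_set = {normalise(p) for p in present if p.strip()}
    return sorted(expected_set - present_set)


# Hypothetical inputs: two index entries, of which only one exists on disk.
expected = [
    "root://eospublic.cern.ch//eos/opendata/example/a.root",
    "root://eospublic.cern.ch//eos/opendata/example/b.root",
]
present = ["/eos/opendata/example/a.root"]

missing = missing_files(expected, present)
print(missing)
```

In practice both inputs would be read from files, one path per line, and the length of the result compared against the expected count of missing files.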

tiborsimko commented 1 month ago

@katilp @jmhogan @tpmccauley Can you please have a look at the list of missing files above? We might need to schedule a data transfer to recover these files from the grid.

jmhogan commented 1 month ago

@tiborsimko thanks for the poke, I had missed this in my spam folder (I've got filtering set up now, hopefully that happens less).

It does look like these missing files are only reported as existing on T1_US_FNAL_TAPE, so perhaps their transfer didn't complete successfully (the DAS record seems to suggest that).

@katilp Do I have permission to set up Rucio rules as DPOA? I haven't done it before, so I'm not sure.

katilp commented 1 month ago

@jmhogan Sorry for the delay, it should all be in https://cms-opendata-releaseguide.docs.cern.ch/transfer/site_setup/ Let me check.

katilp commented 1 month ago

@jmhogan I've added you to t3_ch_cern_opendata_local

jmhogan commented 1 month ago

Missing blocks are:

None of them have a rule placing them at our site, so I think I just need to make 7 new rules for these particular blocks.

jmhogan commented 1 month ago

Rules:

    bash-4.4$ rucio add-rule cms:/HIAllPhysics/HIRun2010-ZS-v2/RECO#2c90d627-4424-4126-869a-840defda80f1 1 T3_CH_CERN_OpenData
    5d3b53e10ca34e868923fad580f99dc9
    bash-4.4$ rucio add-rule cms:/HIAllPhysics/HIRun2010-ZS-v2/RECO#bebc4ce9-7fad-411b-a789-d9b981e46f6f 1 T3_CH_CERN_OpenData
    92373b0f64674b1c8ed6bd4b0cb4093a
    bash-4.4$ rucio add-rule cms:/HIAllPhysics/HIRun2010-ZS-v2/RECO#52656ab9-000d-4b32-ae37-6e60f0dc48e9 1 T3_CH_CERN_OpenData
    63d6f7ce6daa4248adbbd8f4d6bd7a73
    bash-4.4$ rucio add-rule cms:/HIAllPhysics/HIRun2010-ZS-v2/RECO#51e3f795-4299-4bf9-b1d1-75064bddb15d 1 T3_CH_CERN_OpenData
    48cbd8bcd7974cb0a0714e3d4a31cd49
    bash-4.4$ rucio add-rule cms:/HIAllPhysics/HIRun2010-ZS-v2/RECO#50e679df-4d8c-4a6e-82dd-80cf5bf69d62 1 T3_CH_CERN_OpenData
    70eaa77fd2ff41a79a5d588591fbd79e
    bash-4.4$ rucio add-rule cms:/HIAllPhysics/HIRun2010-ZS-v2/RECO#3953b27d-5e00-444b-9621-f007c8f6c7f9 1 T3_CH_CERN_OpenData
    70d16e65cc7c4d209ebd62236545a3b8
    bash-4.4$ rucio add-rule cms:/HIAllPhysics/HIRun2010-ZS-v2/RECO#3852bf2b-3173-4df9-b267-f3a276a7ccdb 1 T3_CH_CERN_OpenData
    905f6160f5c140d6bc7294aa116833e9
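
The seven commands share one pattern: the dataset name, a `#` separator, the block ID, one copy, and the destination RSE. As a hedged sketch of that pattern, the snippet below rebuilds the command lines from the block IDs used above. It only prints the commands and does not contact Rucio; the helper name `add_rule_command` is invented for this illustration.

```python
DATASET = "cms:/HIAllPhysics/HIRun2010-ZS-v2/RECO"
RSE = "T3_CH_CERN_OpenData"
COPIES = 1

# Block IDs taken from the rules listed above.
BLOCK_IDS = [
    "2c90d627-4424-4126-869a-840defda80f1",
    "bebc4ce9-7fad-411b-a789-d9b981e46f6f",
    "52656ab9-000d-4b32-ae37-6e60f0dc48e9",
    "51e3f795-4299-4bf9-b1d1-75064bddb15d",
    "50e679df-4d8c-4a6e-82dd-80cf5bf69d62",
    "3953b27d-5e00-444b-9621-f007c8f6c7f9",
    "3852bf2b-3173-4df9-b267-f3a276a7ccdb",
]


def add_rule_command(block_id):
    """Build the 'rucio add-rule' command line for one block (illustrative helper)."""
    return f"rucio add-rule {DATASET}#{block_id} {COPIES} {RSE}"


commands = [add_rule_command(block_id) for block_id in BLOCK_IDS]
for command in commands:
    print(command)
```

Printing the commands first makes it easy to review them before running anything against the production Rucio instance.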