Closed by ericvaandering 10 years ago
The plan at this point:
Step 3 above appears correct upon inspection of DBS3 and SiteDB (https://hypernews.cern.ch/HyperNews/CMS/get/comp-ops/1674/1/1/1.html),
so that is the plan now.
See Eric's planning document: https://www.evernote.com/shard/s48/sh/006ed13a-73bd-4a25-ac85-fae9da2728db/2b16a1a9dba95e5113dda5ace01070d9
tentative way:
```python
import json
import subprocess

# Query the PhEDEx data service for the block replicas of one dataset
cmd = ('curl -ks "https://cmsweb.cern.ch/phedex/datasvc/json/prod/'
       'BlockReplicaSummary?dataset=/SingleMu/Run2012B-TOPMuPlusJets-22Jan2013-v1/AOD&complete=y"')
j = subprocess.check_output(cmd, shell=True)
result = json.loads(j)
blocks = result['phedex']['block']
for b in blocks:
    bname = b['name']
    print bname
    for r in b['replica']:
        site = r['node']
        print site
```
plan better detailed:
well.. need: 2a. use the new SiteDB API to go from PNN to PSN and put the PSN as site name in the JDL. But this is going to be the identity matrix still for some time, no sweat.
"2a. use new SiteDB API to go from PNN to PSN and put PSN as site name in JDL"
Exactly. It's the identity matrix for Tier2s but not for Tier1s, I'm afraid, so this needs to be there.
I guess the question with #6 is whether you want to do this or just educate users that if they use free-form they cannot publish anymore. I guess with no other established way to write to /store/group you cannot?
Yes. Even if for T1s I could get some mileage mapping T1_ccsite* to T1_cc_site, better to do one step at a time. Indeed the problem with #6 is /store/group. One possibility is to deprecate it in Crab2 and tell users to go to Crab3. I am not even sure how much it is used; we do not keep up-to-date recipes for srm_paths anywhere. There are many solutions, it is only a question of whether we want to change established functionality.
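The "map T1_ccsite* to T1_cc_site" idea above could be sketched as a suffix strip on the PNN. This is a hypothetical illustration (`t1_pnn_to_psn` and the suffix list are assumptions, not CRAB2 code):

```python
import re

def t1_pnn_to_psn(pnn):
    # Strip a storage qualifier such as _Buffer, _Disk or _MSS from a
    # T1 PNN to recover the processing site name; leave other names alone.
    return re.sub(r'^(T1_[A-Z]{2}_[A-Za-z0-9]+)_(Buffer|Disk|MSS)$', r'\1', pnn)
```

T2 names pass through unchanged, which matches the identity-matrix observation above.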
This might be tenable. Writing the PNN into DBS is the second milestone of this project (being able to use it when you find it is the first). It might be reasonable at that point to direct people who need to publish into /store/group to CRAB3.
We can defer a decision until later to be based on how many people do this and where CRAB3 is at that time w.r.t. workflows people need to run.
about 2a. above, will use the SiteDB API directly; here's an example:

```
belforte@lxplus0065/TESTCRAB> curl -ks --cert $X509_USER_PROXY --key $X509_USER_PROXY "https://cmsweb.cern.ch/sitedb/data/prod/data-processing" | head -10
{"desc": {"columns": ["phedex_name", "psn_name", "site_name"]}, "result": [
 ["T2_HU_Budapest", "T2_HU_Budapest", "BUDAPEST"]
,["T2_CN_Beijing", "T2_CN_Beijing", "BEIJING-LCG2"]
,["T2_PL_Warsaw", "T2_PL_Warsaw", "ICM"]
,["T3_IT_Napoli", "T3_IT_Napoli", "INFN-NAPOLI-CMS"]
,["T2_IT_Pisa", "T2_IT_Pisa", "INFN-PISA"]
,["T3_US_UIowa", "T3_US_UIowa", "GROW-PROD"]
,["T2_CH_CSCS", "T2_CH_CSCS", "CSCS-LCG2"]
,["T3_US_Princeton", "T3_US_Princeton", "Princeton"]
,["T2_FI_HIP", "T2_FI_HIP", "FI_HIP_T2"]
```
The SiteDB API requires login with a CMS VOMS proxy, so make sure X509_USER_PROXY is defined: #1144
Now can do:

```python
import subprocess
import cjson

# Fetch the PNN -> PSN mapping from SiteDB and build a lookup dict
cmdS = ('curl -ks --cert $X509_USER_PROXY --key $X509_USER_PROXY '
        '"https://cmsweb.cern.ch/sitedb/data/prod/data-processing"')
cj = subprocess.check_output(cmdS, shell=True)
dataProcessingDict = cjson.decode(cj)
pnn2psn = {}
for s in dataProcessingDict['result']:
    pnn2psn[s[0]] = s[1]

print pnn2psn['T2_IT_Pisa']
```
update:
- retrieve PNN from PhEDEx: done
- pnn2psn from SiteDB: done
- passing the list of PSNs to the JDL: done

need to:
- implement black/white lists
- deal with SE names in local-scope DBS using the new PhEDEx API
- report the PNN when publishing

I guess PNN names from local DBS is already done? Or is that part of #2?
That's exactly #2: get an SE as origin_site and convert it to a PNN.
No, it’s not. What if the origin_site_name contains a PNN already?
Yes, I think I'll do something like: if origin_site_name starts with T[0-4], assume it is a PNN; else assume it is an SE host name.
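The heuristic above can be sketched in a few lines (the helper name is illustrative, not CRAB2 code):

```python
import re

def looks_like_pnn(origin_site_name):
    # A name starting with a tier prefix T0..T4 (e.g. T2_IT_Pisa) is taken
    # to be a PNN; anything else is assumed to be an SE host name.
    return re.match(r'T[0-4]_', origin_site_name) is not None
```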
You could steal the cmsname regex from lexicon.py
Crab2 uses lexicon.py already; I extended it to validate cmsname.
About black/white lists... currently the old, SE-name-based blackWhiteListParser is used in a lot of places. Rather than changing all that code, I will add a PNN-based parser next to the SE and CE parsers and use it where needed for remoteglidein. And so long to all other schedulers.
Rather: only care about PNNs when reading dataset locations, convert to PSNs as soon as possible, and then apply all black/white lists on PSNs. This may sort of change what override_data_location does: indeed we want to override the PSN, not the PNN. But I do not like to change established options...
So let's avoid applying any b/w list at job-splitting time, only at submission time. Will keep override_data_location as it is, to override the DataLocation result. For this, opened #1152.
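Applying the black/white lists on PSNs at submission time could look like the following sketch (function name and signature are illustrative assumptions, not CRAB2 code):

```python
def apply_bw_lists(psns, whitelist=None, blacklist=None):
    # Start from the PSNs holding the data; intersect with the user's
    # whitelist (if any) and then drop anything on the blacklist.
    sites = set(psns)
    if whitelist:
        sites &= set(whitelist)
    if blacklist:
        sites -= set(blacklist)
    return sorted(sites)
```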
And make sure to filter out tape locations early! For this, opened #1153.
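Filtering tape locations early could be as simple as dropping the tape-flavored PNNs from the replica list. This is only a sketch under the assumption that `*_MSS` and `*_Buffer` endpoints are the tape/staging ones; the real fix is tracked in #1153:

```python
def drop_tape_locations(pnns):
    # Keep only disk endpoints: treat *_MSS and *_Buffer PNNs as tape.
    return [p for p in pnns if not p.endswith(('_MSS', '_Buffer'))]
```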
The b/w list is done; for the others it is better to open new tickets.
remaining work is now tracked in #1154 and #1155
Original Savannah ticket 102570 reported by None on Thu Sep 12 07:59:34 2013.
This seems to be the plan for CMS at this point. Possibly Crab3 does this already; WMA wants to go there. It will be the only way to e.g. do analysis at T1_IT_CNAF as long as they have some SE name for disk and non-disk data. But it will take some work...
A few other features/fixes depend on this for a clean resolution.