dmwm / das2go

Go implementation of Data Aggregation System (DAS) for CMS experiment
MIT License

missing reqmgr info and relval dataset configuration #35

Closed slava77 closed 2 years ago

slava77 commented 3 years ago

as an example for config dataset=/RelValTTbar_14TeV/CMSSW_12_1_0_pre4-PU_121X_mcRun3_2021_realistic_v10_HighStat-v2/MINIAODSIM I get https://cmsweb.cern.ch/das/request?view=list&limit=50&instance=prod%2Fglobal&input=config+dataset%3D%2FRelValTTbar_14TeV%2FCMSSW_12_1_0_pre4-PU_121X_mcRun3_2021_realistic_v10_HighStat-v2%2FMINIAODSIM

image

Under the Config Name link there is a generic https://cmsweb.cern.ch/das/request?input=config%3DReqMgr2&instance=prod/global . I would expect it to return the actual reqmgr request pdmvserv_RVCMSSW_12_1_0_pre4TTbar_14TeV__HighStat_211020_131422_5228 at https://cmsweb.cern.ch/reqmgr2/fetch?rid=pdmvserv_RVCMSSW_12_1_0_pre4TTbar_14TeV__HighStat_211020_131422_5228

The following part, at Config urls: correctly picks up only one config out of 4 available in the request, as can be seen in the reqmgr2 link above

Config Cache List

    DQMConfigCacheID: 46713bf726160ce248142d29719c1878
    Task1: DigiPU_2021PU: ConfigCacheID: 46713bf726160ce248142d29719b6f22
    Task2: RecoPU_2021PU: ConfigCacheID: 46713bf726160ce248142d29719be2df
    Task3: Nano_2021PU: ConfigCacheID: 46713bf726160ce248142d29719c2eff

vkuznet commented 3 years ago

Slava, it is a good observation, but it is not trivial to solve. Let me explain. The actual query config dataset=XXX translates into two calls to the reqmgr2 service:

# one for input dataset
https://cmsweb.cern.ch/reqmgr2/data/request?inputdataset=/RelValTTbar_14TeV/CMSSW_12_1_0_pre4-PU_121X_mcRun3_2021_realistic_v10_HighStat-v2/MINIAODSIM

# one for output dataset
https://cmsweb.cern.ch/reqmgr2/data/request?outputdataset=/RelValTTbar_14TeV/CMSSW_12_1_0_pre4-PU_121X_mcRun3_2021_realistic_v10_HighStat-v2/MINIAODSIM

DAS parses generic JSON; check the return of the outputdataset call. Since the JSON has no fixed schema, I was only told at some point to look for ConfigCacheID matches in keys. Since there is no schema, i.e. all attributes are assigned based on the dynamic nature of the workflow and their keys are not static, I can't apply a generic algorithm to parse it. For instance, since I don't know a priori whether a Task key will appear in a document (and most likely they were added at a later stage), I don't know where to look for configs. This is a very serious issue in the WMCore code, which does not provide a static schema for its documents, and parsing the docs becomes a very complicated problem. I can extract the actual RequestName and construct the appropriate link, but I have no idea whether Task attributes will be present in a document, how many a document will have, and which naming convention they will use. So far tasks are named Task1, Task2, etc., but no one told me that they exist (since there is no publicly available schema). I can add new code which will look for string matches of Task attributes using a regexp, but it does not guarantee that I won't miss something when a new config section is added somewhere else in a document. I hope you understand the point.

Anyway, now that I actually see the new structure of the reqmgr output I can try to adjust the code to parse it, but this structure may still evolve, and if no schema is published we'll have a similar discussion again in the future.
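The key-matching approach described above can be illustrated with a minimal Go sketch. It is not the das2go implementation: the function names `collectConfigIDs` and `ConfigIDs`, the regular expression, and the sample document shape are assumptions made for illustration only. The sketch walks an arbitrarily nested decoded JSON value and collects every string stored under a key ending in ConfigCacheID, wherever that key appears.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"sort"
)

// cacheKey matches any key ending in ConfigCacheID, e.g. DQMConfigCacheID.
// The pattern is an assumption, not taken from das2go.
var cacheKey = regexp.MustCompile(`ConfigCacheID$`)

// collectConfigIDs walks a decoded JSON value of unknown shape and records
// every string value stored under a key that matches cacheKey.
func collectConfigIDs(v interface{}, found map[string]bool) {
	switch t := v.(type) {
	case map[string]interface{}:
		for k, val := range t {
			if s, ok := val.(string); ok && cacheKey.MatchString(k) {
				found[s] = true
				continue
			}
			collectConfigIDs(val, found)
		}
	case []interface{}:
		for _, val := range t {
			collectConfigIDs(val, found)
		}
	}
}

// ConfigIDs decodes a JSON document and returns the sorted, de-duplicated IDs.
func ConfigIDs(data []byte) ([]string, error) {
	var doc interface{}
	if err := json.Unmarshal(data, &doc); err != nil {
		return nil, err
	}
	found := make(map[string]bool)
	collectConfigIDs(doc, found)
	ids := make([]string, 0, len(found))
	for id := range found {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids, nil
}

func main() {
	// Hypothetical document, loosely modelled on the Config Cache List above.
	doc := []byte(`{
	  "DQMConfigCacheID": "46713bf726160ce248142d29719c1878",
	  "Task1": {"TaskName": "DigiPU_2021PU", "ConfigCacheID": "46713bf726160ce248142d29719b6f22"}
	}`)
	ids, err := ConfigIDs(doc)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(ids)
}
```

A recursive walk like this finds IDs at any depth, but it still cannot say what each ID means, which is the schema concern raised in the comment.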

slava77 commented 3 years ago

I can extract the actual RequestName and construct the appropriate link

this would be great to have.

I think that the second part requested in my issue description (about the full set of config links) is less essential. Still, would it make sense to look for ConfigCacheID in all elements and construct the result based on that?

vkuznet commented 3 years ago

Slava, please have a look at cmsweb-testbed, e.g.

https://cmsweb-testbed.cern.ch/das/request?view=list&limit=50&instance=int%2Fglobal&input=config+dataset%3D%2FRelValTTbar_14TeV%2FCMSSW_12_1_0_pre4-PU_121X_mcRun3_2021_realistic_v10_HighStat-v2%2FMINIAODSIM

Now it returns the proper request config name, and it also provides a link to ReqMgr info. For now the link points to cmsweb-testbed, so you'll get a 500, but the link is auto-generated from the deployment cluster and it will work fine on cmsweb (you may check that by removing -testbed from it).

Let me know. I plan to rename Config name to Request name since it better represents the value. After some tests and your OK, I'll put it in production.

slava77 commented 3 years ago

The ReqMgr info link points to https://cmsweb-testbed.cern.ch/reqmgr2/fetch?rid=pdmvserv_RVCMSSW_12_1_0_pre4TTbar_14TeV__HighStat_211020_131422_5228 which does not exist. I'm not sure if this is just a peculiarity of using cmsweb-testbed; the link should have cmsweb.cern.ch.

The link for Config urls: output-config-0 points to cmsweb.cern.ch correctly.

Instead of (or in addition to?) the plaintext Request ids: 46713bf726160ce248142d29719c1878, 46713bf726160ce248142d29719c2eff, 46713bf726160ce248142d29719b6f22, 46713bf726160ce248142d29719be2df I'd rather see links to config urls like the already present https://cmsweb.cern.ch:8443/couchdb/reqmgr_config_cache/46713bf726160ce248142d29719c1878/configFile

vkuznet commented 3 years ago

Slava, I made the necessary changes and deployed a new version to the production server. Now you can see the results directly on cmsweb.cern.ch, see

https://cmsweb.cern.ch/das/request?view=list&limit=50&instance=prod%2Fglobal&input=config+dataset%3D%2FRelValTTbar_14TeV%2FCMSSW_12_1_0_pre4-PU_121X_mcRun3_2021_realistic_v10_HighStat-v2%2FMINIAODSIM

Please confirm that everything works.

slava77 commented 3 years ago

Valentin, thank you for the update. The links are working for me and the data is present.

I'd still be interested to see the cacheIDs decoded to more human readable strings. Looking at https://cmsweb.cern.ch/reqmgr2/data/request?outputdataset=/RelValTTbar_14TeV/CMSSW_12_1_0_pre4-PU_121X_mcRun3_2021_realistic_v10_HighStat-v2/MINIAODSIM I think that

vkuznet commented 3 years ago

Slava, as I wrote, since there is no fixed schema I can't be sure which names to pick and apply. For instance, you provide the output of outputdataset, but I also scan inputdataset. Are their schemas the same? I doubt it. The DQMConfigCacheID is part of a string I scan using a regex to match ConfigCacheID. The question is then what it will be called if it is not a DQM config. The structure of the task parts of the dictionary seems to follow some schema and I can extract TaskName, but once again I don't know if it is persistent across all config files for all the different datasets we produce/consume. I think it is a general question for the DMWM team; please discuss this further with @amaltaro , @todor-ivanov , and the rest of the WMCore team.

Bottom line: until we have a fixed and documented schema for all these docs, I have no idea how to correctly write the code to extract attributes that are unknown to me.

slava77 commented 3 years ago

OK, fair enough, I hope that the situation with the schema is understood and a good readable choice of config names in the config links will be made.

Thank you for the updates already made.

vkuznet commented 2 years ago

@amaltaro this request requires your review, especially since I don't know if the configuration files follow a specific schema. Here it is very important that documents have a static schema, and it is equally important to know up front how certain names are created and where they will appear in a document. DAS always has this issue with different systems which do not provide a static schema. Please review and provide your feedback on how certain configurations should be looked up in the configuration docs, and provide schema definitions for these configuration files.

amaltaro commented 2 years ago

I am not sure I understand what is requested here.

WMCore/ReqMgr2 does define a request schema, the data types, how they are supposed to be constructed, default values, etc. However, we support multiple workflow/spec types, and each of them has its own peculiarities (also with a schema defined and enforced). In other words, there are key/value pairs that you will only find in TaskChain, others will only be available in StepChains, and ReReco will also be different. Please have a look at this documentation for further information: https://github.com/dmwm/WMCore/wiki/Workflow-creation-and-assignment-definition#request-type-dependencies

ConfigCacheID or DQMConfigCacheID is meant to be a unique hash id, which points to a document in central CouchDB. Hence ReqMgr2 provides that id instead of the task/step to which it belongs. It should be fairly easy to rename it in DAS to something like:

if RequestType == ReReco:
    rename config cache id to DataProcessing
elif RequestType == TaskChain:
    rename config cache id to the value in TaskName (note that you need to look at the right task dict)
elif RequestType == StepChain:
    rename config cache id to the value in StepName (note that you need to look at the right step dict)
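The pseudocode above could be sketched in Go roughly as follows. This is only an illustration under stated assumptions: the function names `labelConfigs` and `labelChain` are hypothetical, and the document layout (a top-level ConfigCacheID for ReReco, and Task1, Task2, ... or Step1, Step2, ... dictionaries each holding ConfigCacheID together with TaskName or StepName) is assumed from this thread rather than taken from the WMCore schema.

```go
package main

import (
	"fmt"
	"strings"
)

// labelConfigs maps each config cache ID found in a decoded request document
// to a readable label, following the RequestType rules in the pseudocode.
func labelConfigs(req map[string]interface{}) map[string]string {
	labels := make(map[string]string)
	reqType, _ := req["RequestType"].(string)
	switch reqType {
	case "ReReco":
		// Assumption: ReReco keeps a single top-level ConfigCacheID.
		if id, ok := req["ConfigCacheID"].(string); ok {
			labels[id] = "DataProcessing"
		}
	case "TaskChain":
		labelChain(req, "Task", "TaskName", labels)
	case "StepChain":
		labelChain(req, "Step", "StepName", labels)
	}
	return labels
}

// labelChain scans keys such as Task1, Task2 (or Step1, Step2) whose values
// are dictionaries and labels each ConfigCacheID with the nameKey value.
// Keys with the same prefix but non-dictionary values are skipped.
func labelChain(req map[string]interface{}, prefix, nameKey string, labels map[string]string) {
	for key, val := range req {
		if !strings.HasPrefix(key, prefix) {
			continue
		}
		sub, ok := val.(map[string]interface{})
		if !ok {
			continue
		}
		id, okID := sub["ConfigCacheID"].(string)
		name, okName := sub[nameKey].(string)
		if okID && okName {
			labels[id] = name
		}
	}
}

func main() {
	// Hypothetical TaskChain fragment using the IDs quoted in this issue.
	req := map[string]interface{}{
		"RequestType": "TaskChain",
		"Task1": map[string]interface{}{
			"TaskName":      "DigiPU_2021PU",
			"ConfigCacheID": "46713bf726160ce248142d29719b6f22",
		},
		"Task2": map[string]interface{}{
			"TaskName":      "RecoPU_2021PU",
			"ConfigCacheID": "46713bf726160ce248142d29719be2df",
		},
	}
	for id, name := range labelConfigs(req) {
		fmt.Printf("%s (%s)\n", id, name)
	}
}
```

The sketch deliberately leaves DQMConfigCacheID unlabelled, since the pseudocode does not say which name it should receive.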

vkuznet commented 2 years ago

@amaltaro thanks for providing this info, this is what is required. In this case, where can I find schema definitions for each individual workflow? Does the diagram you showed represent all possible workflows in the system, or are there others? Are the naming conventions fixed, i.e. use of CamelCase, like ConfigCacheID or DQMConfigCacheID? Where can I find all declared attributes of the schema? Does this area https://github.com/dmwm/WMCore/tree/master/doc/createSpecs represent all existing schema files? Once I have all the answers and schema definitions I can proceed with the implementation.

vkuznet commented 2 years ago

@slava77 , I deployed a new version on the production cluster which now shows IDs together with the corresponding task/request name. It is shown like this:

Config urls: output-config-0 Request ids: 46713bf726160ce248142d29719be2df (RecoPU_2021PU), 46713bf726160ce248142d29719c2eff (Nano_2021PU), 46713bf726160ce248142d29719b6f22 (DigiPU_2021PU), 46713bf726160ce248142d29719c1878 (DQMConfigCacheID)

You may check your URL. Please confirm that now everything works and I can close the ticket.

slava77 commented 2 years ago

looking at https://cmsweb.cern.ch/das/request?view=list&limit=50&instance=prod%2Fglobal&input=config+dataset%3D%2FRelValTTbar_14TeV%2FCMSSW_12_1_0_pre4-PU_121X_mcRun3_2021_realistic_v10_HighStat-v2%2FMINIAODSIM

I see

image

This is nice.

However, the URLs are apparently malformed: "https://cmsweb.cern.ch/couchdb/reqmgr_config_cache/46713bf726160ce248142d29719be2df%20(RecoPU_2021PU)/configFile" should instead be "https://cmsweb.cern.ch/couchdb/reqmgr_config_cache/46713bf726160ce248142d29719be2df/configFile"

I would drop the hash from the hyperlink text and have just RecoPU_2021PU
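One way to avoid this kind of malformed URL is to keep the cache ID and the display name as separate values and combine them only when rendering the link. The Go sketch below illustrates the idea; `configLink` and the `baseURL` constant are hypothetical names, with the base URL copied from the links quoted in this comment, and the sketch does not reflect how das2go actually renders its pages.

```go
package main

import (
	"fmt"
	"html"
	"net/url"
)

// baseURL is an assumption copied from the URLs quoted in this thread.
const baseURL = "https://cmsweb.cern.ch/couchdb/reqmgr_config_cache"

// configLink returns an HTML anchor whose href contains only the cache ID
// and whose visible text is the task name.
func configLink(id, name string) string {
	href := fmt.Sprintf("%s/%s/configFile", baseURL, url.PathEscape(id))
	return fmt.Sprintf(`<a href="%s">%s</a>`, href, html.EscapeString(name))
}

func main() {
	fmt.Println(configLink("46713bf726160ce248142d29719be2df", "RecoPU_2021PU"))
}
```

Because the name never enters the URL path, a label such as "(RecoPU_2021PU)" cannot leak into the href.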

Another thing: the ReqMgr info link has a URL pointing to https://cmsweb-k8s-prod.cern.ch/reqmgr2 ; is this the standard now or just a test instance?

vkuznet commented 2 years ago

Thanks for spotting the issue with the link; I didn't check it explicitly. Now it is fixed and I removed the extra hashes from the link name. Please check and report again. Regarding the URL for ReqMgr info, internally we do run it on k8s now and the link is correct, but I need to double-check where we generate it to confirm whether we should point it to cmsweb or cmsweb-k8s-prod. I'll do it later.

slava77 commented 2 years ago

@vkuznet Thank you for the update. The updates look good.

One minor thing I noticed is that there is some duplicate information now: in the URL output-config-0 we have the same information as DQMConfigCacheID; note also the difference with :8443 and without it in the two cases, respectively.

vkuznet commented 2 years ago

@slava77 , thanks for checking. I need to decide now whether we need to show Config urls since we have the Request ids info. I don't know yet if the new info covers all output config urls. I need to check with a different set of data. And I'll fix the port number too.

amaltaro commented 2 years ago

https://github.com/dmwm/WMCore/tree/master/doc/createSpecs

@vkuznet Hi Valentin, yes, this is the right place to see the request schema. It is likely missing the recent GPU* parameters though, I will have to update it in the coming days.

Does the diagram you showed represent all possible workflows in the system, or are there others?

yes, StoreResults is planned to be deprecated though. So it's up to you if you want to support old workflows or not (likely less than 10 such workflows every year).

Are the naming conventions fixed, i.e. use of CamelCase, like ConfigCacheID or DQMConfigCacheID?

yes, for spec attributes, we always use upper camel case.

vkuznet commented 2 years ago

@slava77 regarding the cmsweb-k8s-prod link to ReqMgr: it is complicated and was introduced when the DMWM team decided to separate cmsweb into two entities, one for end users and another for production tools. Since DAS uses maps which contain the service URLs, the direct links to services like DBS or ReqMgr now point to cmsweb-k8s-prod, and internally DAS queries services via these URLs. For all other URLs, like another DAS query, all links still point to cmsweb. I don't want to write an additional layer of redirection, and the current links represent correct URLs, i.e. a link points to the production cluster if it refers to a service, and to cmsweb if it uses a DAS URL to access some query.

I think this ticket can be closed, and I still need time to investigate if output-config-XXX links can be removed. Please confirm that we can close this ticket.

slava77 commented 2 years ago

Please confirm that we can close this ticket.

I'm fine to have this closed.