Closed belforte closed 2 years ago
To be clear: I have submitted requests to migrate 7 blocks in that AOD dataset, one at a time. Eventually those blocks got all migrated but in a few cases I got back the Id for the parent RAW block. Most cases where OK.
Stefano, I think it is an issue with your codebase. In another ticket you provided me your code and here is what you do:
if isinstance(result, list):
result=result[0]
status = result['migration_details']['migration_status']
In other words, you pickup first item from the return list which (since we migrate first parents) returns you a RAW dataset migration requests.
When you submit a request which in turns needs to migrate other datasets then you will get list of all migration requests which will include your dataset and all its parents. For instance, if I use curl for your dataset here is an actual output:
[
{
"migration_details": {
"migration_request_id": 1,
"migration_url": "https://cmsweb.cern.ch/dbs/prod/global/DBSReader",
"migration_input": "/PAMinimumBias1/PARun2016C-v1/RAW#78fd15b2-b227-11e6-91fa-001e67abf228",
"migration_status": 0,
"create_by": "DBS-workflow",
"creation_date": 1658236605,
"last_modified_by": "DBS-workflow",
"last_modification_date": 1658236605,
"retry_count": 0
},
"migration_report": "Migration request is started",
"status": "PENDING",
"error": null
},
{
"migration_details": {
"migration_request_id": 2,
"migration_url": "https://cmsweb.cern.ch/dbs/prod/global/DBSReader",
"migration_input": "/PAMinimumBias1/PARun2016C-v1/RAW#ea0d89b4-b2d4-11e6-91fa-001e67abf228",
"migration_status": 0,
"create_by": "DBS-workflow",
"creation_date": 1658236605,
"last_modified_by": "DBS-workflow",
"last_modification_date": 1658236605,
"retry_count": 0
},
"migration_report": "Migration request is started",
"status": "PENDING",
"error": null
},
{
"migration_details": {
"migration_request_id": 3,
"migration_url": "https://cmsweb.cern.ch/dbs/prod/global/DBSReader",
"migration_input": "/PAMinimumBias1/PARun2016C-v1/RAW#54690bec-b2f5-11e6-91fa-001e67abf228",
"migration_status": 0,
"create_by": "DBS-workflow",
"creation_date": 1658236605,
"last_modified_by": "DBS-workflow",
"last_modification_date": 1658236605,
"retry_count": 0
},
"migration_report": "Migration request is started",
"status": "PENDING",
"error": null
},
{
"migration_details": {
"migration_request_id": 4,
"migration_url": "https://cmsweb.cern.ch/dbs/prod/global/DBSReader",
"migration_input": "/PAMinimumBias1/PARun2016C-v1/RAW#650f2d5e-b30f-11e6-91fa-001e67abf228",
"migration_status": 0,
"create_by": "DBS-workflow",
"creation_date": 1658236605,
"last_modified_by": "DBS-workflow",
"last_modification_date": 1658236605,
"retry_count": 0
},
"migration_report": "Migration request is started",
"status": "PENDING",
"error": null
},
{
"migration_details": {
"migration_request_id": 5,
"migration_url": "https://cmsweb.cern.ch/dbs/prod/global/DBSReader",
"migration_input": "/PAMinimumBias1/PARun2016C-PromptReco-v1/AOD#ae9a22f0-b51a-11e6-91fa-001e67abf228",
"migration_status": 0,
"create_by": "DBS-workflow",
"creation_date": 1658236605,
"last_modified_by": "DBS-workflow",
"last_modification_date": 1658236605,
"retry_count": 0
},
"migration_report": "Migration request is started",
"status": "PENDING",
"error": null
}
]
Please double check your code and close this ticket since I don't see anything wrong with DBS server, and yes, the first item in return list is RAW block migration request.
I beg to disagree. I am aware of the fact that the server can return a long list here (and I do not like that, but that's another discussion which we still need to have), but in this case I verified that the list was one element long. As I noted, simpy repeating same call on different blocks of the dataset at times I got one result at times another.
That said, it was late night and mistakes get made. I will double check. But I also think that you should use cmsweb-testbed too in your testing. Behavior could be different if nothing else because testbed is still also getting some migration load from CRAB.
BTW.. yeha.. this case aside, I was naively expecting the first element in the list to be the block I asked for. I "discovered" that the output can be a very long list, but this is a big change from previous python-based server which you never mentioned and never documented ! I do not like having to look for the migrationId to pick among all that noise. Why do you tell me ? Why should I care ? I ask migration server to migrate one block, I only care to know when that block is available. If you need to migrate tons of parents, it is your problem, you need to track that in migration server side (that's why we have a migration server) and just tell me when all is done. Do you expect my client to track all those migrations and make sure they all completed and/or issue retries etc. etc. ? Sometimes DB's become unresponsive and things go bad, there will be errors, and retries will be needed, who should do them ? If you say "CRAB", it means CRAB needs to run its own migration server, at that point I may very well lookup ancestors myself and do one parentage layer at a time so I have control of the process, or I may very well decide that this is too much to add to my load and CRAB will simply publish w/o parentage information.
Stefano, I try to address both sides. On a server we need to know (when we debug its issues) what are the entire chain of migration is, while on client we care about initial request. But if migration fail you need both info to debug the issue. As such I think what is missing is change for status
API to provide status about your concrete input. So far the status
API only accepts migration request id. Since single migration request (e.g. a dataset) can cause chain of migration requests on a server, and moreover a dataset migration always resolve into migration of individual blocks, we can't use id as identified of migration. What I think is missing is the check by given migration input. I'll be happy to provide that. Here is how I envision status
API should behave:
Please review and provide me your feedback and I'll adjust the code accordingly.
I checked the code and in fact everything is in place. The status API can do the following:
blk=/GenericTTbar/HC-CMSSW_9_2_6_91X_mcRun1_realistic_v2-v2/GEN-SIM-RAW%23eee377dc-76e2-11e7-a0c8-02163e00d7b3
curl "http://localhost:9898/dbs2go-migrate/status?migration_input=$blk"
[
{"create_by":"DBS-workflow","creation_date":1658242505,"last_modification_date":1658242505,"last_modified_by":"DBS-workflow","migration_input":"/GenericTTbar/HC-CMSSW_9_2_6_91X_mcRun1_realistic_v2-v2/GEN-SIM-RAW#eee377dc-76e2-11e7-a0c8-02163e00d7b3","migration_request_id":7,"migration_status":2,"migration_url":"https://cmsweb-testbed.cern.ch/dbs2go","retry_count":0}
]
I think I only need to update documentation to add last use-case.
So, can you use /status?migration_input=<your_input>
API call to check status of concrete block/dataset? This will be your initial migration input and it will not include all intermediate parents.
I updated documentation examples to show how to use status API with migration_input, e.g. /status?migration_input=<>
OK I can give it a try. Thanks. That should avoid the need for extensive code changes on my side. SOrry for being slow on following up on this. I really need to sped a lot of time with my family atm.
migration_input
or block_name
?
The DBSClient API requires the latter[1] (compatibility with python server, I guess) but it looks that when it passes it to server I get an error about "BLOCK_NAME": invalid identifier
[2]
[2] DBS Server error: [{'error': {'reason': 'DBSError Code:103 Description:DBS DB query error, e.g. mailformed SQL statement Function:dbs.executeAll Message: Error: ORA-00904: "MR"."BLOCK_NAME": invalid identifier\n', 'message': '', 'function': 'dbs.migrate.StatusMigration', 'code': 103}, 'http': {'method': 'GET', 'code': 400, 'timestamp': '2022-07-19 20:44:26.436756879 +0000 UTC m=+469619.048281327', 'path': '/dbs/int/phys03/DBSMigrate/status?block_name=/PAMinimumBias1/PARun2016C-PromptReco-v1/AOD#e680ed20-b30d-11e6-91fa-001e67abf228', 'user_agent': 'DBSClient/Unknown/', 'x_forwarded_host': 'cmsweb-testbed.cern.ch:8443', 'x_forwarded_for': '188.185.15.220', 'remote_addr': '188.185.121.219:44971'}, 'exception': 400, 'type': 'HTTPError', 'message': 'DBSError Code:103 Description:DBS DB query error, e.g. mailformed SQL statement Function:dbs.migrate.StatusMigration Message: Error: nested DBSError Code:103 Description:DBS DB query error, e.g. mailformed SQL statement Function:dbs.executeAll Message: Error: ORA-00904: "MR"."BLOCK_NAME": invalid identifier\n'}]
I was testing migration_input, and it works. But I'll check later block_name too. I think migration_input is better since it can be used for block and datasets values though.
i think migration_input works if sent via curl but python DBClient only accepts block_name now. I agree that being more general helps, but if a change in DBSClient is not required I can proceed more rapidly, since for DBS python client I need a new spec file in common COMP build: https://github.com/cms-sw/cmsdist/blob/comp_gcc630/py3-dbs3-client.spec
I'll make block_name working tomorrow. But in order to deploy new server I need schema change since I added parallelization and I'm awaiting this task to be done from DBA, therefore deployment of new DBS server is pending until schema will be altered.
no problem, I have a couple other pressing things to do
Stefano, I fixed status API to use block_name argument and deploy new version to testbed, e.g. you can do now https://cmsweb-testbed.cern.ch/..../status?block_name=<block>
. We also clarified the current output which returns all intermediate parents for migration input, and final element in this list represent your initial migration input. Please let me know if anything needs to be done here, otherwise let's close this issue.
I also updated apis.md and MigrationServer.md documentation about status parameters, the former lists what is allowed, the later provide concrete examples.
thanks ! I still have to write new code, sorry. Am a snail trying to keep up with your cheetah speed. But let\s close. If I find problems, I will open new issues. I need to write a paper review tomorrow...
I submitted a migration request for one block of an AOD dataset I wanted to test a dataset high in the chain with liimited parentage (RAW only).
So I asked for: {'migration_url': 'https://cmsweb.cern.ch/dbs/prod/global/DBSReader', 'migration_input': '/PAMinimumBias1/PARun2016C-PromptReco-v1/AOD#ae9a22f0-b51a-11e6-91fa-001e67abf228'}
and got the following result which refers to at block in the RAW dataset !!