dmwm / CRABServer


submit --dryrun fails with http error 400 when submission to TW failed #4912

Open belforte opened 9 years ago

belforte commented 9 years ago

ERROR: RuntimeError: Reading https://cmsweb.cern.ch/crabcache/logfile?name=dry-run-sandbox.tar.gz failed with code 400

Could a more descriptive message be printed for users, possibly including the reason why crab submit failed?

see: https://hypernews.cern.ch/HyperNews/CMS/get/computing-tools/818.html

matz-e commented 8 years ago

@belforte, @mmascher: we can check the status of the last command from within the client and not do the dry-run when the command failed. As seen in the HN message, this would give you a failure mode. In the message, you can also see that the dry-run continues despite the submission failing.

Drawback: When the splitting fails, the dry-run would not run…

I'm trying to think of some other way to solve this that's not too involved…
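A minimal sketch of the check described above, assuming the client can read a task status string and an optional failure reason after submission. The function name `plan_dry_run`, its parameters, and the message texts are hypothetical and not taken from the CRAB client; only the `FAILSUBMIT` status name comes from this thread.

```python
# Status name as mentioned in this thread; the set itself is an assumption.
FAILED_STATUSES = {"FAILSUBMIT"}


def plan_dry_run(task_status, failure_reason=""):
    """Decide whether the dry-run should proceed after submission.

    Returns a (run_dry_run, message) tuple. Both arguments are
    hypothetical stand-ins for whatever the client actually receives.
    """
    if task_status in FAILED_STATUSES:
        reason = failure_reason or "no reason reported"
        # Tell the user why, instead of failing later on the sandbox download.
        return False, "Submission failed, skipping dry-run: " + reason
    return True, "Submission accepted, continuing with dry-run"


if __name__ == "__main__":
    run, message = plan_dry_run("FAILSUBMIT", "splitting returned no jobs")
    print(run, message)
```

As noted in the drawback, a check like this also skips the dry-run when only the splitting failed, because a single failure status cannot tell the two cases apart.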

belforte commented 8 years ago

The cleanest way would be to split the FAILSUBMIT status into SPLITFAIL and SUBMITFAIL. @mmascher would it make sense to do this via some variable that the client can look up to tell the two apart, without asking you to make further changes to the new StateMachine?
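One way the lookup could work, as a rough sketch only: the status stays FAILSUBMIT and a separate variable records which stage failed. The `failed_stage` variable, its values, and `classify_failure` are hypothetical names introduced for illustration; they are not part of CRABServer or the StateMachine.

```python
from enum import Enum


class FailureKind(Enum):
    """The two finer-grained failure kinds proposed in this thread."""
    SPLITFAIL = "SPLITFAIL"
    SUBMITFAIL = "SUBMITFAIL"


# Hypothetical values of the extra variable the client would look up.
STAGE_TO_KIND = {
    "splitting": FailureKind.SPLITFAIL,
    "submission": FailureKind.SUBMITFAIL,
}


def classify_failure(status, failed_stage):
    """Map FAILSUBMIT plus the stage variable to a finer failure kind.

    Returns None when the task did not fail or the stage is unknown,
    so callers can fall back to the generic FAILSUBMIT handling.
    """
    if status != "FAILSUBMIT":
        return None
    return STAGE_TO_KIND.get(failed_stage)


if __name__ == "__main__":
    print(classify_failure("FAILSUBMIT", "splitting"))
```

With something like this the existing status values stay untouched, and monitoring or the client can still distinguish the two cases.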

matz-e commented 8 years ago

As I've mentioned in #4915, it may actually be better to move the dry-run before the splitting, in which case this would be moot. Running the dry-run before then would be good too, IMHO, because I can construct some scenarios where the dry-run does not see enough events to give good estimates.

belforte commented 8 years ago

I agree that moving the dry-run is better, but we should still separate failures in splitting from failures to talk to the schedd, if nothing else for our monitoring and to help ops figure out what's going on.