dmwm / CRABClient

CRAB Client does not work on SL6 #4054

Closed bbockelm closed 10 years ago

bbockelm commented 10 years ago

The new voms-proxy-init causes issues with the CRAB3 client on SL6. I believe @belforte ran into the issue with CRAB2 and may know the solution.

belforte commented 10 years ago

Which issues exactly? If it is the nssutil message, it is not working when people use CMSSW_6_x with a non-production scram arch. The latest scram arch for CMSSW_7_x is OK instead.

bbockelm commented 10 years ago

This was with 7_x; voms-proxy-init hung indefinitely when it was invoked by CRAB3.

There also appears to be an issue with voms-proxy-info no longer accepting user certs. MyProxy delegation currently tries to do this.

belforte commented 10 years ago

It is new to me then. I can look at it if given a way to reproduce it.

mmascher commented 10 years ago

I modified getUSerCertEnddate to use openssl and not voms-proxy-info. See https://github.com/dmwm/WMCore/pull/5121
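A minimal sketch of reading the certificate end date with openssl instead of voms-proxy-info. This is not the code from the pull request: the function names and the default certificate path (~/.globus/usercert.pem) are assumptions for illustration. It relies only on `openssl x509 -enddate -noout`, which prints a line of the form `notAfter=May 23 08:10:18 2015 GMT`.

```python
import os
import subprocess
from datetime import datetime


def parse_enddate(openssl_output):
    """Parse the 'notAfter=...' line printed by 'openssl x509 -enddate'."""
    for line in openssl_output.splitlines():
        if line.startswith("notAfter="):
            # openssl pads single-digit days with an extra space; normalize it.
            value = " ".join(line.split("=", 1)[1].split())
            return datetime.strptime(value, "%b %d %H:%M:%S %Y %Z")
    raise ValueError("no notAfter line in openssl output")


def user_cert_enddate(cert_path=None):
    """Return the certificate expiry as a naive datetime (GMT).

    cert_path is a hypothetical parameter; the default location is assumed.
    """
    if cert_path is None:
        cert_path = os.path.expanduser("~/.globus/usercert.pem")
    out = subprocess.check_output(
        ["openssl", "x509", "-enddate", "-noout", "-in", cert_path]
    )
    return parse_enddate(out.decode())
```

Since openssl only reads the certificate file, this avoids depending on what voms-proxy-info accepts as input.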

I could not reproduce the error of voms-proxy-init hanging indefinitely; I will try again tomorrow.

For reference, these are old issues on this subject: https://github.com/dmwm/CRABServer/issues/4197 https://github.com/dmwm/WMCore/issues/4921 https://github.com/dmwm/WMCore/issues/4924

mmascher commented 10 years ago

Another user reported that voms-proxy-init was hanging indefinitely with CMSSW_7_x. See https://hypernews.cern.ch/HyperNews/CMS/get/crabFeedback/7527.html

I am investigating; it seems to be related to the SLC6 Java version of voms-proxy-init in general. I reproduced this with CMSSW_5_3_4 as well, for example. This is what CRAB3 is doing:

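import os
import subprocess
import time

# All three standard streams are piped, so nothing the child prints
# (including a password prompt) reaches the terminal.
proc = subprocess.Popen(
    'voms-proxy-init -voms cms:/cms -valid 24:00 -rfc',
    shell=True,
    cwd=os.environ['PWD'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    stdin=subprocess.PIPE,
)

# Poll once per second until the child exits.
while proc.poll() is None:
    time.sleep(1)
    print("tick")

# Collect whatever the child wrote to the pipes.
stdout, stderr = proc.communicate()
rc = proc.returncode

print(rc, stdout, stderr)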

In particular, it seems voms-proxy-init is hanging there waiting for the password, and for some reason the output is not shown on the screen.

belforte commented 10 years ago

It is not impossible that the way of reading the password has changed in the move from C to Java and is not correctly handled by Python redirections or whatever.

But I am still of the same opinion as I was 10 years ago: Crab should check that the proxy is good and, if not, complain and bump it back to the user to do the voms-proxy-init thing. If needed, we can print the command to execute for easy copy/pasting, but this would make it clear and clean whether problems are in crab or voms, and make support life easier.

If Crab can't create my proxy (like now), I bug crab developers. If voms-proxy-* fails, I bug someone else. We could even give users a script to create the proxy, to run every day after login. But it must be something different from "crab". That alone makes A LOT of difference IMO.
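A minimal sketch of the "check and bump it back to the user" approach described above, assuming `voms-proxy-info -timeleft` prints the remaining lifetime in seconds. The function names, the one-hour threshold, and the injectable `run` parameter are hypothetical, chosen only for illustration; this is not code from CRAB.

```python
import subprocess

# Command shown to the user for copy/pasting (same options as used in this thread).
PROXY_COMMAND = "voms-proxy-init -voms cms:/cms -valid 24:00 -rfc"


def proxy_seconds_left(run=subprocess.check_output):
    """Return the remaining proxy lifetime in seconds, or 0 if unusable.

    'run' is injectable so the logic can be exercised without a grid setup.
    """
    try:
        out = run(["voms-proxy-info", "-timeleft"], stderr=subprocess.STDOUT)
        # Use the last line in case warnings precede the number.
        return max(int(out.decode().strip().splitlines()[-1]), 0)
    except (OSError, subprocess.CalledProcessError, ValueError, IndexError):
        # Missing binary, no proxy, or unexpected output: treat as no proxy.
        return 0


def require_proxy(min_seconds=3600, seconds_left=None):
    """Return True if the proxy is good; otherwise tell the user what to run."""
    if seconds_left is None:
        seconds_left = proxy_seconds_left()
    if seconds_left >= min_seconds:
        return True
    print("No valid proxy found. Please run:\n    " + PROXY_COMMAND)
    return False
```

With this split, a failure of the printed command is clearly a voms problem, while a failure of the check itself is a problem in the client.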

Besides, I never liked tools that try to do everything.

mmascher commented 10 years ago

Yes, I agree. I think that the C version of voms-proxy-init was doing something fancy with stdout and stdin. Indeed, if I redirect stdout and stderr to a file, something is still printed on the screen (and that's what CRAB3 is showing):

[lxplus411] /afs/cern.ch/user/m/mmascher/wf > voms-proxy-init > out 2> err
Enter GRID pass phrase: [I typed my pwd]
[lxplus411] /afs/cern.ch/user/m/mmascher/wf > cat out err
Your identity: /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=mmascher/CN=720897/CN=Marco Mascheroni
Creating proxy  Done
Your proxy is valid until Fri May 23 08:10:18 2014

..........................................................................................................................................................

That's why the Python redirections worked, even though stdout, stderr, and stdin were supposed to be handled by the Python code with communicate. In other words, the current code is saying: ignore the std* streams, I will provide them to you (via communicate). I am not sure if we were just lucky that it worked until now, or if it was supposed to be this way.
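One way a program can keep prompting even when stdout and stderr are redirected is to open the controlling terminal directly. Whether the C voms-proxy-init did exactly this is an assumption, not something established in this thread; the sketch below only illustrates the mechanism (Python's own getpass.getpass prefers /dev/tty on Unix in the same way). The function names and the `tty_path` parameter are hypothetical.

```python
import sys


def open_prompt_stream(tty_path="/dev/tty"):
    """Return (stream, is_tty): the terminal device if it can be opened,
    otherwise sys.stderr as a fallback."""
    try:
        return open(tty_path, "w"), True
    except OSError:
        return sys.stderr, False


def show_prompt(text, tty_path="/dev/tty"):
    """Write a prompt to the terminal device, bypassing stdout/stderr.

    Returns True if the terminal device was used, False on fallback.
    """
    stream, is_tty = open_prompt_stream(tty_path)
    stream.write(text)
    stream.flush()
    if is_tty:
        stream.close()
    return is_tty
```

A prompt written this way is unaffected by `> out 2> err` or by pipes set up by a parent process, which would match the behavior seen in the transcript above.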

Anyway, the Java version is doing what I would expect from any command: not printing anything if I redirect stdout and stderr:

[mmascher@lxplus0138 wf]$ voms-proxy-init > out 2> err
[I typed my pwd (echoed on the screen)]
[mmascher@lxplus0138 wf]$ cat out err
Enter GRID pass phrase for this identity:
Created proxy in /tmp/x509up_u8440.

Your proxy is valid until Fri May 23 08:16:12 CEST 2014

BTW, how is CRAB2 calling voms-proxy-init? Can you point me to the code, Stefano, please?

About the suggestion to "just check if the proxy is good and return if that's not the case": in general I agree with you, it would make life easier for us. And that's a general rule: more work for the user means less work for developers/operators. We need a tradeoff, and in this situation my tradeoff is that it is probably easier to solve this bug than to rewrite the code so that it just checks for the proxy :)

Anyway, that's the patch: https://github.com/dmwm/WMCore/pull/5149 Let's open another ticket if we really want to let the user run the commands (I have no strong opinion about that)

mmascher commented 10 years ago

Ah, the script I posted in the previous message is an extract from the CRAB3 code that can be executed standalone if you really want to :)

belforte commented 10 years ago

Indeed, I always wondered how Crab2 could use voms-proxy-init like that... thanks for the insight. I can't say if it was skill or luck that made it work initially. Anyhow, the code is here: https://github.com/dmwm/ProdCommon/blob/master/src/python/ProdCommon/Credential/Proxy.py#L197

I agree that what you did was the fastest for now. OTOH there is still time to improve for the future. In the end I think we lost more than we gained by hiding proxy creation under our hood.

mmascher commented 10 years ago

OK, I see. CRAB2 uses os.system for commands that interact with the user, like voms-proxy-init and myproxy-init (the command inherits the parent's std*), and it uses Popen and select for other commands where the stdout needs to be parsed; see https://github.com/dmwm/ProdCommon/blob/master/src/python/ProdCommon/BossLite/Common/System.py#L35.

With the patch I provided we will do the same (with a different, newer Python API).
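A minimal sketch of the two calling styles described above, using the subprocess module. This is not the actual patch; the helper names `run_interactive` and `run_captured` are hypothetical and only illustrate the split between interactive commands and commands whose output is parsed.

```python
import subprocess
import sys


def run_interactive(cmd):
    """Run a command that talks to the user.

    No stdin/stdout/stderr arguments are passed, so the child inherits the
    parent's streams and any prompt reaches the terminal.
    """
    return subprocess.call(cmd)


def run_captured(cmd):
    """Run a non-interactive command and return (returncode, stdout, stderr)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return proc.returncode, out.decode(), err.decode()


if __name__ == "__main__":
    # Stand-in for a non-interactive command whose output needs parsing.
    print(run_captured([sys.executable, "-c", "print('hello')"]))
```

An interactive command such as voms-proxy-init would go through `run_interactive(["voms-proxy-init", "-voms", "cms:/cms", "-valid", "24:00", "-rfc"])`, so the password prompt and the typed reply use the user's terminal rather than pipes.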

Closing the issue as I think everything is figured out.