nils-braun / b2luigi

Task scheduling and batch running for basf2 jobs made simple
GNU General Public License v3.0

`get_dirac_user()` fails due to empty `get_proxy_info()` #138

Closed meliache closed 3 years ago

meliache commented 3 years ago

Jake Bennet tried using b2luigi with gbasf2 for the systematic corrections framework and reported the following error:

Traceback (most recent call last):
  File "/home/belle2/jbennett/.local/lib/python3.6/site-packages/b2luigi/batch/processes/gbasf2.py", line 947, in get_dirac_user
    return get_proxy_info()["username"]
KeyError: 'username'

It seems that get_proxy_info() returns an empty dictionary, which happens when there is a CalledProcessError. I'm not sure whether that's because the proxy hasn't been initialized before get_dirac_user() is called or due to some other subprocess error. Gbasf2 and basf2 by themselves supposedly work for Jake, and he tried initializing the proxy in the terminal.

I would be super happy if @philiptgrace, who contributed the code, could also help or comment.

One suggestion from me for making debugging easier in the future is not to catch the CalledProcessError (and return {} instead) inside the get_proxy_info() function. In cases where we want to ignore the error, it might be better to ignore it via try...except only in those places. For example, in get_dirac_user it would have been useful to see the CalledProcessError.
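A minimal sketch of that suggestion, not the actual b2luigi code: `run_info_command` and `info_or_empty` are hypothetical names, and the `key: value` output format is an assumption made for illustration. The parsing function lets the CalledProcessError propagate, and only the caller that really wants an empty result catches it.

```python
import subprocess


def run_info_command(cmd):
    """Hypothetical stand-in for get_proxy_info(): run cmd and parse "key: value" lines.

    A failing command raises subprocess.CalledProcessError instead of
    silently producing an empty dictionary.
    """
    output = subprocess.run(
        cmd, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, encoding="utf-8"
    ).stdout
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a "key: value" shape
            info[key.strip()] = value.strip()
    return info


def info_or_empty(cmd):
    """Caller that explicitly opts into ignoring the error."""
    try:
        return run_info_command(cmd)
    except subprocess.CalledProcessError:
        return {}
```

With this split, a function like get_dirac_user would surface the subprocess error with its exit code and stderr rather than a KeyError on a missing "username" entry.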

I just asked Jake to try

from b2luigi.batch.processes.gbasf2 import get_proxy_info, setup_dirac_proxy
setup_dirac_proxy()
print(get_proxy_info())

and am currently waiting for feedback. He already tried the above without the setup_dirac_proxy() call and got an empty dictionary.

meliache commented 3 years ago

He has now messaged me that when calling setup_dirac_proxy, he gets

>>> setup_dirac_proxy()
Generating proxy...
Can't find user certificate and key
Error: Operation not permitted ( 1 : )

meliache commented 3 years ago

I found the cause. It has nothing to do with any code from @philiptgrace, sorry for mentioning you. I asked Jake to show me the output of get_gbasf2_env() and it contained

'HOME': '/ext/home/ueda'

b2luigi gets the gbasf2 environment by sourcing the setup script from a bash shell with all environment variables initially unset and then printing the resulting set of environment variables, via the equivalent of the bash command

env -i bash -c 'source /cvmfs/belle.kek.jp/grid/gbasf2/pro/tools/setup > /dev/null && env'
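For illustration, the same idea can be sketched in Python. This is not the b2luigi implementation; `capture_env_after_sourcing` is a hypothetical helper, and the simple line-based parsing assumes that no variable value contains a newline.

```python
import shlex
import subprocess


def capture_env_after_sourcing(setup_script):
    """Source setup_script in an otherwise empty bash environment and
    return the resulting environment variables as a dictionary."""
    command = f"source {shlex.quote(setup_script)} > /dev/null && env"
    output = subprocess.run(
        ["env", "-i", "bash", "-c", command],
        check=True,
        stdout=subprocess.PIPE,
        encoding="utf-8",
    ).stdout
    env = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # assumption: every variable fits on a single KEY=VALUE line
            env[key] = value
    return env
```

Because the shell starts from `env -i`, anything the setup script derives from variables like HOME is computed without the caller's values.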

It turns out that the setup script sets HOME to /ext/home/ueda if it is initially unset. On NAF this directory doesn't exist, so it is ignored and we don't get problems. On KEKCC, however, the gb2_proxy_init subprocess then tries to get the grid certificates from there and fails due to missing access rights.

A quick fix is to add the following line after line 930 in gbasf2.py:

    gbasf2_env["HOME"] = os.getenv("HOME")  # use original home, which gbasf2 sometimes overwrites
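
The same fix as a standalone sketch, with a guard for the case where HOME is not set in the calling process either. `with_original_home` is a hypothetical helper name, not part of b2luigi.

```python
import os


def with_original_home(gbasf2_env, original_home=None):
    """Return a copy of gbasf2_env whose HOME is the caller's home directory."""
    if original_home is None:
        original_home = os.getenv("HOME")
    fixed = dict(gbasf2_env)
    if original_home is not None:  # leave HOME untouched if the caller has none
        fixed["HOME"] = original_home
    return fixed
```

For example, `with_original_home({"HOME": "/ext/home/ueda"}, "/home/belle2/jbennett")` returns a dictionary with HOME set to `/home/belle2/jbennett`.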