dmwm / cmssh

Interactive shell for CMS experiment
http://cms.cern.ch/

Protect cmssh loader from possible SiteDB outage (originally: does not load) #30

Open jpata opened 12 years ago

jpata commented 12 years ago

I'm getting the following error upon loading cmssh:

Loading external/libtiff ... DONE
Loading external/libungif ... DONE

cmssh+pylab Python environment [backend: Agg].

[TerminalIPythonApp] Error in loading extension: cmssh_extension

Check your config files in /home/joosep/cmssh/soft/.ipython/profile_cmssh

BadStatusLine                             Traceback (most recent call last)
/home/joosep/cmssh/soft/install/lib/python2.6/site-packages/IPython/core/extensions.pyc in load_extension(self, module_str)
     88 if module_str not in sys.modules:
     89 with prepended_to_syspath(self.ipython_extension_dir):
---> 90 __import__(module_str)
     91 mod = sys.modules[module_str]
     92 return self._call_load_ipython_extension(mod)

/home/joosep/cmssh/soft/.ipython/extensions/cmssh_extension.py in <module>()
     25 from cmssh.iprint import PrintManager, print_error, print_warning, print_info
     26 from cmssh.debug import DebugManager
---> 27 from cmssh.cms_cmds import dbs_instance, Magic, cms_find, cms_du
     28 from cmssh.cms_cmds import cms_ls, cms_cp, verbose, cms_dqueue, cmscrab
     29 from cmssh.cms_cmds import cms_rm, cms_rmdir, cms_mkdir, cms_root, cms_xrdcp

/home/joosep/cmssh/soft/cmssh/src/cmssh/cms_cmds.py in <module>()
     21 from cmssh.iprint import msg_red, msg_green, msg_blue
     22 from cmssh.iprint import print_warning, print_error, print_status, print_info
---> 23 from cmssh.filemover import copy_lfn, rm_lfn, mkdir, rmdir, list_se, dqueue
     24 from cmssh.utils import list_results, check_os, unsupported_linux
     25 from cmssh.utils import osparameters, check_voms_proxy, run, user_input

/home/joosep/cmssh/soft/cmssh/src/cmssh/filemover.py in <module>()
     28 from cmssh.utils import PrintProgress, qlxml_parser
     29 from cmssh.url_utils import get_data
---> 30 from cmssh.sitedb import SITEDBMGR
     31
     32 def get_dbs_se(lfn):

/home/joosep/cmssh/soft/cmssh/src/cmssh/sitedb.py in <module>()
     74
     75 # Singleton
---> 76 SITEDBMGR = SiteDBManager()

/home/joosep/cmssh/soft/cmssh/src/cmssh/sitedb.py in __init__(self, url, threshold)
     47 self.timestamp = time.time()
     48 self.threshold = threshold # in sec, default 3 hours
---> 49 self.init()
     50
     51 def init(self):

/home/joosep/cmssh/soft/cmssh/src/cmssh/sitedb.py in init(self)
     54 url = self.url + '/site-names'
     55 names = {}
---> 56 with get_data_and_close(url) as data:
     57 for row in parser(data.read()):
     58 names[row['site_name']] = row['alias']

/opt/software/cms/slc5_amd64_gcc462/external/python/2.6.4/lib/python2.6/contextlib.pyc in __enter__(self)
     14 def __enter__(self):
     15 try:
---> 16 return self.gen.next()
     17 except StopIteration:
     18 raise RuntimeError("generator didn't yield")

/home/joosep/cmssh/soft/cmssh/src/cmssh/url_utils.pyc in get_data_and_close(url, headers)
    161 opener = urllib2.build_opener(handler)
    162 urllib2.install_opener(opener)
--> 163 data = urllib2.urlopen(req)
    164 try:
    165 yield data

/opt/software/cms/slc5_amd64_gcc462/external/python/2.6.4/lib/python2.6/urllib2.pyc in urlopen(url, data, timeout)
    122 if _opener is None:
    123 _opener = build_opener()
--> 124 return _opener.open(url, data, timeout)
    125
    126 def install_opener(opener):

/opt/software/cms/slc5_amd64_gcc462/external/python/2.6.4/lib/python2.6/urllib2.pyc in open(self, fullurl, data, timeout)
    387 req = meth(req)
    388
--> 389 response = self._open(req, data)
    390
    391 # post-process response

/opt/software/cms/slc5_amd64_gcc462/external/python/2.6.4/lib/python2.6/urllib2.pyc in _open(self, req, data)
    405 protocol = req.get_type()
    406 result = self._call_chain(self.handle_open, protocol, protocol +
--> 407 '_open', req)
    408 if result:
    409 return result

/opt/software/cms/slc5_amd64_gcc462/external/python/2.6.4/lib/python2.6/urllib2.pyc in _call_chain(self, chain, kind, meth_name, *args)
    365 func = getattr(handler, meth_name)
    366
--> 367 result = func(*args)
    368 if result is not None:
    369 return result

/home/joosep/cmssh/soft/cmssh/src/cmssh/url_utils.pyc in https_open(self, req)
     41 # a reference to a function which, for all intents and purposes,
     42 # will behave as a constructor
---> 43 return self.do_open(self.get_connection, req)
     44
     45 def get_connection(self, host, timeout=300):

/opt/software/cms/slc5_amd64_gcc462/external/python/2.6.4/lib/python2.6/urllib2.pyc in do_open(self, http_class, req)
   1117 try:
   1118 h.request(req.get_method(), req.get_selector(), req.data, headers)
-> 1119 r = h.getresponse()
   1120 except socket.error, err: # XXX what error?
   1121 raise URLError(err)

/opt/software/cms/slc5_amd64_gcc462/external/python/2.6.4/lib/python2.6/httplib.pyc in getresponse(self)
    972 method=self._method)
    973
--> 974 response.begin()
    975 assert response.will_close != _UNKNOWN
    976 self.__state = _CS_IDLE

/opt/software/cms/slc5_amd64_gcc462/external/python/2.6.4/lib/python2.6/httplib.pyc in begin(self)
    389 # read until we get a non-100 response
    390 while True:
--> 391 version, status, reason = self._read_status()
    392 if status != CONTINUE:
    393 break

/opt/software/cms/slc5_amd64_gcc462/external/python/2.6.4/lib/python2.6/httplib.pyc in _read_status(self)
    353 # Presumably, the server closed the connection before
    354 # sending a valid response.
--> 355 raise BadStatusLine(line)
    356 try:
    357 [version, status, reason] = line.split(None, 2)

BadStatusLine:

vkuznet commented 12 years ago

Could you please provide more details? I need to know your OS and its version (e.g. Mac OS X Lion, Scientific Linux 5.2) and the output of your "uname -a" command (run it from your normal shell). Please also retry a couple more times. From the traceback I can conclude that the problem was with the connection to CMS SiteDB. I'll work on the code and try to protect against SiteDB outages. Thanks, Valentin.

jpata commented 12 years ago

Thanks for the reply. The output of uname is:

Linux phys.hep.kbfi.ee 2.6.18-308.11.1.el5 #1 SMP Tue Jul 10 12:43:34 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux

We're running SLC5. The problem seems to have disappeared after some time and a reinstall; perhaps it was intermittent. I wanted to add that the tool is really quite useful, thanks for developing it.

Regards, Joosep


vkuznet commented 12 years ago

Joosep, thanks for the update. I'm glad you like it. I'll still work on the code to protect against SiteDB outages. It was good that you submitted a ticket that exposed this problem, so I'll fix it anyway to guard against future outages. I'll therefore rename the ticket and keep it open until the code is in place. Valentin.

vkuznet commented 12 years ago

As shown in the original traceback, SiteDB can hiccup at cmssh start-up. To handle this case I need to either inform the end-user or introduce a monitor thread that re-connects to SiteDB once it's back. Investigate both solutions.
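
For discussion, here is a minimal sketch of how the two options could be combined: warn the user at start-up instead of raising, and let a background thread retry until SiteDB answers. It is not the cmssh implementation. The class name GuardedSiteDB, the injected fetch callable, the retry_interval parameter and the sample site name are all hypothetical, and the sketch is written for Python 3 (cmssh ran on Python 2.6 at the time of the traceback).

```python
import sys
import threading


def _default_warn(msg):
    """Print a warning to stderr (stand-in for cmssh's own print_warning)."""
    sys.stderr.write("WARNING: %s\n" % msg)


class GuardedSiteDB(object):
    """Hypothetical wrapper that never raises while being constructed.

    `fetch` is any callable returning a {site_name: alias} dict. In cmssh
    it would wrap the HTTP call that failed with BadStatusLine above.
    """

    def __init__(self, fetch, retry_interval=60.0, warn=None):
        self._fetch = fetch
        self._retry_interval = retry_interval  # seconds between retries
        self._warn = warn or _default_warn
        self._lock = threading.Lock()
        self._names = {}
        self._ready = threading.Event()  # set once data has been loaded
        self._stop = threading.Event()   # set to end the monitor early
        self.last_error = None
        if not self._try_load():
            # Option 1: inform the end-user instead of aborting the loader.
            self._warn("SiteDB is unreachable (%r); site lookups are "
                       "disabled until it is back" % (self.last_error,))
            # Option 2: monitor thread that re-connects in the background.
            monitor = threading.Thread(target=self._monitor)
            monitor.daemon = True  # do not keep the shell alive on exit
            monitor.start()

    def _try_load(self):
        """Call fetch once; return True on success, False on any error."""
        try:
            names = self._fetch()
        except Exception as exc:  # e.g. BadStatusLine, URLError, socket errors
            self.last_error = exc
            return False
        with self._lock:
            self._names = dict(names)
        self._ready.set()
        return True

    def _monitor(self):
        # Event.wait returns True only when stop() was called.
        while not self._stop.wait(self._retry_interval):
            if self._try_load():
                return

    def available(self):
        return self._ready.is_set()

    def wait_ready(self, timeout=None):
        """Block until data is loaded or the timeout expires."""
        return self._ready.wait(timeout)

    def alias(self, site_name):
        """Return the alias for site_name, or None if unknown or not loaded."""
        with self._lock:
            return self._names.get(site_name)

    def stop(self):
        self._stop.set()
```

With something like this, the module-level singleton could be created without letting a network error propagate into the extension import; commands that need site names would check available() and report that SiteDB is down rather than fail at load time.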