Kitware / HPCCloud

A Cloud/Web-Based Simulation Environment
https://kitware.github.io/HPCCloud/
Apache License 2.0

Command parsing broken by login message output #623

Open robertsawko opened 6 years ago

robertsawko commented 6 years ago

Hi,

Meanwhile, we stumbled across another issue, and I think this time it is definitely a (minor) problem with the current implementation.

Apart from the MOTD, our clusters also produce messages from Lmod. This is what we see on login:

[MOTD content]
##########################################################################
    group: my_group_name 
    user: my_user_name
    project: my_project
    base path: my_base_path

    Thank you for using Xalt to monitor your applications

Unfortunately, HPCCloud captures this non-MOTD output together with the command output and then raises all sorts of nasty exceptions. Here's an example:

cat /opt/hpccloud/cumulus/cumulus/tasks/job.py
...

output = conn.execute('pwd')
if len(output) < 1:
    raise Exception('Unable to fetch users home directory %s.' % output)

We have shrewdly expanded the exception message to see output and this is what we found:

Exception: Unable to fetch users home directory [u'/gpfs/fairthorpe/local/HCRI016/dre03/cxp90-dre03\n', u'\tgroup: dre03 \n', u'\tuser: cxp90-dre03 \n', u'\tproject: HCRI016 \n', u'\tbase path: /gpfs/panther/local/HCRI016/dre03/cxp90-dre03 \n', u'\tpath to CDS: /gpfs/cds/local/HCRI016/dre03/cxp90-dre03 \n', u'\tThank you for using Xalt to monitor your applications\n', u'\n', u'\tThank you for using Xalt to monitor your applications\n', u'\n', u'-------------------------------------------------------------------------------\n', u'There are messages associated with the following module(s):\n', u'-------------------------------------------------------------------------------\n', u'\n', u'use.panther:\n', u'   Modules for software available on Panther compute cluster. Default stack\n', u'   on this system is IBM XL with Spectrum MPI. We are experimenting with\n', u'   software monitoring using Xalt.\n', u'\n', u'-------------------------------------------------------------------------------\n', u'\n'].

So it seems to me that HPCCloud is assuming no output on login. A silent login would be my preference too, but I think the assumption is a minor inconvenience, as users may sometimes want a hello message printed or, as in the case of one of our clusters, an Lmod message.

Please advise.

cjh1 commented 6 years ago

@robertsawko Yes, HPCCloud is assuming no output on login. For systems that produce output on login, it might be possible to filter out this output. When we are spinning up clusters in the cloud we have control over the login environment, so that is probably why we are making this assumption.
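
A minimal sketch of one way such filtering could work, assuming the remote command can be wrapped in a shell one-liner: print a unique marker line before and after the command, then keep only the lines between the two markers. The names `wrap_command`, `extract_output` and `run_filtered` are hypothetical and not part of cumulus; `execute` stands in for any callable that, like `conn.execute` in the snippet above, returns the captured output as a list of lines.

```python
import uuid


def wrap_command(command, token):
    """Surround the command's output with unique marker lines."""
    return 'echo {0}-BEGIN; {1}; echo {0}-END'.format(token, command)


def extract_output(lines, token):
    """Return only the lines found between the BEGIN and END markers."""
    begin, end = token + '-BEGIN', token + '-END'
    # Drop trailing newlines so marker lines can be matched exactly.
    stripped = [line.rstrip('\r\n') for line in lines]
    try:
        start = stripped.index(begin)
        stop = stripped.index(end, start + 1)
    except ValueError:
        raise ValueError('Markers not found in command output')
    return stripped[start + 1:stop]


def run_filtered(execute, command):
    """Run command through execute (hypothetical callable returning a
    list of output lines) and discard anything outside the markers."""
    # A random token makes a collision with login messages very unlikely.
    token = 'HPCCLOUD-' + uuid.uuid4().hex
    return extract_output(execute(wrap_command(command, token)), token)
```

With something like this, the `pwd` check could inspect the result of `run_filtered(conn.execute, 'pwd')` and expect exactly one line, regardless of what the MOTD or Lmod prints around it. This assumes the login messages arrive on the same stream as the command output, as the list in the exception above suggests.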