Closed andf1 closed 8 years ago
Hello! I look forward to hearing how your deployment turns out.
My expectation is that administrators will have to tweak the various commands for the particularities of their sites, although I would be open to adding this flag by default for the GridengineSpawner class. To adjust the spawn commands without maintaining a local code patch, you can make these adjustments in your configuration file like so:
c.GridengineSpawner.batch_submit_cmd = 'sudo -i -E -u {username} qsub'  # and so on for the others
However, this opens the possibility that something in a user's local config will break the notebook startup (e.g. we have users who unconditionally LD_PRELOAD strange things in their environment). So I would suggest instead adding these variables directly to the spawner configuration like so:
c.Spawner.environment = dict(SGE_ROOT='/gridware/uge', SGE_CELL='default')  # plus any other variables your site needs
Hi!
This was really helpful!
On your advice, I made the following additions to jupyterhub_config.py:
c.GridengineSpawner.batch_submit_cmd = 'sudo -i -E -u {username} qsub -q {queue}'
c.GridengineSpawner.batch_query_cmd = 'sudo -i -E -u {username} qstat -xml'
c.GridengineSpawner.batch_cancel_cmd = 'sudo -i -E -u {username} qdel {job_id}'
which resolved both of my issues (#9, #10). So doing, I was able to restore batchspawner to its direct-from-the-developer state. Brilliant!
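For readers unfamiliar with the placeholders in these command strings, here is a minimal sketch of how such templates could expand into concrete command lines. It uses plain Python str.format as a stand-in; batchspawner's own substitution logic may differ, and the expand helper and the sample values (alice, short.q, 12345) are hypothetical.

```python
# Hypothetical illustration: expand command templates with str.format.
# The sample values are made up; batchspawner's own substitution may differ.
templates = {
    "submit": "sudo -i -E -u {username} qsub -q {queue}",
    "query": "sudo -i -E -u {username} qstat -xml",
    "cancel": "sudo -i -E -u {username} qdel {job_id}",
}


def expand(template, **values):
    """Fill the {placeholders} in a command template; unused values are ignored."""
    return template.format(**values)


values = {"username": "alice", "queue": "short.q", "job_id": "12345"}
commands = {name: expand(tmpl, **values) for name, tmpl in templates.items()}

for name, cmd in commands.items():
    print(f"{name}: {cmd}")
```

Each template only consumes the placeholders it names, so the same set of values can be passed to all three.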
I will test feeding the SGE variables directly via c.Spawner.environment in a bit, but for now this both:
I’ll report back again down the road with other interesting findings as we near the completion of our proof-of-concept, should you be interested.
Thanks again — really appreciate your taking the time to point me in the right direction!
Thanks for the kind words! I would definitely be interested in hearing what you come up with at your site. As it happens, I'm going to be talking about my work on JupyterHub at SciPy 2016 in Austin next week. Do you mind if I (briefly) mention you in my talk? I plan to include a list of the institutions that I know to be trying it out.
In the meantime, I'll mark both these issues as resolved.
Michael
I plan to include a list of the institutions that I know to be trying it out.
Hi Michael, sorry for chiming in, but maybe it would interest you to know that batchspawner works nicely at http://www.gradient-geo.com/en on a GPU cluster (42nd place in the Russian top50). The batch manager we use is SGE.
Sure thing, I've sent you an email with details.
Using just:
c.Spawner.environment = dict(SGE_ROOT='/usr/share/gridengine', SGE_CELL='default')
does not appear to work for me, because this environment is not used for the query command (and possibly not for the cancel command). So it looks like the -i workaround is still necessary at this point.
Actually, I ended up abandoning using -i because that caused issues with sudo and ttys in my setup. Instead I put SGE_ROOT/CELL into /etc/environment.
Thanks for the feedback. Thinking about the likely use cases here, Batchspawner probably should use the spawner environment when running the job management commands. Running them in a different environment context from the job itself seems more likely to cause incorrect behaviour.
I'll note that as something that needs doing.
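As a rough sketch of the idea (not batchspawner's actual implementation), a job management command could be run with the spawner environment layered over the hub process environment. The run_with_spawner_env helper is hypothetical, and a child Python process stands in for qstat or qdel so the example runs anywhere.

```python
import os
import subprocess
import sys


def run_with_spawner_env(cmd, spawner_env):
    """Run cmd with spawner_env layered over the current process environment."""
    env = {**os.environ, **spawner_env}  # spawner values win on conflicts
    result = subprocess.run(
        cmd, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# Values taken from the c.Spawner.environment example earlier in this thread.
spawner_env = {"SGE_ROOT": "/usr/share/gridengine", "SGE_CELL": "default"}

# Stand-in for qstat/qdel: a child process that reports what it sees.
probe = [
    sys.executable,
    "-c",
    "import os; print(os.environ['SGE_ROOT'], os.environ['SGE_CELL'])",
]
output = run_with_spawner_env(probe, spawner_env)
print(output)
```

Note that when the command is wrapped in sudo, the sudoers policy may still reset or filter the environment, so passing env= alone is not guaranteed to reach the final command.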
Michael Milligan, Ph.D. | Supercomputing Institute Scientific Computing Consultant | University of Minnesota Head of Application Development | milligan@umn.edu www.msi.umn.edu/staff/milligan | Phone: 612-624-8857
Hello sir --
Hope I'm doing this correctly. Thanks for putting this together!
I'm building a proof of concept using your batchspawner to allow JupyterHub to hand off kernels and user sessions to compute nodes via our Univa Grid Engine 8.2.1 grid. I've got it working well, but I had to make the following edits to do so:
In your batchspawner/batchspawner.py file:
This is because all user sessions spun off via sudo require certain essential Grid Engine environment variables. For us, these are:
(Univa now maintains Sun Grid Engine, so Univa Grid Engine still uses SGE_* everywhere for env variables and the like)
Passing the -i flag to sudo allows the standard user login scripts (either /etc/profile or /etc/csh.cshrc, depending on shell) to fire on job creation, which in turn each source the SGE config script necessary to define SGE variables (in /gridware/uge/default/common/settings.[c]sh).
I'm sure this is not required for all deployments, but it was necessary for ours. I just wanted to pass this on to you in the event that you have other users with similar configs.
Happy to answer questions or assist in testing. Thanks again for this awesome module!
EDIT: diff output and markdown syntax do not get along.