hep-gc / cloud-scheduler

Automatically boot VMs for your HTC jobs
http://cloudscheduler.org
Apache License 2.0

Support Mime Multi Part Archives In User Data #349

Closed berghaus closed 9 years ago

berghaus commented 9 years ago

Please allow multiple files to be specified for virtual machine contextualization, for example CernVM init scripts alongside cloud-init.

Here is what I did for a proof of concept. Created the following two files: the first called cernvm-data.txt with the following content:

[ucernvm-begin]
CVMFS_PAC_URLS=http://shoal.heprc.uvic.ca/wpad.dat
[ucernvm-end]

The other was the IAAS.yaml from my cloud-init repository. Then I used the following helper-script.py to generate the MIME-formatted data:

#!/usr/bin/python
# Combine several user-data files into one MIME multi-part message.

import sys

from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Each argument has the form input-file:type, e.g. IAAS.yaml:cloud-config.
if len(sys.argv) == 1:
    print("%s input-file:type ..." % (sys.argv[0]))
    sys.exit(1)

combined_message = MIMEMultipart()
for i in sys.argv[1:]:
    # Split on the first colon only: filename, then MIME subtype.
    (filename, format_type) = i.split(":", 1)
    with open(filename) as fh:
        contents = fh.read()
    sub_message = MIMEText(contents, format_type, sys.getdefaultencoding())
    sub_message.add_header('Content-Disposition', 'attachment; filename="%s"' % (filename))
    combined_message.attach(sub_message)

# Write the combined message to stdout.
print(combined_message)

and used it like this:

python helper-script.py docs/userdata/IAAS.yaml:cloud-config cernvm-data.txt:ucernvm-config > test-user-data

To generate a multi-part message looking something like this:

From nobody Tue Nov 25 16:34:50 2014
Content-Type: multipart/mixed; boundary="===============9118196963083328116=="
MIME-Version: 1.0

--===============9118196963083328116==
MIME-Version: 1.0
Content-Type: text/cloud-config; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="docs/userdata/IAAS.yaml"

write_files:
-   content: |
        #
        # A sample configuration file for shoal client.
        #
        [general]
        cvmfs_config = /etc/cvmfs/default.local
        shoal_server_url = http://shoal.heprc.uvic.ca/nearest
        default_squid_proxy = http://chrysaor.westgrid.ca:3128;http://cernvm-webfs.atlas-canada.ca:3128;DIRECT
    owner: root:root
    path: /etc/shoal/shoal_client.conf
    permissions: '0644'

--===============9118196963083328116==
MIME-Version: 1.0
Content-Type: text/ucernvm-config; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="cernvm-data.txt"

[ucernvm-begin]
CVMFS_PAC_URLS=http://shoal.heprc.uvic.ca/wpad.dat
[ucernvm-end]

--===============9118196963083328116==--

Please allow something similar by parsing the +VMAMIConfig for example like this:

+VMAMIConfig    = "/srv/userdata/IAAS.yaml:cloud-config, /srv/userdata/cernvm-data.txt:ucernvm-config"
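
A minimal sketch of how such a value could be parsed, assuming comma-separated entries that each have the form path:type. The function name parse_vm_ami_config is hypothetical and is not part of cloud scheduler. Splitting on the last colon keeps any colon inside the path itself (for example in a URL) intact.

```python
def parse_vm_ami_config(value):
    """Split 'path:type, path:type' into a list of (path, type) tuples."""
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas and blank entries
        if ":" not in item:
            raise ValueError("expected 'path:type', got %r" % item)
        # rsplit keeps any colon inside the path itself intact
        path, content_type = item.rsplit(":", 1)
        entries.append((path.strip(), content_type.strip()))
    return entries


print(parse_vm_ami_config(
    "/srv/userdata/IAAS.yaml:cloud-config, "
    "/srv/userdata/cernvm-data.txt:ucernvm-config"))
```

Each resulting tuple could then be read from disk and attached as one part of the multi-part message, in the same way the helper script above does.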

mhpx commented 9 years ago

OK, you can try this out. I'll probably update this to support HTTP as well as local files, since the current amiconfig has that feature. Right now the extra files have to be local.

berghaus commented 9 years ago

So I was ambitious and went straight to trying this JDL snippet:

+VMAMIConfig     = "/local/berghaus/test.yaml:cloud-config,/srv/userdata/ucernvm-data.txt:ucernvm-config"

The cloud scheduler crashed with this message:

Exception in thread Scheduler:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.6/site-packages/cloud_scheduler-1.8-py2.6.egg/EGG-INFO/scripts/cloud_scheduler", line 342, in run
    self.scheduling_method()
  File "/usr/lib/python2.6/site-packages/cloud_scheduler-1.8-py2.6.egg/EGG-INFO/scripts/cloud_scheduler", line 561, in scheduler_fair_share
    if self.sched_resource_create_track(user, job):
  File "/usr/lib/python2.6/site-packages/cloud_scheduler-1.8-py2.6.egg/EGG-INFO/scripts/cloud_scheduler", line 642, in sched_resource_create_track
    create_ret = self.vm_creation(job, good_resources)
  File "/usr/lib/python2.6/site-packages/cloud_scheduler-1.8-py2.6.egg/EGG-INFO/scripts/cloud_scheduler", line 783, in vm_creation
    create_ret = resource.vm_create(**args)
  File "/usr/lib/python2.6/site-packages/cloud_scheduler-1.8-py2.6.egg/cloudscheduler/ec2cluster.py", line 243, in vm_create
    user_data = cloud_init_util.build_multi_mime_message([(user_data, 'cloud-config')], extra_userdata)
  File "/usr/lib/python2.6/site-packages/cloud_scheduler-1.8-py2.6.egg/cloudscheduler/cloud_init_util.py", line 87, in build_multi_mime_message
    sub_message.add_header('Content-Disposition', 'attachment; filename="%s"' % (i[2]))
IndexError: tuple index out of range

I calmed down a bit and did this instead:

+VMAMIConfig = "/local/berghaus/test.yaml:cloud-config"

Also tested that the old JDL still works as expected.
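
The traceback above shows build_multi_mime_message being handed a two-element tuple, (user_data, 'cloud-config'), while line 87 reads i[2] for the filename. Here is a minimal sketch of a more tolerant builder, assuming each part is either (content, type) or (content, type, filename). The name build_parts and the generated fallback filename are hypothetical; this is not the actual cloud_scheduler code.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_parts(parts):
    """Build a multipart message from (content, type[, filename]) tuples."""
    combined = MIMEMultipart()
    for index, part in enumerate(parts):
        content, subtype = part[0], part[1]
        # Fall back to a generated name when no filename is supplied,
        # instead of indexing part[2] unconditionally.
        filename = part[2].strip() if len(part) > 2 else "part-%d" % index
        sub = MIMEText(content, subtype, "us-ascii")
        sub.add_header("Content-Disposition",
                       'attachment; filename="%s"' % filename)
        combined.attach(sub)
    return combined.as_string()


message = build_parts([
    ("write_files: []\n", "cloud-config"),
    ("[ucernvm-begin]\n[ucernvm-end]\n", "ucernvm-config", "cernvm-data.txt"),
])
print(message)
```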

berghaus commented 9 years ago

Cloud scheduler handles a single user-data file with a MIME type without crashing, but here is what ends up in the user-data:

# curl 169.254.169.254/latest/user-data | zcat
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0    21    0    21    0     0   2009      0 --:--:-- --:--:-- --:--:--  2333
#

The content should be the MIME header plus the YAML for cloud-init, not just "#". (The gzipped user-data really does contain only the character "#"; that is not my next shell prompt.)
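
To check what actually reached the VM, the user-data can be decompressed and parsed as a MIME message. Below is a minimal sketch using only the Python 3 standard library. The function name list_user_data_parts is hypothetical, and the sample payload is built inline rather than fetched from the metadata service.

```python
import email
import gzip


def list_user_data_parts(raw_bytes):
    """Return (content_type, filename) for each part of gzipped user-data."""
    text = gzip.decompress(raw_bytes).decode("utf-8")
    message = email.message_from_string(text)
    if not message.is_multipart():
        # A single, non-MIME document such as a bare "#".
        return [(message.get_content_type(), None)]
    return [(part.get_content_type(), part.get_filename())
            for part in message.get_payload()]


sample = (
    'Content-Type: multipart/mixed; boundary="XX"\n'
    'MIME-Version: 1.0\n\n'
    '--XX\n'
    'Content-Type: text/cloud-config; charset="us-ascii"\n'
    'Content-Disposition: attachment; filename="IAAS.yaml"\n\n'
    'write_files: []\n'
    '--XX--\n'
)
print(list_user_data_parts(gzip.compress(sample.encode("utf-8"))))
print(list_user_data_parts(gzip.compress(b"#\n")))
```

A payload like the one seen here, containing only "#", comes back as a single part with no filename, which makes the missing MIME header easy to spot.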

berghaus commented 9 years ago

It looks like there is still a problem. Here is the crash log:

Exception in thread Scheduler:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.6/site-packages/cloud_scheduler-1.8-py2.6.egg/EGG-INFO/scripts/cloud_scheduler", line 342, in run
    self.scheduling_method()
  File "/usr/lib/python2.6/site-packages/cloud_scheduler-1.8-py2.6.egg/EGG-INFO/scripts/cloud_scheduler", line 561, in scheduler_fair_share
    if self.sched_resource_create_track(user, job):
  File "/usr/lib/python2.6/site-packages/cloud_scheduler-1.8-py2.6.egg/EGG-INFO/scripts/cloud_scheduler", line 642, in sched_resource_create_track
    create_ret = self.vm_creation(job, good_resources)
  File "/usr/lib/python2.6/site-packages/cloud_scheduler-1.8-py2.6.egg/EGG-INFO/scripts/cloud_scheduler", line 785, in vm_creation
    create_ret = resource.vm_create(**args)
  File "/usr/lib/python2.6/site-packages/cloud_scheduler-1.8-py2.6.egg/cloudscheduler/ec2cluster.py", line 243, in vm_create
    user_data = cloud_init_util.build_multi_mime_message([(user_data, 'cloud-config')], extra_userdata)
  File "/usr/lib/python2.6/site-packages/cloud_scheduler-1.8-py2.6.egg/cloudscheduler/cloud_init_util.py", line 89, in build_multi_mime_message
    sub_message.add_header('Content-Disposition', 'attachment; filename="%s"' % (i[2].strip()))
IndexError: tuple index out of range

and the tail end of the cloud scheduler log is here:

2014-12-19 15:03:07,255 - INFO - Cleanup - Final(for real) num to change: {'apf:cern-worker': 62, 'apf:cern-mcore-worker': 16, 'apf:datacentred': -4}
2014-12-19 15:03:27,149 - ERROR - MainThread - Scheduler thread died!
2014-12-19 15:03:29,153 - ERROR - MainThread - Whoops. Wasn't expecting to exit. Did a thread crash?
2014-12-19 15:03:29,155 - INFO - MainThread - Cloud Scheduler quitting normally. (It might take a while, don't panic!)
2014-12-19 15:03:29,305 - INFO - Cleanup - Exiting cleanup thread
2014-12-19 15:03:29,499 - INFO - MachinePoller - Exiting machine polling thread
2014-12-19 15:03:30,140 - INFO - JobPoller - Exiting job polling thread
2014-12-19 15:03:30,915 - INFO - MainThread - Cloud Scheduler stopped. Bye!

rptaylor commented 9 years ago

What's the current status of this? Should we try it again?

mhpx commented 9 years ago

Think it's working now. Just need confirmation then can close issue.

rptaylor commented 9 years ago

I tried this again:

+VMAMIConfig = "/srv/userdata/IAAS.yaml:cloud-config,/srv/userdata/cernvm-data.txt:ucernvm-config"

However it didn't work:

2015-02-06 16:59:10,421 - ERROR - Scheduler - <type 'instance'> can't be encoded
2015-02-06 16:59:10,422 - DEBUG - Scheduler - Failed to create instance on cc-east
2015-02-06 16:59:10,422 - DEBUG - Scheduler - Creating VM for job condor.heprc.uvic.ca#238347.2#1423270202 failed on cc-east.
2015-02-06 16:59:10,422 - DEBUG - Scheduler - None of the resources could boot a vm for job condor.heprc.uvic.ca#238347.2#1423270202. Leaving apf-test's job unscheduled.

mhpx commented 9 years ago

I've updated cloud scheduler on belle and condor with a fix for this. Could you try it again?

rptaylor commented 9 years ago

I booted a VM which has the desired content in /var/lib/cloud/instance/user-data.txt :

--===============0119130212==
MIME-Version: 1.0
Content-Type: text/ucernvm-config; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="/srv/userdata/cernvm-data.txt"

[ucernvm-begin]
CVMFS_PAC_URLS=http://shoal.heprc.uvic.ca/wpad.dat
[ucernvm-end]

However the cloud-init logs had:

Feb 24 00:12:49 host-192-168-19-220 [CLOUDINIT] __init__.py[WARNING]: Unhandled unknown content-type (text/ucernvm-config) userdata: '[ucernvm-begin]\n CVMFS_PA...'
Feb 24 00:12:49 host-192-168-19-220 [CLOUDINIT] __init__.py[WARNING]: Unhandled unknown content-type (text/ucernvm-config) userdata: '[ucernvm-begin]\n CVMFS_PA...'

I need to be able to verify the status of the CernVM OS CVMFS repo.

mhpx commented 9 years ago

So is the problem that cloud-init itself does not recognize the MIME type, or is it an encoding error in cloud_scheduler?

rptaylor commented 9 years ago

This should help: https://github.com/hep-gc/cloudinit-userdata/commit/a0a744224018962618ccf35d40bb467701f2eadd

It may be working now ...

rptaylor commented 9 years ago

Success! uCernVM is being contextualized correctly.

rptaylor commented 9 years ago

So this needs to be closed .... I can't close it.

igable commented 9 years ago

@rptaylor you had some missing permissions. Fixed now.