canonical / cloud-init

Official upstream for the cloud-init: cloud instance initialization
https://cloud-init.io/

jinja rendering broken in latest git checkout #3836

Closed ubuntu-server-builder closed 1 year ago

ubuntu-server-builder commented 1 year ago

This bug was originally filed in Launchpad as LP: #1914641

Launchpad details
affected_projects = []
assignee = None
assignee_name = None
date_closed = 2021-05-24T04:17:24.045073+00:00
date_created = 2021-02-04T18:15:47.506838+00:00
date_fix_committed = None
date_fix_released = None
id = 1914641
importance = undecided
is_complete = True
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1914641
milestone = None
owner = andrewbogott
owner_name = Andrew Bogott
private = False
status = expired
submitter = andrewbogott
submitter_name = Andrew Bogott
tags = []
duplicates = []

Launchpad user Andrew Bogott(andrewbogott) wrote on 2021-02-04T18:15:47.506838+00:00

I use jinja templating for vendor data; it works with my .deb packaged version of cloud-init, 20.2-2~deb10u1

Testing with the latest git checkout, I see a json parser choking on curly braces. That suggests that it's skipping the jinja rendering step, or trying to run it after json parsing, which won't work.

Here is the top part of my vendor data:

root@cloudinit-test:~# curl http://169.254.169.254/openstack/latest/vendor_data.json
{"domain": "codfw1dev.wikimedia.cloud", "cloud-init": "MIME-Version: 1.0\nContent-Type: multipart/mixed; boundary=\"XXXXboundary text\"\n\nThis is a multipart config in MIME format.\nIt contains a cloud-init config followed by\na first boot script.\n\n--XXXXboundary text\nMIME-Version: 1.0\nContent-Type: text/cloud-config; charset=\"us-ascii\"\n\n## template: jinja\n#cloud-config\n\nhostname: {{ds.meta_data.name}}\nfqdn: {{ds.meta_data.name}}.{{ds.meta_data.project_id}}.codfw1dev.wikimedia.cloud\n\n\n# /etc/block-ldap-key-lookup:\n#   Prevent non-root logins while the VM is being setup\n#   The ssh-key-ldap-lookup script rejects non-root user logins if this file\n#   is present.\n#\n# /etc/rsyslog.d/60-puppet.conf:\n#   Enable console logging for puppet\n#\n# /etc/systemd/system/serial-getty@ttyS0.service.d/override.conf:\n#   Enable root console on serial0\n#   (cloud-init will create any needed parent dirs)\nwrite_files:\n    - content: \"VM is work in progress\"\n      path: /etc/block-ldap-key-lookup\n    - content: \"daemon.* |/dev/console\"\n      path: /etc/rsyslog.d/60-puppet.conf\n    - content: |\n        [Service]\n        ExecStart=\n        ExecStart=-/sbin/agetty --autologin root --noclear %I $TERM\n      path: /etc/systemd/system/serial-getty@ttyS0.service.d/override.conf\n\n# resetting ttys0 so root is logged in\nruncmd:\n    - [systemctl, enable, serial-getty@ttyS0.service]\n    - [systemctl, restart, serial-getty@ttyS0.service]\n\n\nmanage_etc_hosts: true\n\npackages:\n    - gpg\n    - curl\n    - nscd\n    - lvm2\n    - parted\n    - puppet\n\ngrowpart:\n    mode: false\n\n# You'll see that we're setting apt_preserve_sources_list twice here.  That's\n#  because there's a bug in cloud-init where it tries to reconcile the\n#  two settings and if they're different the stage fails. 
That means that\n#  if one of them is set differently from the default (True) then nothing\n#  works.\napt_preserve_sources_list: False\napt:\n    preserve_sources_list: False\n 

And here are the errors:

2021-02-04 18:08:43,117 - util.py[WARNING]: Failed loading yaml blob. Invalid format at line 4 column 1: "while parsing a block mapping
  in "<unicode string>", line 4, column 1:
    hostname: {{ds.meta_data.name}}
    ^
expected <block end>, but found '<scalar>'
  in "<unicode string>", line 5, column 28:
    fqdn: {{ds.meta_data.name}}.{{ds.meta_data.project_id}}.cod ... 
                               ^"
2021-02-04 18:08:43,131 - util.py[WARNING]: Failed loading yaml blob. Invalid format at line 4 column 1: "while parsing a block mapping
  in "<unicode string>", line 4, column 1:
    hostname: {{ds.meta_data.name}}
    ^
expected <block end>, but found '<scalar>'
  in "<unicode string>", line 5, column 28:
    fqdn: {{ds.meta_data.name}}.{{ds.meta_data.project_id}}.cod ... 
                               ^"
2021-02-04 18:08:43,131 - util.py[WARNING]: Failed at merging in cloud config part from part-001

ubuntu-server-builder commented 1 year ago

Launchpad user Paride Legovini(paride) wrote on 2021-02-05T13:47:41.883894+00:00

Hello Andrew, and thanks for this bug report. It is not easy for us to debug a moving target like the latest git commit. Could you please git checkout 20.4.1 and try again? That's the git tag of the latest released version of cloud-init, which was extensively tested before being released. Hopefully it's new enough to ship the feature/bugfix you're missing from 20.2.

If the problem still happens with 20.4.1, could you please try to git checkout 20.2 and try to reproduce the problem there? If you still hit the failure then I'd suspect there's a local configuration/setup issue, as you reported the Debian packaged version of cloud-init 20.2 works fine.

If you can confirm this really looks like a bug in 20.4.1, please run

cloud-init collect-logs

right after hitting the issue, and attach the resulting tarball to this bug report. We'll try to understand what's happening there.

While waiting for your reply, I'm setting the status of this bug report to Incomplete; please set it back to New after commenting back and we'll look at it again. Thanks!

ubuntu-server-builder commented 1 year ago

Launchpad user Andrew Bogott(andrewbogott) wrote on 2021-02-08T03:19:45.311881+00:00

I am able to reproduce the issue on 20.4.1. I get good runs in 20.2.

A bisect shows:

root@cloudinit:/home/labtestandrew/cloud-init# git bisect good
ef041fd822a2cf3a4022525e942ce988b1f95180 is the first bad commit
commit ef041fd822a2cf3a4022525e942ce988b1f95180
Author: Ryan Harper <ryan.harper@canonical.com>
Date:   Fri Aug 14 12:51:54 2020 -0500

user-data: only verify mime-types for TYPE_NEEDED and x-shellscript (#511)

Commit d00126c167fc06d913d99cfc184bf3402cb8cf53 regressed cloud-init
handling in multipart MIME user-data.  Specifically, cloud-init would
examine the payload of the MIME part to determine what the content
type and subsequently which handler to use.  This meant that user-data
which had shellscript payloads (starts with #!) were always handled
as shellscripts, rather than their declared MIME type and affected
when the payload was handled.

One failing scenario was a MIME part with text/cloud-boothook type
declared and a shellscript payload.  This was run at shellscript
processing time rather than boothook time resulting in a change in
behavior from previous cloud-init releases.

To continue to support known scenarios where clouds have specified
a MIME type of text/x-shellscript but provided a payload of something
other than shellscripts, we're changing the lookup logic to check for
the TYPES_NEEDED (text/plain, text/x-not-multipart) and only
text/x-shellscript.

It is safe to check text/x-shellscript parts as all shellscripts must
include the #! marker and will be detected as text/x-shellscript types.
If the content is missing the #! marker, it will not be executed.  If
the content is detected as something cloud-init supports, such as
 #cloud-config the appropriate cloud-init handler will be used.

This change will fix handling for parts which were shellscripts but
ran with the wrong handler due to ignoring of the provided mime-type.

LP: #1888822
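
The lookup rule this commit message describes can be sketched roughly as below. This is a simplified illustration and not the cloud-init source: the names SNIFFED_TYPES, PREFIX_TYPES, sniff_type and effective_type are made up for this example, and the prefix table lists only a few of the headers cloud-init recognizes.

```python
# Simplified illustration of the rule described in the commit message above.
# All names here are hypothetical; this is not the actual cloud-init code.

# Declared MIME types whose payload is still inspected after the change.
SNIFFED_TYPES = {"text/plain", "text/x-not-multipart", "text/x-shellscript"}

# A few payload prefixes and the content type each one implies (partial list).
PREFIX_TYPES = {
    "#!": "text/x-shellscript",
    "#cloud-config": "text/cloud-config",
    "#cloud-boothook": "text/cloud-boothook",
    "## template: jinja": "text/jinja2",
}


def sniff_type(payload):
    """Return the type implied by the payload's leading text, or None."""
    text = payload.lstrip()
    for prefix, ctype in PREFIX_TYPES.items():
        if text.startswith(prefix):
            return ctype
    return None


def effective_type(declared, payload):
    """Sniff the payload only for SNIFFED_TYPES; otherwise trust the header."""
    if declared in SNIFFED_TYPES:
        return sniff_type(payload) or declared
    return declared


# A boothook part with a shellscript payload keeps its declared type.
print(effective_type("text/cloud-boothook", "#!/bin/sh\necho hi\n"))
# A part mislabelled as a shellscript is re-detected from its payload.
print(effective_type("text/x-shellscript", "#cloud-config\npackages: [curl]\n"))
# A text/cloud-config part is never sniffed, so a jinja header goes unnoticed.
print(effective_type("text/cloud-config", "## template: jinja\n#cloud-config\n"))
```

Under these assumptions the third call returns the declared text/cloud-config type, which matches the failure reported in this bug: the part is handed to the cloud-config handler with its template markers still in place.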

ubuntu-server-builder commented 1 year ago

Launchpad user Andrew Bogott(andrewbogott) wrote on 2021-02-08T03:20:58.816409+00:00

collect-logs fails in my current setup; I can investigate that further when I get a chance.

Note that https://bugs.launchpad.net/cloud-init/+bug/1795933 seems to be a duplicate of this issue, suggesting that the problem is present in released versions :(

ubuntu-server-builder commented 1 year ago

Launchpad user Dan Watkins(oddbloke) wrote on 2021-02-09T15:04:35.475315+00:00

collect-logs failing suggests to me that you may have a misconfigured system, rather than this being a cloud-init bug per se. Can you manually paste cloud-init.log from an affected system (and/or the collect-logs issue you're seeing)?

(https://bugs.launchpad.net/cloud-init/+bug/1795933 was filed in 2018, well before any of the referenced commits, and is in reference to an older version of Jinja, 2.2.1. The oldest version in Debian, in oldoldstable, is 2.7.3, so I suspect this is a separate issue.)

ubuntu-server-builder commented 1 year ago

Launchpad user Andrew Bogott(andrewbogott) wrote on 2021-02-12T01:10:46.571817+00:00

Thanks for your comments, all! A colleague and I worked on this some yesterday and I think I understand what's happening now.

The change in behavior relates to the interaction between mime-types and the #headings for a given part. In my particular use, the section looks like this:


--XXXXboundary text
MIME-Version: 1.0
Content-Type: text/cloud-config; charset="us-ascii"

## template: jinja
#cloud-config

Prior to ef041fd822a2cf3a4022525e942ce988b1f95180, that section was scanned by find_ctype() and identified as a jinja2 template. After that patch, the call to find_ctype() is bypassed and, hence, the jinja2 template rendering is skipped.
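
One way to spot this combination in a given blob is to list each part's declared Content-Type next to the first line of its payload. The following is a minimal sketch using only Python's standard email package; the RAW sample is a trimmed copy of the section above with a closing boundary added, and describe_parts is a helper invented for this illustration.

```python
# Minimal sketch using only the standard library. RAW mirrors the part quoted
# above (trimmed, with a closing boundary added); describe_parts is a
# hypothetical helper, not part of cloud-init.
from email import message_from_string

RAW = (
    'MIME-Version: 1.0\n'
    'Content-Type: multipart/mixed; boundary="XXXXboundary text"\n'
    '\n'
    'This is a multipart config in MIME format.\n'
    '\n'
    '--XXXXboundary text\n'
    'MIME-Version: 1.0\n'
    'Content-Type: text/cloud-config; charset="us-ascii"\n'
    '\n'
    '## template: jinja\n'
    '#cloud-config\n'
    '\n'
    'hostname: {{ds.meta_data.name}}\n'
    '--XXXXboundary text--\n'
)


def describe_parts(raw):
    """Return (declared content type, first payload line) for each leaf part."""
    found = []
    for part in message_from_string(raw).walk():
        if part.is_multipart():
            continue  # skip the multipart container itself
        lines = part.get_payload().lstrip().splitlines()
        found.append((part.get_content_type(), lines[0] if lines else ""))
    return found


for declared, first_line in describe_parts(RAW):
    print(f"{declared!r} starts with {first_line!r}")
```

A part that declares text/cloud-config but starts with ## template: jinja is the case described here: the header line is only honoured if the payload is actually scanned.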

If I change this block to Content-Type: text/plain or to text/jinja2, the jinja is rendered as expected.
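
For anyone generating the vendor data programmatically, the workaround amounts to changing the subtype of the part. Below is a minimal sketch with Python's standard email.mime classes; TEMPLATE and build_vendor_mime are names invented for this example, and the boundary string is whatever the library generates rather than the one shown above.

```python
# Minimal sketch of the workaround using only the standard library.
# TEMPLATE and build_vendor_mime are hypothetical names for illustration.
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

TEMPLATE = (
    "## template: jinja\n"
    "#cloud-config\n"
    "\n"
    "hostname: {{ds.meta_data.name}}\n"
)


def build_vendor_mime(template, subtype="jinja2"):
    """Wrap the template in a multipart message declared as text/<subtype>."""
    outer = MIMEMultipart("mixed")
    # us-ascii keeps the payload 7bit, so the template text stays readable.
    outer.attach(MIMEText(template, _subtype=subtype, _charset="us-ascii"))
    return outer.as_string()


print(build_vendor_mime(TEMPLATE))             # part declared as text/jinja2
print(build_vendor_mime(TEMPLATE, "plain"))    # part declared as text/plain
```

The resulting string would then go into the "cloud-init" key of vendor_data.json in place of the hand-written MIME document.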

It would almost certainly be harmless to add text/cloud-config as one of the types that gets passed to find_ctype() but it might not be worth the change since I doubt you have a lot of users out there using the same weird combination of types and tags that I was using. If you want to close this as 'invalid' I won't object.

Thanks again!

ubuntu-server-builder commented 1 year ago

Launchpad user Andrew Bogott(andrewbogott) wrote on 2021-03-24T19:59:39.074295+00:00

Update: I've now learned that text/plain and text/jinja2 don't work in the version of cloud-init shipped with Debian Stretch (7.9.2). So in order to have a config work in both old and new versions we need to support text/cloud-config. I will submit a patch for that shortly.

ubuntu-server-builder commented 1 year ago

Launchpad user Launchpad Janitor(janitor) wrote on 2021-05-24T04:17:23.862720+00:00

[Expired for cloud-init because there has been no activity for 60 days.]