canonical / cloud-init

Official upstream for the cloud-init: cloud instance initialization
https://cloud-init.io/

Cloudinit issue with legacy CentOS 8 stream #5673

Open JalenMak6 opened 2 months ago

JalenMak6 commented 2 months ago

Bug report

version: cloud-init-23.4-7.el8.3.noarch
kernel version: 4.18.0-553.6.1.el8.x86_64

Basically, I am using Packer to build an image with cloud-init baked into the OS. When the VM is provisioned in Foreman via the image-based method, cloud-init should fetch the userdata template and the cloud-init template from Foreman and apply them to the VM. This works with CentOS 9 and Rocky 9 so far.

When I configure cloud-init on CentOS 8, it keeps trying to create a default user 'admin', and the parameters assigned to that user have the wrong types.

When I run cloud-init schema --system

error: Invalid user-data /var/lib/cloud/instances/i-bdf3f54642b2d80f8e/cloud-config.txt Error: Cloud config schema errors: users.1.passwd: None is not of type ‘string’, users.1.sudo: [‘ALL=(ALL) ALL’] is not of type ‘boolean’, users.1.sudo: [‘ALL=(ALL) ALL’] is not of type ‘string’, ‘null’
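The schema errors above amount to type checks on the second entry of the users list (index 1). Here is a minimal Python sketch of those checks, assuming only the constraints named in the error message: passwd must be a string, and sudo must be a boolean, a string, or null. The function name `check_user_entry` is hypothetical, and this is not cloud-init's own validator. The sample values are taken from the error message and the config quoted in this report.

```python
def check_user_entry(index, entry):
    """Return messages for fields whose types the reported error rejects.

    Hypothetical helper: it mirrors only the constraints quoted in the
    schema error, not the full cloud-init schema.
    """
    problems = []
    if not isinstance(entry, dict):
        # A bare string such as "default" has no fields to check.
        return problems
    if "passwd" in entry and not isinstance(entry["passwd"], str):
        problems.append(
            f"users.{index}.passwd: {entry['passwd']!r} is not of type 'string'"
        )
    if "sudo" in entry:
        sudo = entry["sudo"]
        if not (sudo is None or isinstance(sudo, (bool, str))):
            problems.append(
                f"users.{index}.sudo: {sudo!r} is not a boolean, string, or null"
            )
    return problems


# An empty "passwd:" key in YAML parses as None; sudo is a one-item list.
users = [
    "default",
    {
        "name": "admin",
        "primary-group": "admin",
        "groups": "users",
        "shell": "/bin/bash",
        "sudo": ["ALL=(ALL) ALL"],
        "lock-passwd": False,
        "passwd": None,
    },
]

problems = [msg for i, user in enumerate(users) for msg in check_user_entry(i, user)]
for msg in problems:
    print(msg)
```

Under these assumptions, both the empty passwd and the list-valued sudo are flagged for the admin entry, while the bare `default` entry passes.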

Here is my config. My cloud-config.txt contains this part:

users:
- default
- groups: users
  lock-passwd: false
  name: admin
  passwd: null
  primary-group: admin
  shell: /bin/bash
  sudo: ALL=(ALL) ALL
…

I don't know where it comes from. I checked the cloud-init template for this VM in Foreman, and it does not contain this section.

In the cloud.cfg written by my ks.cfg, I did not create any users. Here is a sample of my ks.cfg:

cat << EOM > /etc/cloud/cloud.cfg.d/01_network.cfg
network:
  config: disabled
EOM

cat << EOM > /etc/cloud/cloud.cfg.d/10_datasource.cfg
datasource_list: [NoCloud]
datasource:
  NoCloud:
    seedfrom: http://foreman1.abc.ca/userdata/
EOM

cat << EOM > /etc/cloud/cloud.cfg
#cloud-config

cloud_init_modules:

cloud_config_modules:

cloud_final_modules:

system_info:
  distro: centos
  paths:
    cloud_dir: /var/lib/cloud
    templates_dir: /etc/cloud/templates
  ssh_svcname: sshd

EOM

My config does not create any default user, but when the VM boots, cloud-init-output.log shows the error mentioned above.

I have no idea why it keeps creating this odd default user and preventing my VM from retrieving the cloud-init template from Foreman. I tried removing the user directives from /var/lib/cloud/instances/i-36d64040f731b093d7/cloud-config.txt.

However, if I run cloud-init init or cloud-init clean --reboot, there is still no config applied to CentOS Stream 8. I can see all the userdata and the cloud-config.txt that should be applied on the host, but nothing is applied.

I also checked cloud-init-output.log: cloud-init was running, but it is not running the config from the userdata/cloud-init template. There is no ERR message in cloud-init.log either. The same config works on CentOS 9 and Rocky 9 (with some tweaks).

catmsred commented 2 months ago

Thank you for reporting this issue!

Are you able to replicate the behaviour outside of the Packer/Foreman/VMware environment? There are a lot of moving pieces to your scenario that make reproducing the problem somewhat challenging.

For context on the user key, here is the documentation on users and groups. Note that if you do not provide a users key in your userdata, the distro-specific default user will be created. What happens if you define the user(s) you want on the VM via userdata?

JalenMak6 commented 2 months ago

I defined the users directive in cloud.cfg before. I also defined the default user under the system_info directive. However, when I run cloud-init init, it still tries to create the 'admin' user, which causes the error above.

catmsred commented 1 month ago

Can you provide your userdata?

Are you running cloud-init init again after the VM is booted or is this the first time?

JalenMak6 commented 1 month ago

Hi catmsred,

Sure, here is the default userdata template built into Foreman, UserData open-vm-tools:

<%#
kind: user_data
name: UserData open-vm-tools testing
model: ProvisioningTemplate
oses:

identity:
  LinuxPrep:
    domain: <%= @host.domain %>
    hostName: <%= @host.shortname %>
    hwClockUTC: true
    timeZone: <%= host_param('time-zone') || 'UTC' %>

globalIPSettings:
  dnsSuffixList: [<%= @host.domain %>]
  <%- @host.interfaces.each do |interface| -%>
  <%- next unless interface.subnet -%>
  dnsServerList: [<%= interface.subnet.dns_servers.join(', ') %>]
  <%- end -%>

nicSettingMap:
  <%- @host.interfaces.each do |interface| -%>
  <%- next unless interface.subnet -%>

The host can get information such as its IP, DNS and hostname from it.

Also, there is a default cloud-init template that is applied to VMs built via Foreman. It works with CentOS 9 and Rocky 9:

<%#
kind: cloud-init
name: CloudInit default
model: ProvisioningTemplate
oses:

#cloud-config
hostname: <%= @host.name %>
fqdn: <%= @host %>
manage_etc_hosts: true
users: {}
runcmd:

Please let me know if any more information is needed. The cloud.cfg is in the first comment.

For CentOS 9 and Rocky 9, I don't need to define the users directive or a default user in my cloud.cfg and it still works. No default user is created.

catmsred commented 1 month ago

So you see the same issue if the cloud-init userdata

#cloud-config
hostname: <%= @host.name %>
fqdn: <%= @host %>
manage_etc_hosts: true
users: {}
runcmd:
  - |
    <%= indent(2) { snippet 'fix_hosts' } -%>
  - |
    <%= indent(2) { snippet 'yum_proxy' } -%>
  - |
    <%= indent(2) { snippet 'ntp' } -%>
  - |
    <% if rhel_compatible && host_param_true?('enable-epel') -%>
    <%= indent(2) { snippet 'epel' } -%>
    <% end -%>
  - |
    <%= indent(2) { snippet 'RHEL Subscription Manager' } -%>
  - |
    <%= indent(2) { snippet 'remote_execution_ssh_keys' } %>
  - |
    <%= indent(2) { snippet 'realm-join testing' } %>
  - |
    <%= indent(2) { snippet 'Remove default local user' } %>
  - |
    <%= indent(2) { snippet 'Puppet Agent Configuration' } %>
phone_home:
  url: <%= foreman_url('built') %>
  post: []
  tries: 10

is used absent Foreman/Packer/etc.? If we can create a minimal reproducer, it will be easier to track down the specific source of the bug. Passing an empty user dict should not create any users.

Do you see any cloud-init logs pertaining to the creation of the admin user? Feel free to share the whole log file if you can.

JalenMak6 commented 1 month ago

Basically, the userdata is served by Foreman, and I have the exact same setup with Rocky 9 and CentOS 9. Here is the initial cloud-init-output.log:

Cloud-init v. 23.4-7.el8.3 running 'modules:config' at Fri, 06 Sep 2024 21:06:05 +0000. Up 18.40 seconds.
Cloud-init v. 23.4-7.el8.3 running 'modules:final' at Fri, 06 Sep 2024 21:06:05 +0000. Up 18.82 seconds.
2024-09-06 21:06:33,040 - util.py[WARNING]: Failed to post phone home data to http://foreman1.abc.ca/unattended/built in 10 tries
Cloud-init v. 23.4-7.el8.3 running 'init-local' at Fri, 06 Sep 2024 21:20:05 +0000. Up 4.50 seconds.
Cloud-init v. 23.4-7.el8.3 running 'init' at Fri, 06 Sep 2024 21:20:11 +0000. Up 10.97 seconds.
ci-info: ++++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++++
ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
ci-info: | Device |  Up  |           Address           |      Mask     | Scope  |     Hw-Address    |
ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
ci-info: | ens192 | True |         10.0.50.145         | 255.255.254.0 | global | 00:50:56:8e:17:ad |
ci-info: | ens192 | True | fe80::250:56ff:fe8e:17ad/64 |       .       |  link  | 00:50:56:8e:17:ad |
ci-info: |   lo   | True |          127.0.0.1          |   255.0.0.0   |  host  |         .         |
ci-info: |   lo   | True |           ::1/128           |       .       |  host  |         .         |
ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
ci-info: ++++++++++++++++++++++++++++Route IPv4 info++++++++++++++++++++++++++++
ci-info: +-------+-------------+-----------+---------------+-----------+-------+
ci-info: | Route | Destination |  Gateway  |    Genmask    | Interface | Flags |
ci-info: +-------+-------------+-----------+---------------+-----------+-------+
ci-info: |   0   |   0.0.0.0   | 10.0.50.1 |    0.0.0.0    |   ens192  |   UG  |
ci-info: |   1   |  10.0.50.0  |  0.0.0.0  | 255.255.254.0 |   ens192  |   U   |
ci-info: +-------+-------------+-----------+---------------+-----------+-------+
ci-info: +++++++++++++++++++Route IPv6 info+++++++++++++++++++
ci-info: +-------+-------------+---------+-----------+-------+
ci-info: | Route | Destination | Gateway | Interface | Flags |
ci-info: +-------+-------------+---------+-----------+-------+
ci-info: |   1   |  fe80::/64  |    ::   |   ens192  |   U   |
ci-info: |   3   |  multicast  |    ::   |   ens192  |   U   |
ci-info: +-------+-------------+---------+-----------+-------+
2024-09-06 21:20:11,973 - schema.py[WARNING]: Invalid cloud-config provided: Please run 'sudo cloud-init schema --system' to see the schema errors.

If I run cloud-init init on the host again after boot, it does the same thing and gives me this error. When I run cloud-init schema --system, the error is shown below:

error: Invalid user-data /var/lib/cloud/instances/i-bdf3f54642b2d80f8e/cloud-config.txt Error: Cloud config schema errors: users.1.passwd: None is not of type ‘string’, users.1.sudo: [‘ALL=(ALL) ALL’] is not of type ‘boolean’, users.1.sudo: [‘ALL=(ALL) ALL’] is not of type ‘string’, ‘null’

here is my existing config,

cloud.cfg:

# users:
# - default

cloud_init_modules:
  - bootcmd
  - ssh

cloud_config_modules:
  - runcmd

cloud_final_modules:
  - scripts-per-once
  - scripts-per-boot
  - scripts-per-instance
  - phone-home

system_info:
  distro: centos
  # Default user name + that default users groups (if added/used)
#  default_user:
#    name: cloudinituser
#   doas:
#      - permit nopass cloudinituser
#    lock_passwd: True
#   gecos: Ubuntu
#    groups: [adm, wheel]
#    sudo: ["ALL=(ALL) NOPASSWD:ALL"]
#    shell: /bin/bash
  paths:
    cloud_dir: /var/lib/cloud
    templates_dir: /etc/cloud/templates
  ssh_svcname: sshd

[root@devc8la1 ~]# cat /etc/cloud/cloud.cfg.d/10_datasource.cfg
datasource_list: [NoCloud]
datasource:
  NoCloud:
    seedfrom: http://foreman1.abc.ca/userdata/

[root@devc8la1 ~]# cat /etc/cloud/cloud.cfg.d/01_network.cfg
network:
  config: disabled

Basically, the commented-out parts are not needed in the Rocky 9 and CentOS 9 configs, but I tried them on CentOS 8 as well. None of them worked either.

The default user creation appears in my /var/lib/cloud/instances/i-xxxxxxx/cloud-config.txt and /var/lib/cloud/instances/i-xxxxx/user-data.txt.

/var/lib/cloud/instances/i-xxxxx/cloud-config.txt
#cloud-config

# from 1 files
# part-001

---
fqdn: centos8-prod-q1-default
groups:
- admin
hostname: centos8-prod-q1-default
manage_etc_hosts: true
phone_home:
    post: []
    tries: 10
    url: http://foreman01.abc.ca/unattended/built
runcmd:
- 'dnf -y install subscription-manager yum-utils

.......

user-data.txt
#cloud-config
hostname: centos8-prod-q1-default  # this is my template VM hostname
fqdn: centos8-prod-q1-default
manage_etc_hosts: true
ssh_pwauth: true
groups:
- admin
users:
- default
- name: admin
  primary-group: admin
  groups: users
  shell: /bin/bash
  sudo: ['ALL=(ALL) ALL']
  lock-passwd: false
  passwd:

catmsred commented 1 month ago

Can you provide the contents of /var/log/cloud-init.log? That will show the cloud-init local process which is where the merged cloud config is created.

Unfortunately, I don't have access to a CentOS 8 environment to try to reproduce this, but if you can (or can't) get the same behavior without using Foreman/Packer, that would help narrow down where the misconfigured userdata is coming from.
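One way to start narrowing that down is to compare the top-level keys of the rendered Foreman template with those in the user-data.txt the instance actually received. Here is a minimal Python sketch, assuming both documents have already been parsed into dictionaries (for example with a YAML parser). The helper name `compare_user_data` is hypothetical. The sample values are abbreviated from the files quoted earlier in this thread: runcmd and phone_home are omitted, and the rendered hostname is assumed.

```python
def compare_user_data(template, received):
    """Compare two parsed cloud-config documents at the top level.

    Returns (added, changed): keys that exist only in the received
    user-data, and keys present in both whose values differ.
    """
    added = sorted(set(received) - set(template))
    changed = sorted(
        key for key in set(template) & set(received)
        if template[key] != received[key]
    )
    return added, changed


# Abbreviated from the "CloudInit default" template (rendered values assumed).
template = {
    "hostname": "centos8-prod-q1-default",
    "fqdn": "centos8-prod-q1-default",
    "manage_etc_hosts": True,
    "users": {},
}

# Abbreviated from the user-data.txt found on the instance.
received = {
    "hostname": "centos8-prod-q1-default",
    "fqdn": "centos8-prod-q1-default",
    "manage_etc_hosts": True,
    "ssh_pwauth": True,
    "groups": ["admin"],
    "users": ["default", {"name": "admin"}],
}

added, changed = compare_user_data(template, received)
print("added:", added)
print("changed:", changed)
```

With these sample values, `ssh_pwauth` and `groups` show up as added and `users` as changed. That would suggest the instance received user-data that the quoted template did not produce.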