Closed vesubramanian closed 1 year ago
You linked my repo - https://gitHub.com/rainpole/packer-vsphere - have you tried the HCL based examples provided?
Ryan
It does however look like the guest OS is failing to install and I'd suggest reviewing the user-data configuration.
I generate mine on demand, but check out https://github.com/rainpole/packer-vsphere/blob/main/builds/linux/ubuntu-server-20-04-lts/http/user-data.pkrtpl.hcl.
Ryan
@tenthirtyam I pointed @vesubramanian to yours and my repo as examples in another case and suggested opening a new one here.
@vesubramanian what is your boot command set to in packer?
Try posting your user-data file here as well, and try to keep the formatting.
I tried the boot commands and user data from all 3 above. Of course, I used the JSON version for ubuntu2004 file instead of .hcl version, since I am more comfortable with json. Will it make a difference? Anyways, I am sharing the builder section and user-data below.
`#cloud-config
autoinstall:
version: 1
locale: en_US
keyboard:
layout: en
variant: us
network:
network:
version: 2
ethernets:
ens192:
dhcp4: true
storage:
layout:
name: lvm
identity:
hostname: my-ubuntu
username: runner
password:
"builders": [
{
"type": "vsphere-iso",
"vcenter_server": "myVCServerName",
"username": "myVCUser@vsphere.local",
"password": "myVCPwd",
"insecure_connection": "true",
"datacenter": "myDC",
"cluster": "myCluster/myCluster",
"datastore": "datastore1",
"guest_os_type": "ubuntu64Guest",
"CPUs": 1,
"RAM": 1024,
"RAM_reserve_all": false,
"disk_controller_type": "pvscsi",
"storage": {
"disk_size": 15000,
"disk_thin_provisioned":true
},
"network_adapters": {
"network": "myNetworkName",
"network_card": "vmxnet3"
},
"vm_name": "my-ubuntu",
"notes": "Built via Packer",
"convert_to_template": true,
"ssh_username": "runner",
"ssh_password": "myPwd",
"ssh_timeout": "20m",
"ssh_handshake_attempts": "100",
"iso_paths": ["[datastore1] ISO/ubuntu-20.04.3-live-server-amd64.iso"],
"cd_files": ["{{template_dir}}/http/user-data", "{{template_dir}}/http/meta-data"],
"cd_label": "cidata",
"boot_wait": "2s",
"boot_command": [
"<enter><wait2><enter><wait><f6><esc><wait>",
" autoinstall<wait2> ds=nocloud;",
"<wait><enter>"
],
"shutdown_command": "echo 'runner'|sudo -S shutdown -P now"
}
]
Also, it will be great, if you can please let me know the commands to generate the encrypted password and also the public SSH key. I am not sure whether I am generating these properly or not.
Password: openssl passwd -6
Public Key: ssh-keygen -t ecdsa -b 521
Just quickly looking it over (I may be wrong, as I can't see the formatting), your user-data has "network: network:" where there should be only one. The network section should look like:
network:
  version: 2
  ethernets:
    ens192:
      dhcp4: true
Also with your choice of using json, you may wish to change that to HCL, HCL is the preferred language from 1.7.
Password:
openssl passwd -6
Public Key:
ssh-keygen -t ecdsa -b 521
Thank you very much. Will try this. In the above command for Password, "passwd" is the sample password, right?
@vesubramanian "passwd" isn't the sample password; it's the command to be executed.
openssl passwd -6 YOUR_PASSWORD
Thanks a lot for your response. Will check this. However, the JSON that I am trying to use is from a different github repository, which has a big provisioner section already defined in json. I am not sure how to convert this to hcl.
To convert from json to HCL: https://learn.hashicorp.com/tutorials/packer/hcl2-upgrade?in=packer/configuration-language
I am actually trying to build a self hosted github linux runner, following the below link. However, they are using packer and building Azure Image. In my case, I am trying to build a vSphere-ISO. They are also using json. https://github.com/actions/virtual-environments/blob/main/images/linux/ubuntu2004.json
So, I am using that same json with the same provisioners section and changing only builders section. Is there any issue using json? Please don't misunderstand. I am just curious.
Fixed it, but no difference.
Is there any issue using json? Please don't misunderstand. I am just curious.
There is no issue. It's just that if you are starting now and do not have a load of code already done, HCL is probably where you should start, as JSON is no longer the preferred or actively developed language for Packer. But if you just want to get something working, and know that it may have limitations in the future that could require a rewrite, then OK; it's just something to keep in mind.
Now with the problem.
I haven't had a chance to test yet, but what would make it easier is if you could put what you are using in your github and then we can test it out.
Let me explain all the details. I have an Ubuntu 20.04.3 VM. In that I am performing the following steps.
Login as root and create a user called runner
useradd -m -p $(openssl passwd -crypt myPassword) runner && sudo usermod -a -G sudo runner && su - runner
exec bash
sudo apt update
sudo apt -y install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
sudo apt update
sudo apt upgrade
sudo apt install packer
wget -q https://packages.microsoft.com/config/ubuntu/18.04/packages-microsoft-prod.deb
sudo dpkg -i packages-microsoft-prod.deb
sudo apt-get update
sudo add-apt-repository universe
sudo apt-get install -y powershell
pwsh
Install-Module -Name Az -AllowClobber -Scope CurrentUser
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
sudo apt install git
git clone https://github.com/actions/virtual-environments.git
chmod 777 -R virtual-environments
On my Windows VM, I modify the ubuntu2004.json and created a folder called http. Inside http folder, created an empty meta-data file and user-data file. Updated the user-data file on Windows VM. (will provide the content of ubuntu2004.json and user-data file below). Path of ubuntu2004.json on GitHub is https://github.com/actions/virtual-environments/blob/main/images/linux/ubuntu2004.json
Copied the locally modified files from Windows VM to the Ubuntu server using the below commands.
pscp .\ubuntu2004-updated.json runner@IP-Address:/home/runner/virtual-environments/images/linux/ubuntu2004-updated.json
pscp -r .\http runner@IP-Address:/home/runner/virtual-environments/images/linux
Then I run the following command.
sudo packer build -force virtual-environments/images/linux/ubuntu2004-updated.json
ubuntu2004-updated.json.txt user-data.txt
Note: I have changed the extensions to txt to allow upload. Please change it at your end.
Hope this clarifies. This is my requirement. GitHub is using Packer + Azure DevOps to build Azure Image. In my case, I need to build vSphere-ISO using packer. Please help.
The first thing you need to do is change this in your user-data:
network:
  version: 2
  ethernets:
    ens33:
      dhcp4: true
to
network:
  version: 2
  ethernets:
    ens192:
      dhcp4: true
This will allow the network setup to complete and SSH to work. I haven't checked the provisioner section, so there may be more. See if that works.
Thank you. Tried it, but still the same error.
Although the packer output waits for SSH to become available, when I look at the Web Console of vSphere, the error (also seen in the above screen shot) is "command ['systemd-cat', '--level-prefix=false', '--identifier=subiquity_log.2014', '.snap/subiquity/2651/usr/bin/python3', '-m', 'curtin', '--showtrace', '-c', '/var/log/installer/subiquity-curtin-install.conf', 'install'] returned non-zero exit status 3.". Not sure what is causing this or how to fix this.
When I tested, I removed the provisioners, changed the networking to the above, filled in my own details, removed the datacenter value, and cleaned up the indents. It then installed Ubuntu, set up the network, installed the SSH server and open-vm-tools, ran the post commands in the cloud-init config, and shut down the server.
Before I changed the network config, it was stuck waiting for SSH, as it would be: interface ens33 didn't exist on an ESXi VM; it's ens192 for the first one.
Remove the provisioners and see if it then works. If not, when I get home, I will add a repo with what worked for me. You can test it; if that doesn't work either, there is something else that is a problem, not related to Packer or the cloud-init config.
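For anyone following along with a legacy JSON template, here is a minimal sketch of one way to produce a builders-only copy for this kind of test. It assumes the template is plain JSON with an optional top-level "provisioners" key; the function name and the sample content are hypothetical and not taken from anyone's repo in this thread.

```python
import json


def builders_only(template_text: str) -> str:
    """Return a copy of a JSON Packer template without its provisioners.

    Everything else in the template (builders, variables, and so on)
    is kept unchanged.
    """
    template = json.loads(template_text)
    # pop() with a default is a no-op when the key is absent.
    template.pop("provisioners", None)
    return json.dumps(template, indent=2)


if __name__ == "__main__":
    # Tiny stand-in for a real template (hypothetical content).
    sample = """
    {
      "builders": [{"type": "vsphere-iso", "vm_name": "my-ubuntu"}],
      "provisioners": [{"type": "shell", "inline": ["echo hello"]}]
    }
    """
    print(builders_only(sample))
```

Writing the result to a separate file and building from that keeps the original template untouched while the OS install is being debugged.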
Should I remove the provisioners section itself? Is it not mandatory? Or I should have empty provisioners section? Can you please explain more on "cleaned up indents"? I didn't get it. How to check what ens to use?
Meanwhile, I will try and let you know.
You can remove it all; it's not needed (for testing). The builders section is all that is needed to test the install / creation of the OS. As for cleaning up the indents (I did it in the packer file): I made the indents all tabs instead of some spaces and some tabs, so the indentation is consistent.
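A rough sketch of how mixed indentation could be spotted before a build. The helper names are made up for illustration. Note that JSON itself does not care about indentation, so for the packer file this is cosmetic; YAML, on the other hand, does not allow tabs for indentation, so the same check is more meaningful on a user-data file.

```python
def indent_report(text: str) -> dict:
    """Group 1-based line numbers by the kind of leading whitespace used."""
    report = {"tabs": [], "spaces": [], "both": []}
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading run of spaces and tabs on this line.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if not indent:
            continue
        has_tab = "\t" in indent
        has_space = " " in indent
        if has_tab and has_space:
            report["both"].append(number)
        elif has_tab:
            report["tabs"].append(number)
        else:
            report["spaces"].append(number)
    return report


def is_consistent(report: dict) -> bool:
    """True when the file indents with tabs only, spaces only, or not at all."""
    used = [kind for kind, lines in report.items() if lines]
    return used in ([], ["tabs"], ["spaces"])


if __name__ == "__main__":
    sample = '{\n\t"a": 1,\n  "b": 2\n}'
    report = indent_report(sample)
    print(report, is_consistent(report))
```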
The ens value: all you need to know is that it's 192 for VMware. That value is based on where the card is on the PCI(e) bus, port number, etc.; it's the "predictable" (deterministic) naming scheme. It can be changed back to the old naming scheme so it's eth0, but that can only be done after the install, in grub (unless someone knows how to do it beforehand).
Edit: I see that rainpole/packer-vsphere uses ens33. I wonder, has that changed for ESXi 7, @tenthirtyam? Or it may be something else: 33 is a PCI location and 192 is a PCIe location (they start at 160, I believe), so maybe ens33 is for the e1000 and ens192 is for vmxnet3.
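On the "how to check what ens to use" question, one option is to look at the interface names on an already-installed Linux VM that uses the same virtual hardware (same adapter type on the same vSphere). The sketch below only lists what the OS it runs on exposes; that such a comparable VM exists is an assumption, and the function name is hypothetical.

```python
import socket


def interface_names() -> list:
    """Return the names of the network interfaces the OS currently exposes."""
    # socket.if_nameindex() yields (index, name) pairs.
    return sorted(name for _index, name in socket.if_nameindex())


if __name__ == "__main__":
    for name in interface_names():
        print(name)
```

Whatever name shows up there besides the loopback interface is the one the ethernets section of the user-data would need to match.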
Thank you once again. I tried both (with empty provisioners section and no provisioners section). Getting the same error. Removed the Datacenter value also. I tried with ens192 as well. No change. Also, is the subiquity error caused by any of these, or by something else?
As a test, just to see if there is something else in your setup causing it: have you tried one of the deploys that I have, just to see if it works?
Sorry for asking. Did you already share that with me? Not sure if I missed it. If you don't mind, can you please share it again?
Thank you. I believe I need to use the file at https://github.com/dbond007/Packer/blob/master/ubuntu_base/variables.20.04.pkr.hcl
I won't need any user-data?
The user-data is seen in the etc/http directory.
See: https://github.com/dbond007/Packer/blob/master/ubuntu_base/etc/http/user-data
Thank you. Please excuse my ignorance. I copied the pkr.hcl file into the directory and also the user-data files into the appropriate directory. However, when I run the below command, it is not working. It keeps prompting with options. Not sure what I am doing wrong.
packer build -var-file=variables.20.04.pkr.hcl
Since I am running with a non root user, I even tried the above command with sudo, but no luck.
Clone the repo.
In the ubuntu_base directory, rename variables.pkrvars.hcl.example to variables.pkrvars.hcl.
In variables.pkrvars.hcl, fill in your correct details (leave the SSH details, as that's what is in the user-data).
Then, in the ubuntu_base directory, run
packer build -only="*20.04*" -var-file=variables.pkrvars.hcl .
The -only="*20.04*" will make it build only Ubuntu 20.04. It will download the ISO from Ubuntu automatically. It should take around 7 minutes (depending on the server it's being built on) to complete.
Couple of questions
However, after following all the steps and running the packer build command, I got this error.
All variable descriptions are in variables.common.pkr.hcl. But for specific answers to the above: "VCenter" is your vCenter server, its IP address or FQDN. "VCFolder" is the folder in vCenter that you want the template to be stored in. You can place them in different places; I put them in one called templates, but you could probably just use "" if you do not want to put it in a specific folder (not tested). Zone and environment you can leave; they haven't been implemented yet, as my feature request hasn't been implemented. Everything that is in that file needs some kind of value.
The last error is what it asked for. I didn't specify it, as I forgot that I had added it for another test I did for someone else's problem. Run packer init . and it will then download the newer vSphere Packer plugins. If you don't want to do that, just delete packer.plugins.pkr.hcl and it will use what's built into Packer.
sudo shouldn't be needed to run any of this; it isn't making any changes to your system.
you need to change these values to fit your environment:
VCenter = "10.0.0.151" : your vcenter server ip or FQDN
VCCluster = "Cluster1" : your cluster name
VCUser = "administrator@vsphere.local" : your vcenter user
VCPassword = "N0tS3cUr3" : your vcenter user password
VCDataStore = "NFS01" : the datastore where you want the VM to be put
VCNetwork = "dv-LAN" : the network in vcenter you want the VM on
VCFolder = "templates" : what folder to put the VM / template in
Thank you very much once again. Followed all your steps. Looks like Zone is required. So, I left the last 2 variables and values, as is. Cloned the code on my Ubuntu server on to the non root user's home directory. Gave full permissions to the cloned folder. Modified the user-data (for the user name and encrypted password) and variables hcl files locally in my Windows VM and copied these 2 files to the appropriate locations inside the Ubuntu server using pscp command. Then ran the packer init and then packer build. Ran into the same issue (as shown / mentioned above). Gets stuck at "Waiting for SSH to become available". In the web console, I could see the same issue. "command ['systemd-cat', '--level-prefix=false', '--identifier=subiquity_log.2096', '.snap/subiquity/2651/usr/bin/python3', '-m', 'curtin', '--showtrace', '-c', '/var/log/installer/subiquity-curtin-install.conf', 'install'] returned non-zero exit status 3."
Does it have anything to do with any setting in vSphere or is it something else? Please help.
If you could run it without changing anything other than the above variables to ensure no other changes that you may do are causing that.
Reverted the user-data file to match yours. However, in packer variables file, I have a couple of questions. The SSH User can be any user or only root? VSphere user is also a non-admin user, but has sufficient privileges to create VM. Is this fine?
Ok. Just finished running, following your advice. Still the same error.
The SSH user is the user you create in the user-data file, it should have root / sudo permissions if you want to do anything with it. The user in vsphere will not be causing this, you would get other errors with packer if it couldn't do what was needed. This is something specific to the installation of Ubuntu in your environment.
You may want to look here, nothing at the moment, but you appear to be having the same problem as here: https://discourse.ubuntu.com/t/issue-with-curtin-while-trying-to-autoinstall-ubuntu-20-04-focal/20046
Thank you once again for your prompt response. I checked that already and I have asked for help (see the last one from venh123). Meanwhile, I have a question. The SSH user is created and given sudo permissions by the script, during execution of packer build? If not, which user and password to use?
I tried this today from Windows VM also (for your repository), but I am getting the same error. I have a few more questions.
I am not a Network or OS expert. I somehow managed to convince them to create a user for me for this. Hence, I don't have an admin user for vSphere and don't have complete access.
@SwampDragons Can you please help?
1.There isn't really a minimum version of vsphere for this to work, installing the OS. There is a minimum version for the customisation by vmware when deploying from the template.
To completely exclude Packer, limiting it to vSphere and Ubuntu, create an ISO with the cloud-init files on it, mount it together with the OS ISO in a new VM on VMware, and do the same thing you got Packer to do so that it auto-installs. It will likely fail again in exactly the same way. If it does, it is a bug in Ubuntu being caused by something, and to find out more you would need to look in the log files. If it works, there is something happening with Packer in your environment that we can't replicate.
Thank you. I am not sure how to export the logs into my Windows VM so that I can look into it. I use Web Console. I have one more question. In some user-data files, I have seen the following line.
echo 'ubuntu ALL=(ALL) NOPASSWD:ALL' > /target/etc/sudoers.d/ubuntu
In the above line, "ubuntu" in both places is the user name?
Yes it is the username. Giving it sudo, with no password needed.
Thank you. I have seen many blogs/articles now. Every one is using their own boot command and user-data file. Not sure why these are different for everyone and not consistent.
They are different because there are different ways to do the same thing. The user-data file can do a lot of things, and again you can do the same thing in many ways. You could use it to do everything if you wanted instead of using the provisioners in Packer. It's down to preference, workflows and knowledge.
The way I have done mine is based upon trial and error when I started as I couldn't find any examples that worked reliably and did what I wanted, so with what I learnt I ended up with what I have. There will be other ways, probably better, but I didn't know at the time.
Ok. But what I am wondering is that there is no standard documentation. People can always customize according to their needs & convenience, but there should have been some basic documentation for people who just want to go with the basic settings and not tweak anything since they may not have expertise on Linux / Networking.
Anyway, can you please advise on how to get the log from the Web Console to my local Windows machine?
Also, how can I ensure whether my user-data is getting called or not. I saw somewhere that if the user-data has issues, then also, this error occurs.
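A minimal sketch of a pre-flight check for the user-data file itself. It does not prove that the installer picked the file up, and it is not a YAML parser; it only looks for three things that are cheap to verify with the standard library: the #cloud-config header on the first line, a top-level autoinstall: key, and tabs used for indentation (which YAML does not allow). The function name is hypothetical.

```python
def check_user_data(text: str) -> list:
    """Return a list of human-readable problems found in a user-data file."""
    problems = []
    lines = text.splitlines()
    if not lines or lines[0].strip() != "#cloud-config":
        problems.append("first line is not '#cloud-config'")
    # A top-level key starts in column 0 and has nothing after the colon.
    if not any(line.rstrip() == "autoinstall:" for line in lines):
        problems.append("no top-level 'autoinstall:' key")
    for number, line in enumerate(lines, start=1):
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            problems.append(f"line {number}: tab used for indentation")
    return problems


if __name__ == "__main__":
    sample = "#cloud-config\nautoinstall:\n  version: 1\n"
    print(check_user_data(sample) or "no obvious problems")
```

An empty result only means these basic checks passed; the installer can still reject the file for other reasons.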
I was able to figure out how to extract the crash report into my Windows VM. Also, I ran an interactive install. It was failing at "curtin command in-target" installation, although I am not sure about the reason. When I searched for it, I saw an article which suggested to disable network adapter before installation and re-enable it after installation. What if there is more than one? How to disable/enable all at once? Can this be done via user-data? Can you please help with this?
you should put the logs somewhere so I / we can look at them.
With regards to the disabling the network, you could try
sudo nmcli networking off
in the early commands
and
sudo nmcli networking on
in the late.
no idea if it will work.
But if it does work, it will also mean that the install will not update and will not install open-vm-tools or openssh, as the repos will not be available. So they will need to be added to the late commands after turning the network back on.
You should test whether disabling the network will work: just disable it in VMware. If it installs after that, then that is the problem for some reason; if not, then this is unlikely to help.
Thank you once again. I tried with simply "sudo ip link set ens192 down" in early commands and "sudo ip link set ens192 up" in late commands. Although it helped in the interactive session with no other detail in the user-data, when I updated the user-data with more detail and removed the interactive, I started encountering the same error. Anyways, I am attaching the crash log. I tried going through it, but couldn't understand much. Hope you find something from it. However, when I searched for exception, I found the following
curtin: Installation failed with exception: Unexpected error while running command. Command: ['/snap/subiquity/2651/bin/subiquity-configure-apt', '/snap/subiquity/2651/usr/bin/python3', 'true'] Exit code: 100 Reason: - Stdout: + '[' -z /target ']'
- PY=/snap/subiquity/2651/usr/bin/python3
- HAS_NETWORK=true
- /snap/subiquity/2651/usr/bin/python3 -m curtin apt-config
finish: cmd-install/stage-curthooks/001-configure-apt/cmd-in-target: FAIL: curtin command in-target curtin: Installation failed with exception: Unexpected error while running command. Command: ['/snap/subiquity/2651/bin/subiquity-configure-apt', '/snap/subiquity/2651/usr/bin/python3', 'true'] Exit code: 100 Reason: - Stdout: + '[' -z /target ']'
- PY=/snap/subiquity/2651/usr/bin/python3
- HAS_NETWORK=true
- /snap/subiquity/2651/usr/bin/python3 -m curtin apt-config
ERROR root:39 finish: subiquity/Install/install: FAIL: Command '['systemd-cat', '--level-prefix=false', '--identifier=subiquity_log.2090', '/snap/subiquity/2651/usr/bin/python3', '-m', 'curtin', '--showtrace', '-c', '/var/log/installer/subiquity-curtin-install.conf', 'install']' returned non-zero exit status 3.
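Since the crash log is long, here is a rough sketch of pulling out just the failing stages and exit codes. The two patterns are modelled on the lines quoted above ("finish: <stage>: FAIL: ..." and "Exit code: N"); whether every subiquity/curtin log follows that shape is an assumption, and the function name is hypothetical.

```python
import re

# Patterns modelled on the log lines quoted in this thread.
FAIL_RE = re.compile(r"finish: (?P<stage>\S+): FAIL: (?P<detail>.*)")
EXIT_RE = re.compile(r"Exit code: (?P<code>\d+)")


def summarize_failures(log_text: str):
    """Return ([(stage, detail), ...], [exit_code, ...]) found in the log."""
    stages = [
        (match["stage"], match["detail"].strip())
        for match in FAIL_RE.finditer(log_text)
    ]
    codes = [int(match["code"]) for match in EXIT_RE.finditer(log_text)]
    return stages, codes


if __name__ == "__main__":
    sample = (
        "finish: cmd-install/stage-curthooks/001-configure-apt/cmd-in-target: "
        "FAIL: curtin command in-target\n"
        "Command: ['/snap/subiquity/2651/bin/subiquity-configure-apt', "
        "'/snap/subiquity/2651/usr/bin/python3', 'true']\n"
        "Exit code: 100\n"
    )
    stages, codes = summarize_failures(sample)
    for stage, detail in stages:
        print(stage, "->", detail)
    print("exit codes:", codes)
```

In the excerpt above, the first stage reported as failing is 001-configure-apt with exit code 100. apt-get documents 100 as its error exit code, so the apt configuration step is a plausible place to look; that is an inference from the log, not something confirmed in this thread.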
Just quickly looking over, may be wrong (cant see the formatting), but your user-data has "network: network:" (should be 1) The network section should be like:
network: version: 2 ethernets: ens192: dhcp4: true
Also with your choice of using json, you may wish to change that to HCL, HCL is the preferred language from 1.7.
If you check the network section in the documentation from ubuntu, using 2 network attributes is correct.
Ubuntu Version : 20.04.3 (Focal) Packer Version: 1.7.4 vSphere Client Version: 6.7.0.48000 Builder Type: vSphere-iso
I have tried so many options and articles, but I couldn't build an ISO for Ubuntu 20.04.3. I get stuck at "Waiting for SSH server to become available". It will be great, if I can get a working example. Especially, I need the builder section inside ubuntu2004.json and the content of the user-data file which will be inside the http directory. It would be really nice if there is a step-by-step example, as I am not at all knowledgeable on Linux. Please help. I am also tagging @dbond007.
PFB a few links I followed, but didn't have any luck. https://github.com/dbond007/Packer/tree/master/ubuntu_base https://github.com/rainpole/packer-vsphere https://virtjo.com/2020/build-ubuntu-vm-with-packer-on-vsphere/
Upon monitoring the console, I had some observations. I am attaching some screen shots for reference.