cloudbase / windows-imaging-tools

Tools to automate the creation of a Windows image for OpenStack, supporting KVM, Hyper-V, ESXi and more.
Apache License 2.0

Windows 10 Image #336

Closed: dacron closed this issue 4 years ago

dacron commented 4 years ago

This isn't really an issue, but more of a request for guidance. I'm trying to build a Windows 10 Enterprise image, ultimately for use with MAAS, from the following configuration file:

[DEFAULT]
wim_file_path=E:\Sources\install.wim
image_name=Windows 10 Enterprise
image_path=C:\images\my-windows-image.raw.tgz
virtual_disk_format=RAW
image_type=MAAS
disk_layout=BIOS
force=False
install_maas_hooks=True
compression_format=tar.gz
enable_administrator_account=False
shrink_image_to_minimum_size=True
enable_custom_wallpaper=True
disable_first_logon_animation=False
compress_qcow2=False
zero_unused_volume_sectors=False
extra_packages_ignore_errors=False
enable_shutdown_without_logon=False
enable_ping_requests=False
enable_ipv6_eui64=False
enable_active_mode=False
[vm]
administrator_password=<removed>
external_switch=external
cpu_count=4
ram_size=4294967296
disk_size=32212254720
disable_secure_boot=False
[drivers]
drivers_path=C:\cloudbase\drivers
[updates]
install_updates=False
purge_updates=True
clean_updates_offline=False
clean_updates_online=True
[sysprep]
run_sysprep=True
unattend_xml_path=UnattendTemplate.xml
disable_swap=True
persist_drivers_install=False
[cloudbase_init]
beta_release=False
serial_logging_port=COM1
cloudbase_init_use_local_system=False

I've removed blank or "" elements from the configuration for pasting here just to reduce the line count.

The issue I'm having is that the image is produced, imported into MAAS, and deployed to a host. It starts to boot, but never finishes because sysprep fails.

From the above configuration, is there anything obvious that I'm missing? To short-circuit deployment testing (importing into MAAS takes a while), I've been converting the RAW tar.gz disk back to VHDX and launching it in Hyper-V. Same result: sysprep fails.
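For anyone who wants to repeat that local test loop, here is a minimal Python sketch of the unpack-and-convert step. It is not from the thread: it assumes the archive holds the RAW disk as its first regular file, and it assumes `qemu-img` (a separate tool, not part of windows-imaging-tools) is installed for the RAW to VHDX conversion. The function names are hypothetical.

```python
import shutil
import tarfile
from pathlib import Path


def extract_raw_image(archive: Path, dest_dir: Path) -> Path:
    """Extract the first regular file from a .tar.gz archive into dest_dir.

    Assumption: the RAW disk is the first regular file in the archive.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar:
            if member.isfile():
                # Keep only the base name so the file always lands in dest_dir.
                target = dest_dir / Path(member.name).name
                source = tar.extractfile(member)
                with source, open(target, "wb") as out:
                    shutil.copyfileobj(source, out)
                return target
    raise ValueError(f"no regular file found in {archive}")


def build_convert_command(raw_path: Path, vhdx_path: Path) -> list[str]:
    """Build (but do not run) a qemu-img command converting RAW to VHDX."""
    return [
        "qemu-img", "convert",
        "-f", "raw",    # input format
        "-O", "vhdx",   # output format
        str(raw_path), str(vhdx_path),
    ]
```

To run the conversion, pass the result to `subprocess.run(build_convert_command(raw, vhdx), check=True)`; the resulting VHDX can then be attached to a Hyper-V VM.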

dacron commented 4 years ago

I also attach WINDOWS\Panther\unattend.xml from the resulting image below.

<?xml version='1.0' encoding='utf-8'?>
<unattend xmlns="urn:schemas-microsoft-com:unattend">
  <settings pass="generalize" wasPassProcessed="true">
    <component name="Microsoft-Windows-PnpSysprep" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <PersistAllDeviceInstalls>False</PersistAllDeviceInstalls>
    </component>
  </settings>
  <settings pass="oobeSystem">
    <component name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State">
      <OOBE>
        <HideEULAPage>true</HideEULAPage>
        <NetworkLocation>Work</NetworkLocation>
        <ProtectYourPC>1</ProtectYourPC>
        <SkipMachineOOBE>true</SkipMachineOOBE>
        <SkipUserOOBE>true</SkipUserOOBE>
      </OOBE>
    </component>
  </settings>
  <settings pass="specialize">
    <component name="Microsoft-Windows-Deployment" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <RunSynchronous>
        <RunSynchronousCommand wcm:action="add">
          <Order>2</Order>
          <Path>cmd.exe /c ""C:\Program Files\Cloudbase Solutions\Cloudbase-Init\Python\Scripts\cloudbase-init.exe" --config-file "C:\Program Files\Cloudbase Solutions\Cloudbase-Init\conf\cloudbase-init-unattend.conf" &amp;&amp; exit 1 || exit 2"</Path>
          <Description>Run Cloudbase-Init to set the hostname</Description>
          <WillReboot>OnRequest</WillReboot>
        </RunSynchronousCommand>
        <RunSynchronousCommand xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns="">
          <Order>1</Order>
          <Path>"C:\Windows\System32\reg.exe" ADD "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v "PagingFiles" /d "?:\pagefile.sys" /f</Path>
          <Description>Set page file to be automatically managed by the system</Description>
          <WillReboot>Never</WillReboot>
        </RunSynchronousCommand>
      </RunSynchronous>
    </component>
  </settings>
</unattend>

calvinzwu commented 4 years ago

Hi dacron, I think I'm facing the same problem as you. The host boots, but then keeps rebooting, and I get the error message shown in the attached image (IMG_2635).

dacron commented 4 years ago

It looks like you get further than I do. I'm stuck at first boot once the image has been deployed with:

image

dacron commented 4 years ago

@calvinzwu are you able to share your config.ini?

ader1990 commented 4 years ago

@dacron @calvinzwu you are hitting a known issue with cloudbase-init 0.9.11. Please use the latest version (beta). In the config, set:

[cloudbase_init]
beta_release=True

Thank you, Adrian Vladu

ader1990 commented 4 years ago

The fix in cloudbase-init landed a while back, but the stable installer uses older code: https://github.com/cloudbase/cloudbase-init/commit/a8ab3abc569be4c11dd63a9c67e6f0964f53d4cf

ader1990 commented 4 years ago

@dacron for generating win10 enterprise, did you use https://github.com/cloudbase/windows-openstack-imaging-tools/pull/293 ? It has a recent fix that you might have needed.

dacron commented 4 years ago

@ader1990 yes, I'm pulling from master. I have a build going now, but once it has finished I'll try with the beta_release flag set to True and report back. Thanks!

dacron commented 4 years ago

My current build failed with the following message in setuperr.log:

image

Going to try again now with beta_release=True

ader1990 commented 4 years ago

@dacron for future reference, the logs you want are in both the C:\Windows\Panther\UnattendGC\ and C:\Windows\Panther\ folders. Both folders contain setupact/setuperr files.
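As a rough illustration of checking both locations at once, the following Python sketch gathers the non-empty lines of `setuperr.log` from the two folders named above. The helper name is hypothetical, and it assumes the logs can be decoded as UTF-8 (undecodable bytes are replaced rather than raising an error). Point it at the `Windows` directory of a mounted image or a running instance.

```python
from pathlib import Path


def collect_setup_errors(windows_dir: Path) -> dict[str, list[str]]:
    """Return non-empty setuperr.log lines from Panther and Panther\\UnattendGC.

    Folders or logs that do not exist are skipped.
    """
    panther = windows_dir / "Panther"
    results: dict[str, list[str]] = {}
    for folder in (panther, panther / "UnattendGC"):
        log = folder / "setuperr.log"
        if log.is_file():
            # Assumption: UTF-8 is close enough; bad bytes are replaced.
            text = log.read_text(encoding="utf-8", errors="replace")
            results[str(log)] = [line for line in text.splitlines() if line.strip()]
    return results


if __name__ == "__main__":
    for path, lines in collect_setup_errors(Path(r"C:\Windows")).items():
        print(path)
        for line in lines:
            print("   ", line)
```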

dacron commented 4 years ago

@ader1990 still failing with the same error. I attach debug.zip containing config.ini, C:\Windows\Panther (and sub-directories), and C:\Windows\System32\Sysprep (and sub-directories).

debug.zip

ader1990 commented 4 years ago

At this point, I suggest setting disable_swap=False. If it still fails, run the following commands in a cmd window once sysprep has failed:

cmd.exe /c ""C:\Program Files\Cloudbase Solutions\Cloudbase-Init\Python\Scripts\cloudbase-init.exe" --config-file "C:\Program Files\Cloudbase Solutions\Cloudbase-Init\conf\cloudbase-init-unattend.conf""
"C:\Windows\System32\reg.exe" ADD "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v "PagingFiles" /d "?:\pagefile.sys" /f

To open a cmd window after sysprep has failed, press Shift + F10.

ader1990 commented 4 years ago

I have tried to reproduce the issue, but with no success. I used ISO en_windows_10_enterprise_version_1703_updated_march_2017_x64_dvd_10189290.

dacron commented 4 years ago

I think I've figured out what the issue is, but I don't know how to fix it within MAAS or the imaging tools. The issue is that the volume is full, so cloudbase-init.exe fails because there is no disk space.

image

ader1990 commented 4 years ago

@dacron in this case, set:

[default]
# Whether to shrink the image partition and disk after the image generation is complete.
shrink_image_to_minimum_size=False

The image will have the root partition size approximately equal to

[vm]
disk_size=<bytes>

ader1990 commented 4 years ago

If disable_swap=True, this issue should not occur, though, because shrinking leaves a generous buffer and swap is only enabled at the next boot. On instances with a lot of RAM, the swap file eats a lot of disk space at first boot (proportional to the RAM size); that's why there is an option to disable swap and re-enable it after the first boot.

dacron commented 4 years ago

Odd. disable_swap was set to True. I'll try setting the shrink_image_to_minimum_size flag. Is the expected behavior, when deploying via MAAS, that Curtin/Cloud-init consumes the rest of the available disk?

ader1990 commented 4 years ago

The Windows swap usually consumes that space, not curtin/cloudbase-init.

calvinzwu commented 4 years ago

config.zip

@dacron Hi there, sorry for the late reply; we're in different time zones. I believe my installation got as far as yours: this error message appeared just before the Windows installation rebooted (a failed attempt), flashing briefly right before the reboot. After several failed attempts, it showed the message (screenshot) that you posted. Here is my config.ini file. I didn't compress the image and kept it in RAW so I could inspect its contents.

ader1990 commented 4 years ago

@calvinzwu please try the suggestions proposed earlier to @dacron.

calvinzwu commented 4 years ago

@dacron @ader1990 Hey guys, it worked, finally! Here are the changes I made:

1. [DEFAULT] shrink_image_to_minimum_size=False

2. [cloudbase_init] beta_release=True

Thank you very much for all your help! @ader1990 @dacron
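For readers comparing their own config.ini against this working combination, here is a small Python sketch using the standard-library `configparser`. It only reports where a file differs from the two values listed above; it says nothing about what the imaging tools do when a key is absent (unset keys come back as `None`). The function and constant names are hypothetical.

```python
import configparser
from typing import Optional

# The combination reported as working in this thread.
WORKING = {
    "shrink_image_to_minimum_size": False,
    "beta_release": True,
}


def read_thread_settings(config_text: str) -> dict[str, Optional[bool]]:
    """Read the two settings discussed here; None means the key is not set."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return {
        "shrink_image_to_minimum_size": parser.getboolean(
            "DEFAULT", "shrink_image_to_minimum_size", fallback=None
        ),
        "beta_release": parser.getboolean(
            "cloudbase_init", "beta_release", fallback=None
        ),
    }


def differences_from_working(config_text: str) -> dict[str, Optional[bool]]:
    """Return the settings whose current value differs from WORKING."""
    current = read_thread_settings(config_text)
    return {key: value for key, value in current.items() if value != WORKING[key]}
```

Calling `differences_from_working(open("config.ini").read())` on the configuration from the first post would flag both keys; an empty result means the file already matches.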

dacron commented 4 years ago

Works for me too if I set beta_release=True and shrink_image_to_minimum_size=False. Curious to understand why we need to set shrink_image_to_minimum_size to False though. This seems like it could be a bug in Cloudbase-init? I would think that Cloudbase-init ought to add something to the sysprep (as the first command to be run) along the lines of Resize-Partition -DriveLetter C -Size $(Get-PartitionSupportedSize -DriveLetter C).SizeMax which may fix this?

ader1990 commented 4 years ago

The problem is different -- it is recommended to shrink the partition to the minimum size plus a small buffer, so that the images are not too big.

By default, cloudbase-init should run during sysprep and extend the partition (using the plugin ExtendVolumes), doing exactly what you said here: 'Resize-Partition -DriveLetter C -Size $(Get-PartitionSupportedSize -DriveLetter C).SizeMax'

On bare-metal or even virtual environments, the problem that can arise lies in the Windows swap implementation, which creates a file proportional to the RAM size and eats all of the buffer (that's why there is the option to temporarily disable the swap).
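The arithmetic behind that point can be sketched in a few lines of Python. This is an illustration only: the thread says the swap file is proportional to RAM but gives no factor, so the `pagefile_ratio` default of 1.0 is an assumption, as are the free-space figures in the usage note.

```python
GIB = 1024 ** 3


def pagefile_fits(free_bytes: int, ram_bytes: int, pagefile_ratio: float = 1.0) -> bool:
    """Return True if a page file of ram_bytes * pagefile_ratio fits in free_bytes.

    pagefile_ratio is a hypothetical proportionality factor, not a Windows constant.
    """
    return ram_bytes * pagefile_ratio <= free_bytes
```

For example, with the `ram_size=4294967296` (4 GiB) from the first post and a hypothetical 2 GiB buffer left after shrinking, `pagefile_fits(2 * GIB, 4294967296)` is `False`, while a hypothetical 8 GiB of free space would be enough under the same assumption.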

dacron commented 4 years ago

Going to close this issue, as I've been able to get it working with the recommended configuration changes. Thanks.

ryan1336 commented 3 years ago

@dacron Are you still using win10ent with MAAS? I'm struggling to get it to boot as well