Closed hidrol closed 4 years ago
Hi @hidrol , did your set-up work correctly after running the script (without any further changes)?
I would like to determine whether the pre-compiled acrn.efi binary is at fault or whether the problem potentially lies with the version you compiled yourself.
@gvancuts After I ran the script I only checked the bootorder via efibootmgr -v and everything seemed fine, then I rebooted. I will try again to compile with Clear Linux Version 33050
Thanks for the confirmation! Do I understand you correctly that you did not update ACRN at all manually after running the script? So your installation never booted under ACRN successfully?
I ran the script twice in native Clear Linux. After the first run I copied the compiled acrn.efi file via $ sudo cp ~/acrn-hypervisor/build/hypervisor/acrn.efi /usr/lib/acrn/acrn.nuc6cayb.industry.efi and then ran the script again.
Did you reboot in between? I would like to understand if the set-up was ever functional at some point.
One potential issue with what you did is that acrn.efi is now out of sync with the user-space tools (acrn-dm), and that can be problematic.
Just for clarity: the kernel error messages you see, are these from the Service VM (SOS) kernel, or the kernel from the User VM (UOS) you are attempting to launch?
No, I didn't reboot in between. The kernel error messages are from the Service VM kernel.
OK, can you try with the pre-compiled acrn.efi? I would like to make sure we can correctly boot under ACRN using the precompiled binaries. Once that works, we can look at upgrading acrn.efi and its user-space components (e.g. acrn-dm).
The /usr/lib/acrn folder didn't contain a precompiled binary named acrn.nuc6cayb.industry.efi for the nuc6cayh. That's why I followed the 'Build ACRN from Source' guide in order to compile the acrn.efi file for the nuc6cayh.
You're right, I forgot that this was the case... d'oh!
Could you update Clear Linux? This will update the Service VM kernel (SOS) and bring us closer to a combination that has been tested.
Is there a specific version you would recommend? Or just the newest?
I tested with version 33050, and got the same result.
Thanks @hidrol , this is the Clear Linux version that contains ACRN v1.6.1 (the latest as of today)... let us try to test this on our side to reproduce the failure.
@ttzeng , is this something you could take a look at by any chance?
My first goal is to run 2 User VMs in parallel alongside the Service VM. I understood that the sdc scenario can only run one User VM and one Service VM. Is that correct?
Yes, the scenario you should pick depends a bit on the type of User VM you want to start. Our documentation is not fully up-to-date on this unfortunately, but take a look here for a description of the various scenarios. And do let us know if some of the terms used do not make sense to you.
@gvancuts thank you very much for your answer. We currently have the NUC6CAYH with 8GB RAM and a 240GB SSD. We would like to gain some experience with the ACRN Hypervisor for future industrial projects. Of the scenarios described at the link you sent us, the hybrid and industry scenarios are the most interesting for us. Unfortunately we can't run a pre-launched VM because our NUC6CAYH has no serial port, so the hybrid scenario will most likely be dropped. That leaves the industry scenario as the only option for us.
But now we are having problems getting this scenario running.
During the configuration I followed these guides: Build ACRN from Source, and Getting Started Guide for ACRN Industry Scenario.
Since the acrn.efi file for our hardware is not available in the "service-os" bundle, I compiled the acrn.efi file from the XML files. Our NUC has 8GB RAM, so I changed the RAM sizes (physical and service RAM) in industry.xml to 0x200000000. I did the same for the sdc scenario, with no problems there.
Is there something I could try to avoid issue #4820? I would like to try the "out of box method".
@hidrol Why do you need to change the RAM size? The default ACRN setting supports up to 16G of memory (anything beyond this value is ignored). Could you please try again without changing the RAM size? Thanks!
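As a quick sanity check on the value discussed above, the hex size can be converted to GiB with plain shell arithmetic. This is only an illustrative sketch; hex_to_gib is a hypothetical helper name and is not part of ACRN or its tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ACRN): convert a byte count, given in
# hex or decimal, to whole GiB using shell integer arithmetic.
hex_to_gib() {
    echo $(( $1 / (1024 * 1024 * 1024) ))
}

# The value written into industry.xml in this thread; prints 8.
hex_to_gib 0x200000000
```

So 0x200000000 corresponds to 8 GiB, which matches the installed RAM and sits below the 16G default limit mentioned above.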
@hidrol yes, there is no pre-compiled binary (acrn.nuc6cayb.industry.efi) for the industry scenario on APL NUC; we will try it soon and keep you updated. @fuzhongl
@fuzhongl @terryzouhao I tried again with default RAM settings, unfortunately I got the same result as before. I attached a bash script with the commands I used for the installation, so maybe you can check it to see if there is something wrong. The installation is only successful if I run the script in sdc mode (with -s as argument instead of -i). clear_acrn_install.sh.tar.gz
@hidrol Which ACRN branch/tag are you using? Could you please share the kernel cmdline? It is on the EFI partition under loader/entries/*.conf
Thanks!
Hi @hidrol we just verified v1.6.1 with APL NUC, it can boot CL as SOS successfully: commit 3c64d59a18240637153942df5f647fed85ea69ac (HEAD -> release_1.6, tag: v1.6.1, origin/release_1.6), detailed step for your reference:
But if you are using the latest master branch, there is a similar known #gitissue4822 (a memory address conflict between multiple VMs and the HV log). The current workaround is to change the hvlog address to 0xE00000 in the SOS default cmdline:
cat /mnt/loader/entries/Clear-linux-iot-lts2018-sos-4.19.120-108.conf
title Clear Linux OS
linux /EFI/org.clearlinux/kernel-org.clearlinux.iot-lts2018-sos.4.19.120-108
initrd /EFI/org.clearlinux/freestanding-00-intel-ucode.cpio
initrd /EFI/org.clearlinux/freestanding-i915-firmware.cpio.xz
options root=PARTUUID=6e240165-00e0-425e-b320-2af7866ea8e6 quiet console=tty0 console=ttyS0,115200n8 consoleblank=0 cryptomgr.notests hvlog=2M@0x1FE00000 change to-> hvlog=2M@0xE00000 i915.avail_planes_per_pipe=0x01010F i915.domain_plane_owners=0x011111110000 i915.enable_guc=0 i915.enable_gvt=1 i915.nuclear_pageflip=1 ignore_loglevel init=/usr/lib/systemd/systemd-bootchart intel_iommu=igfx_off memmap=2M$0x1FE00000 no_timer_check no_timer_check noreplace-smp rcu_nocbs=0-64 rcupdate.rcu_expedited=1 rootfstype=ext4,btrfs,xfs rootwait tsc=reliable rw
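The hvlog workaround described above could be applied with a one-line substitution instead of editing the entry by hand. The following is a minimal sketch, assuming a sed that accepts -i with a backup suffix; patch_hvlog is a hypothetical helper name. It only touches the hvlog= token, because that is the only change the thread mentions; the memmap= option is left as it is.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the hvlog address in a loader entry file.
# Only the hvlog= token is changed; the original file is kept as <file>.bak.
patch_hvlog() {
    local conf="$1"
    sed -i.bak 's/hvlog=2M@0x1FE00000/hvlog=2M@0xE00000/' "$conf"
}
```

With the EFI partition mounted at /mnt as in the example above, this would be called as root with the path of the SOS entry under /mnt/loader/entries/.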
@terryzouhao Thanks to your help we were now able to get the SOS in industry scenario running. So the next step is to get both User VMs running with Clear Linux. Unfortunately I'm kinda stuck at getting the first User VM running with the launch_uos_id3.sh script. I opened a new Issue under
@hidrol, good to see you can already boot CL as SOS successfully. So now the first guest VM fails to boot? And you are using v1.6.1 with CL 33050 and the sample VM launch script, right? Actually, the launch_uos_id3.sh script is a sample reference script for multiple guest VMs; we recommend using the config tool to generate customized scenarios. We have already tried 2 CL VMs on APL NUC with mainline and will share the details with you in #4855 later.
@terryzouhao I generated the launch_uos_id3.sh script with:
$ cd acrn-hypervisor
$ export board_file=$PWD/misc/acrn-config/xmls/board-xmls/nuc6cayh.xml
$ export scenario_file=$PWD/misc/acrn-config/xmls/config-xmls/nuc6cayh/industry.xml
$ export launch_file=$PWD/misc/acrn-config/xmls/config-xmls/nuc6cayh/industry_launch_6uos.xml
$ python misc/acrn-config/launch_config/launch_cfg_gen.py --board $board_file --scenario $scenario_file --launch $launch_file --uosid 0
Is this the right approach?
@hidrol Could you please confirm which branch you are using: master or v1.6 (with tag v1.6.1)?
It is correct for the master branch, but there is a bug with the UUID; as a workaround, please remove it from the launch script. That means only one UOS. This issue will be fixed in branch v2.0.
For v1.6, the right one is
export launch_file=$PWD/misc/acrn-config/xmls/config-xmls/nuc6cayh/industry_launch_2uos.xml
@fuzhongl I'm using master branch. Did I understand you right that currently it is not possible to run two UOS with Clear Linux? And which branch should I use then?
@hidrol The preferred branch is v1.6
You can use the following .xml for two UOS.
export launch_file=$PWD/misc/acrn-config/xmls/config-xmls/nuc6cayh/industry_launch_2uos.xml
Beyond two UOS, there is a bug for NUC6CAYH with the master branch.
@fuzhongl @terryzouhao With the two generated scripts launch_uos_id1.sh and launch_uos_id2.sh I can only launch Windows and Preempt-RT Linux, is that right? So running two Clear Linux guest VMs isn't possible? Also, doesn't launch_uos_id2.sh use NVMe passthrough? My NUC doesn't have a second drive.
@hidrol
launch_uos_id1.sh
Just remove --windows \
and change to the path of your image:
-s 3,virtio-blk,./win10-ltsc.img \
launch_uos_id2.sh
Please refer to the sample launch script:
https://github.com/projectacrn/acrn-hypervisor/blob/master/devicemodel/samples/nuc/launch_uos.sh
You can replace the NVMe passthrough with virtio-blk;
change it to the path of your .img:
-s 3,virtio-blk,/home/clear/uos/uos.img \
and remove
--lapic_pt \
--rtvm \
BTW: if you don't need a UI for the UOS, please remove -s 2,pci-gvt -G "$2" \ from the launch script.
Thanks!
@terryzouhao @fuzhongl Thanks for your help so far. I was able to get Clear Linux running with launch_uos_id2.sh, but launch_uos_id1.sh didn't work. Did I understand you correctly that I can run Clear Linux with launch_uos_id1.sh? My launch_uos_id1.sh looks like this:
[...]
mem_size=4096M
#interrupt storm monitor for pass-through devices, params order:
#threshold/s,probe-period(s),intr-inject-delay-time(ms),delay-duration(ms)
intr_storm_monitor="--intr_monitor 10000,10,1,100"
#logger_setting, format: logger_name,level; like following
logger_setting="--logger_setting console,level=4;kmsg,level=3;disk,level=5"
acrn-dm -A -m $mem_size -s 0:0,hostbridge -U d2795438-25d6-11e8-864e-cb7a18b34643 \
$logger_setting \
-s 3,virtio-blk,./clearlinux2.img \
-s 7,virtio-net,tap_WaaG \
-s 2,passthru,0/2/0,gpu \
--ovmf /usr/share/acrn/bios/OVMF.fd \
$intr_storm_monitor \
-s 1:0,lpc \
-l com1,stdio \
$boot_audio_option \
$vm_name
}
launch_windows 1
@hidrol Any error log? Could you please try the sample script? https://github.com/projectacrn/acrn-hypervisor/blob/master/devicemodel/samples/nuc/launch_uos.sh
Thanks!
@fuzhongl The sample script did work, and launch_uos_id2.sh also worked. Running two Clear Linux User VMs in parallel did not work, though I'm still not sure whether that is actually possible with the launch file industry_launch_2uos.xml I used.
@hidrol Need to check the UUID used in your launch script. Could you please share your launch script? Thanks!
@fuzhongl I tested the same script with several UUIDs:
495ae2e5-2603-4d64-af76-d4bc5a8ec0e5: worked
d2795438-25d6-11e8-864e-cb7a18b34643: failed
615db82a-e189-4b4f-8dbb-d321343e4ab3: failed
Is this the information you need?
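Since the results above differ only by UUID, it can help to list which UUID each launch script actually passes to acrn-dm before comparing them. The following is a minimal sketch; list_launch_uuids is a hypothetical helper name, and it assumes the UUID is given on the command line as -U followed by the usual 36-character form, as in the scripts quoted in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every UUID passed to acrn-dm via -U in the
# given launch scripts, one "file: uuid" pair per line (UUID lowercased).
list_launch_uuids() {
    local f
    for f in "$@"; do
        grep -oE -- '-U[[:space:]]+[0-9a-fA-F-]{36}' "$f" |
            awk -v file="$f" '{ print file ": " tolower($2) }'
    done
}
```

For example, list_launch_uuids launch_uos_id1.sh launch_uos_id2.sh prints one line per -U option found; a script without any -U option produces no output.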
@hidrol Please also share parameters of acrn-dm. Which UUID is used in your sample script? Thanks!
@fuzhongl The sample script found in /usr/share/acrn/samples/nuc/ doesn't have any UUID. The sample script does work. The acrn-dm section of launch_uos.sh looks like this:
[...]
acrn-dm -A -m $mem_size -s 0:0,hostbridge \
-s 2,pci-gvt -G "$2" \
-s 5,virtio-console,@stdio:stdio_port \
-s 6,virtio-hyper_dmabuf \
-s 3,virtio-blk,/home/clear/uos/uos.img \
-s 4,virtio-net,tap0 \
-s 7,virtio-rnd \
--ovmf /usr/share/acrn/bios/OVMF.fd \
$pm_channel $pm_by_vuart $pm_vuart_node \
$logger_setting \
--mac_seed $mac_seed \
$vm_name
}
[...]
My launch_uos_id1.sh script looks like this:
#!/bin/bash
# board: NUC6CAYH, scenario: INDUSTRY, uos: WINDOWS
# pci devices for passthru
declare -A passthru_vpid
declare -A passthru_bdf
passthru_vpid=(
["audio"]="8086 5a98"
["gpu"]="8086 5a85"
)
passthru_bdf=(
["audio"]="0000:00:0e.0"
["gpu"]="0000:00:02.0"
)
function tap_net() {
    # create a unique tap device for each VM
    tap=$1
    tap_exist=$(ip a | grep "$tap" | awk '{print $1}')
    if [ "$tap_exist"x != "x" ]; then
        echo "tap device existed, reuse $tap"
    else
        ip tuntap add dev $tap mode tap
    fi
    # if acrn-br0 exists, add VM's unique tap device under it
    br_exist=$(ip a | grep acrn-br0 | awk '{print $1}')
    if [ "$br_exist"x != "x" -a "$tap_exist"x = "x" ]; then
        echo "acrn-br0 bridge aleady exists, adding new tap device to it..."
        ip link set "$tap" master acrn-br0
        ip link set dev "$tap" down
        ip link set dev "$tap" up
    fi
}
function launch_windows()
{
    vm_name=post_vm_id$1
    tap_net tap_WaaG
    #check if the vm is running or not
    vm_ps=$(pgrep -a -f acrn-dm)
    result=$(echo $vm_ps | grep -w "${vm_name}")
    if [[ "$result" != "" ]]; then
        echo "$vm_name is running, can't create twice!"
        exit
    fi
    #echo ${passthru_vpid["gpu"]} > /sys/bus/pci/drivers/pci-stub/new_id
    #echo ${passthru_bdf["gpu"]} > /sys/bus/pci/devices/${passthru_bdf["gpu"]}/driver/unbind
    #echo ${passthru_bdf["gpu"]} > /sys/bus/pci/drivers/pci-stub/bind
    modprobe pci_stub
    kernel_version=$(uname -r)
    audio_module="/usr/lib/modules/$kernel_version/kernel/sound/soc/intel/boards/snd-soc-sst_bxt_sos_tdf8532.ko"
    # use the modprobe to force loading snd-soc-skl/sst_bxt_bdf8532
    if [ ! -e $audio_module ]; then
        modprobe -q snd-soc-skl
        modprobe -q snd-soc-sst_bxt_tdf8532
    else
        modprobe -q snd_soc_skl
        modprobe -q snd_soc_tdf8532
        modprobe -q snd_soc_sst_bxt_sos_tdf8532
        modprobe -q snd_soc_skl_virtio_be
    fi
    audio_passthrough=0
    # Check the device file of /dev/vbs_k_audio to determine the audio mode
    if [ ! -e "/dev/vbs_k_audio" ]; then
        audio_passthrough=1
    fi
    boot_audio_option=""
    if [ $audio_passthrough == 1 ]; then
        # for audio device
        echo ${passthru_vpid["audio"]} > /sys/bus/pci/drivers/pci-stub/new_id
        echo ${passthru_bdf["audio"]} > /sys/bus/pci/devices/${passthru_bdf["audio"]}/driver/unbind
        echo ${passthru_bdf["audio"]} > /sys/bus/pci/drivers/pci-stub/bind
        boot_audio_option="-s 0:14:0,passthru,00/0e/0"
    else
        boot_audio_option="-s 0:14:0,virtio-audio"
    fi
    mem_size=4096M
    #interrupt storm monitor for pass-through devices, params order:
    #threshold/s,probe-period(s),intr-inject-delay-time(ms),delay-duration(ms)
    intr_storm_monitor="--intr_monitor 10000,10,1,100"
    #logger_setting, format: logger_name,level; like following
    logger_setting="--logger_setting console,level=4;kmsg,level=3;disk,level=5"
    acrn-dm -A -m $mem_size -s 0:0,hostbridge -U d2795438-25d6-11e8-864e-cb7a18b34643 \
        $logger_setting \
        -s 5,virtio-blk,/home/clear/clearlinux.img \
        -s 7,virtio-net,tap_WaaG \
        -s 2,passthru,0/2/0,gpu \
        --ovmf /usr/share/acrn/bios/OVMF.fd \
        $intr_storm_monitor \
        -s 1:0,lpc \
        -l com1,stdio \
        $boot_audio_option \
        $vm_name
}
launch_windows 1
The error log:
acrn-br0 bridge aleady exists, adding new tap device to it...
logger: name=console, level=4
logger: name=kmsg, level=3
logger: name=disk, level=5
SW_LOAD: get ovmf path /usr/share/acrn/bios/OVMF.fd, size 0x200000
interrupt storm monitor params: 10000, 10, 1, 100
vm_create: post_vm_id1
VHM api version 1.0
vm_setup_memory: size=0x100000000
open hugetlbfs file /run/hugepage/acrn/huge_lv1/post_vm_id1/D279543825D611E8864ECB7A18B34643
open hugetlbfs file /run/hugepage/acrn/huge_lv2/post_vm_id1/D279543825D611E8864ECB7A18B34643
level 0 free/need pages:1/1 page size:0x200000
level 1 free/need pages:2/4 page size:0x40000000
to reserve more free pages:
to reserve pages (+orig 2): echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
now enough free pages are reserved!
try to setup hugepage with:
level 0 - lowmem 0x0, biosmem 0x200000, highmem 0x0
level 1 - lowmem 0x80000000, biosmem 0x0, highmem 0x80000000
total_size 0x200000000
mmap ptr 0x0x7f5c98db5000 -> baseaddr 0x0x7f5cc0000000
mmap 0x80000000@0x7f5cc0000000
touch 2 pages with pagesz 0x40000000
mmap 0x80000000@0x7f5e00000000
touch 2 pages with pagesz 0x40000000
mmap 0x200000@0x7f5dbfe00000
touch 1 pages with pagesz 0x200000
really setup hugepage with:
level 0 - lowmem 0x0, biosmem 0x200000, highmem 0x0
level 1 - lowmem 0x80000000, biosmem 0x0, highmem 0x80000000
vm_init_vdevs
No correct pm notify channel given
start monitor interrupt data...
polling 37...
Listening 37...
pci init hostbridge
pci init lpc
pci init passthru
modify_bar_registration: bypass for pci-passthru 0:2.0
modify_bar_registration: bypass for pci-passthru 0:2.0
modify_bar_registration: bypass for pci-passthru 0:2.0
pci init virtio-blk
pci init virtio-net
pci init passthru
modify_bar_registration: bypass for pci-passthru 0:e.0
modify_bar_registration: bypass for pci-passthru 0:e.0
pci init igd-lpc
tpm: init_vtpm2:Invalid socket path!
write virt-0:e.0 in dsdt for HDAS @ 00:e.0
/tmp/dm.XslWKzv 24: Device (PCI0)
Warning 3073 - Multiple types ^ (Device object requires either a _HID or _ADR, but not both)
/tmp/dm.XslWKzv 772: Method (ADBG, 1, Serialized)
Remark 2146 - ^ Method Argument is never used (Arg0)
/tmp/dm.XslWKzv 954: Name (GBUF, Buffer (0x10){})
Remark 2173 - ^ Creation of named objects within a method is highly inefficient, use globals or method local variables instead (\_SB.PCI0.HDAS.ACCG)
/tmp/dm.XslWKzv 980: Processor (CPU0, 0x00, 0x00000000, 0x00) {}
Warning 3168 - ^ Legacy Processor() keyword detected. Use Device() keyword instead.
acrn_sw_load
SW_LOAD: partition blob /usr/share/acrn/bios/OVMF.fd size 2097152 copy to guest 0xffe00000
SW_LOAD: build e820 9 entries to addr: 0x7f5cc00ef008
SW_LOAD: entry[0]: addr 0x0000000000000000, size 0x00000000000a0000, type 0x1
SW_LOAD: entry[1]: addr 0x00000000000a0000, size 0x0000000000060000, type 0x2
SW_LOAD: entry[2]: addr 0x0000000000100000, size 0x000000007ff00000, type 0x1
SW_LOAD: entry[3]: addr 0x0000000080000000, size 0x0000000008000000, type 0x2
SW_LOAD: entry[4]: addr 0x00000000db000000, size 0x0000000004000000, type 0x2
SW_LOAD: entry[5]: addr 0x00000000df000000, size 0x0000000001000000, type 0x2
SW_LOAD: entry[6]: addr 0x00000000e0000000, size 0x0000000020000000, type 0x2
SW_LOAD: entry[7]: addr 0x0000000100000000, size 0x0000000040000000, type 0x2
SW_LOAD: entry[8]: addr 0x0000000140000000, size 0x0000000080000000, type 0x1
SW_LOAD: ovmf_entry 0xfffffff0
add_cpu
Unhandled ps2 mouse command 0xe1
packet_write_wait: Connection to 192.168.2.124 port 22: Broken pipe
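The log above reports "level 1 free/need pages:2/4 page size:0x40000000" for a guest started with mem_size=4096M. The arithmetic behind the "need" figure can be sketched as follows. This is a deliberate simplification for illustration, not the actual acrn-dm allocation logic, and needed_1g_pages is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: number of 1 GiB hugepages needed to back a guest of
# the given size in MiB, rounding up. Simplified; acrn-dm additionally uses
# one 2 MiB page for the BIOS region, as the log shows.
needed_1g_pages() {
    local mem_mib="$1"
    echo $(( (mem_mib + 1023) / 1024 ))
}

# mem_size=4096M from the launch script; prints 4.
needed_1g_pages 4096
```

The result matches the four 1 GiB pages in the log (lowmem 0x80000000 plus highmem 0x80000000), of which only two were free before acrn-dm reserved more.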
@fuzhongl launch_uos1.sh looks like this:
[...]
acrn-dm -A -m $mem_size -s 0:0,hostbridge -U d2795438-25d6-11e8-864e-cb7a18b34643 \
$logger_setting \
-s 5,virtio-blk,/home/clear/uos/uos.img \
-s 7,virtio-net,tap_WaaG \
-s 2,passthru,0/2/0,gpu \
--ovmf /usr/share/acrn/bios/OVMF.fd \
-s 1:0,lpc \
-l com1,stdio \
$boot_audio_option \
$vm_name
}
launch_windows 1
[...]
@fuzhongl After running the script launch_uos1.sh I get a black screen. After that I can still log in to the SOS via ssh, but only for a while; after that the NUC reboots.
@fuzhongl @terryzouhao In order to change the build version to v1.6 I compiled acrn.efi with acrn-hypervisor version v1.6 then replaced the acrn.efi of my efi partition and rebooted. Is this correct or do I need to take additional steps in order to change the branch to v1.6?
This is correct, but it's also important to update the user-space components (such as acrn-dm) in the Service VM. If you compiled the new version of ACRN in the Service VM directly, you can update those by running sudo make install
Another question: if I want to run two post-launched User VMs in parallel, do I just have to run the two launch scripts (launch_uos_id1.sh and launch_uos_id2.sh) from the Service OS, or do I need to use acrnctl?
@gvancuts I did not compile the acrn.efi on the service VM directly, so what do I have to do instead?
Manually copy over the acrn-dm file from your dev machine (build/devicemodel) to /usr/bin in your Service VM. I do not believe you use the other binaries (yet).
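The manual copy described above could be wrapped in a small helper that also confirms the installed binary is the one that was just built. This is a minimal sketch under stated assumptions: install_acrn_dm is a hypothetical name, and it assumes the build output has already been transferred to the Service VM by whatever means is convenient.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a freshly built acrn-dm into a target directory
# (default /usr/bin) and confirm the installed copy is byte-identical.
install_acrn_dm() {
    local src="$1"
    local dest_dir="${2:-/usr/bin}"
    install -m 0755 "$src" "$dest_dir/acrn-dm" &&
        cmp -s "$src" "$dest_dir/acrn-dm"
}
```

Run as root in the Service VM, install_acrn_dm ./acrn-dm returns success only if the copy completed and both files match.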
@gvancuts @fuzhongl @terryzouhao Two Clear Linux User VMs are finally running in parallel. Thank you all for your support!
Great!! 👍
Cool, great to see 2 VMs eventually launched on the APL NUC! Thanks to @gvancuts and @fuzhongl for the support.
Moving forward, @hidrol could you send a follow-up mail to me: terry.zou@intel.com (I did not find your mail address on GitHub). I just want to follow up with you on further scenarios/documentation for APL NUC, and you are welcome to join our TCM/mailing list to discuss those requirements and designs : )
After running the script with "$ sudo ./acrn_quick_setup.sh -s 31470 -d -i" and rebooting after, I get this error: [...] Kernel panic - not syncing: Fatal Exception Kernel Offset: 0xb000000 from 0xffffffff81000000 (relocating range: [...] Rebooting in 10 seconds
I have a nuc6cayh with 8GB RAM and an updated BIOS, version AYAPLCEL.86A.0066.2020.0107.1027. I changed the SOS RAM size and platform RAM size to 0x200000000 in industry.xml. The default settings in the industry.xml file have no values for HV RAM start and HV RAM size, and I'm not sure what to do about that. The acrn.efi file was built from the XML files with:
$ make BOARD_FILE=$PWD/misc/acrn-config/xmls/board-xmls/nuc6cayh.xml \
SCENARIO_FILE=$PWD/misc/acrn-config/xmls/config-xmls/nuc6cayh/industry.xml FIRMWARE=uefi
My Clear Linux Version is 31470.