omniosorg / omnios-extra

Packages for OmniOS extra
https://omnios.org

virtualbox crashes system #744

Closed beiDei8z closed 3 years ago

beiDei8z commented 3 years ago

summary

Attempting to start a virtual machine under VirtualBox crashes the whole system when run on a Supermicro A2SDi-16C-HLN4F motherboard. Running on an older Supermicro A1SAi-2750F is successful. The difference is the motherboard / CPU / RAM.

details

hardware that fails

Supermicro A2SDi-16C-HLN4F motherboard, onboard C3955 CPU (16 cores), 128GB RAM as 4 x Samsung M393A4K40CB1-CRC4Q

method

Fresh install of r151036 with the ssh server on, no root passwd and no user. Console login via the iKVM/HTML5 service management processor, logged in as root:

# zfs set atime=off rpool
# dladm rename-link ixgbe0 net0
# ipadm create-if net0
# ipadm create-addr -T static -a local=192.168.1.36/24 net0/v4
# cp /etc/nsswitch.dns /etc/nsswitch.conf
# echo "nameserver 192.168.1.66" > /etc/resolv.conf
# pkg install zsh
# zfs create rpool/export/home/james
# useradd -g staff -s /bin/zsh -d /export/home/james -u 1001 james
# passwd james
# chown james:staff /export/home/james

Remote terminal:

$ ssh 192.168.1.36
$ su
# pkg install pkg:/package/pkg
# pkg update -v
# init 6
$ ssh 192.168.1.36
$ su
# zfs create -o mountpoint=/virtualbox rpool/virtualbox
# groupadd -g 120 vbox
# useradd -c "VirtualBox" -g vbox -s /bin/zsh -d /virtualbox -u 1120 vbox
# chown vbox:vbox /virtualbox
# pkg install -v ooce/virtualization/virtualbox
# init 6
$ ssh 192.168.1.36
$ su
# su - vbox
$ mkdir -p .config/VirtualBox/
$ cat > .config/VirtualBox/VirtualBox.xml << EOF
<?xml version="1.0"?>
<VirtualBox xmlns="http://www.virtualbox.org/" version="1.12-solaris">
  <Global>
    <MachineRegistry/>
    <NetserviceRegistry>
      <DHCPServers/>
    </NetserviceRegistry>
    <SystemProperties defaultMachineFolder="/virtualbox" defaultHardDiskFormat="VDI" VRDEAuthLibrary="VBoxAuth" webServiceAuthLibrary="VBoxAuth" LogHistoryCount="3" proxyMode="0" exclusiveHwVirt="false"/>
    <USBDeviceFilters/>
  </Global>
</VirtualBox>
EOF
$ VBoxManage createvm --name junk --register
$ VBoxManage startvm junk --type headless

...system crash

https://www.xdrv.uk/0e100928f9d37e7ce0a6ca505216071c/vmdump.0.7z md5sum in URL, 134706258 bytes.
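Since the MD5 is carried in the URL path, a downloaded dump can be checked against it. The following is a minimal sketch, not part of the original report: the function name `verify_md5_from_url` is hypothetical, and it assumes the hash is the only 32-hex-digit path component and that `md5sum` is available on the machine doing the download.

```shell
# Hypothetical helper: check a downloaded file against the MD5 embedded in its URL.
# Assumes the hash is a 32-character lowercase hex path component and md5sum is on PATH.
verify_md5_from_url() {
    url=$1
    file=$2
    # Split the URL on '/' and keep the first component that looks like an MD5.
    expected=$(printf '%s\n' "$url" | tr '/' '\n' | grep -E '^[0-9a-f]{32}$' | head -n 1)
    if [ -z "$expected" ]; then
        echo "no md5 found in URL" >&2
        return 2
    fi
    # md5sum prints "<hash>  <filename>"; keep only the hash.
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK $file"
    else
        echo "MISMATCH $file: expected $expected, got $actual" >&2
        return 1
    fi
}
```

With the archive saved locally as `vmdump.0.7z`, it would be called as `verify_md5_from_url https://www.xdrv.uk/0e100928f9d37e7ce0a6ca505216071c/vmdump.0.7z vmdump.0.7z`.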

I have the system on a spare HDD so can resume the OS.

Attachments: terminal.log, crash 1

citrus-it commented 3 years ago

Could you try updating the virtualbox package on there and get another crash dump please? The updated package is not fixed but has more debugging symbols in it which will make it easier to find the problem.

beiDei8z commented 3 years ago

Changed packages:
extra.omnios
  ooce/virtualization/virtualbox
    6.1.16-151036.0:20201025T130734Z -> 6.1.16-151036.0:20201208T104347Z
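
The two version strings above differ only in their trailing timestamp, which fits a rebuild of the same VirtualBox release (the update was described as adding debugging symbols, not a fix). As a rough illustration, the sketch below splits such a string into its parts. The helper name `split_pkg_version` is hypothetical, and the layout `<component>-<branch>:<timestamp>` is assumed from the strings shown in this thread rather than taken from the IPS specification.

```shell
# Hypothetical helper: split a package version string of the form seen in this thread,
# "<component>-<branch>:<timestamp>", optionally preceded by a full FMRI up to '@'.
split_pkg_version() {
    v=${1##*@}             # drop everything up to the last '@', if present
    timestamp=${v##*:}     # text after the last ':'
    rest=${v%:*}           # text before the last ':'
    component=${rest%%-*}  # text before the first '-'
    branch=${rest#*-}      # text after the first '-'
    printf '%s %s %s\n' "$component" "$branch" "$timestamp"
}

split_pkg_version '6.1.16-151036.0:20201025T130734Z'
# prints: 6.1.16 151036.0 20201025T130734Z
split_pkg_version '6.1.16-151036.0:20201208T104347Z'
# prints: 6.1.16 151036.0 20201208T104347Z
```

Only the third field changes between the two calls, so the component version and branch are the same before and after the update.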

https://www.xdrv.uk/8428c5788a34ef44b33e6e66b81a0dd4/vmdump.1.7z 132187001 bytes

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

beiDei8z commented 3 years ago

Not fixed. Do not close.

beiDei8z commented 3 years ago

Machine update including VB 6.1.18. No improvement.

% pkg info virtualbox
             Name: ooce/virtualization/virtualbox
          Summary: VirtualBox
      Description: VirtualBox is a general-purpose full virtualiser for x86
                   hardware, targeted at server, desktop and embedded use.
            State: Installed
        Publisher: extra.omnios
          Version: 6.1.18
           Branch: 151036.0
   Packaging Date: 25 January 2021 at 19:51:54
Last Install Time:  7 December 2020 at 17:41:40
 Last Update Time:  1 February 2021 at 11:49:09
             Size: 99.70 MB
             FMRI: pkg://extra.omnios/ooce/virtualization/virtualbox@6.1.18-151036.0:20210125T195154Z

vbox@omnios:~% VBoxManage createvm --name junk --register
Virtual machine 'junk' is created and registered.
UUID: cb0f83ca-bc89-49fe-a152-2d39f5eeb699
Settings file: '/virtualbox/junk/junk.vbox'
vbox@omnios:~% VBoxManage startvm junk --type headless
Waiting for VM "junk" to power on...
Read from remote host 192.168.1.36: Connection reset by peer
Connection to 192.168.1.36 closed.

crash-6 1 18

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

beiDei8z commented 3 years ago

My Supermicro A2SDi-16C-HLN4F motherboard is being used for test and experiment until May when I will commission it for front line use with r151038 LTS. After this I will not want to test bugs that bring the whole system down.

My current plan is to give up on VirtualBox on this machine and instead use an LX zone to access a modern web browser. I can use the Oracle provided VirtualBox packages elsewhere for occasional OS tests.

oetiker commented 3 years ago

Note that Chrome, for example, does not work on LX, as cgroup support is not sufficiently advanced.

beiDei8z commented 3 years ago

Firefox 86.0 is running on Ubuntu 14.04 in LX. 78.8.0esr runs on Debian 9 LX. That gives me access to awkward websites such as github.

No doubt 5 minutes in the future some child will enhance JavaScript and a new method of ordering groceries online will be needed. Oh, vaccinations are done, so that particular one will soon no longer be needed.

I can run VB elsewhere, but not with the backing of the 128GB of RAM that my OmniOS server has. I only need one solution of the many possible. Running a web browser natively would be best, but FF won't compile; the rot has set in with Rust. Pale Moon and all the X/GTK3 stuff compiles, but I can't get it to run as yet.

beiDei8z commented 3 years ago

virtualbox.org Solaris packages run on Solaris 10. Solaris 10 packages run on OmniOS. With minor package edits I can run the virtualbox.org package on OmniOS. The result is a similar system crash. There are at least 2 problems, but I conclude that neither lies with the OmniOS packaging of VirtualBox.

  1. The VirtualBox software crashes on this CPU/motherboard.
  2. The whole OmniOS system crashes when VB has a fault.

Error 1 is VB's. Error 2 might be because process isolation is not possible when poking around with virtualisation (VT-x etc.)? A system that one can crash is acceptable although not desirable; a system that [apparently] spontaneously crashes is not. This crash is avoidable by not running VirtualBox.

OT: I have bhyve running better now and it acceptably provides a web browser (e.g. I am using it now for GitHub). The VNC viewer is restricted in size (a win for VB); it ought to be possible to pass rfb=w=...h-... but I can't see how to do so using zonecfg and zoneadm. ssh -X is better for running one window, not a desktop view. LX zones run better, show the individual processes and, best of all, use the global ZFS. The LX zones here: https://downloads.omniosce.org/media/lx/ fail to run gdk-pixbuf:

...
Processing triggers for libgdk-pixbuf2.0-0:amd64 (2.40.0+dfsg-3ubuntu0.2) ...

(process:5109): GLib-ERROR **: 17:18:08.455: getauxval () failed: No such file or directory
Trace/breakpoint trap (core dumped)
...

I'm guessing that getauxval() is not fully/correctly implemented by the LX zone. The older Linuxes from https://images.joyent.com/images do work. Old Linuxes for now run the latest firefox so a browser is much the same with a new OS on bhyve or an old one on LX.

beiDei8z commented 3 years ago

I will be installing r151038 LTS on this hardware very soon. Here are some more/final VB tests before I commission the machine.

I tried VB 6.1.22 using the VB packages on OmniOS. Same crash.

I installed Solaris 11.4 plus the same 6.1.22 VB supplied package. Success, no crash. [In case anyone stumbles here and cares: S11.4 is not compatible with the Supermicro A2SDi motherboard (no network), so I used an add-in card. S11.3 does not boot.]

The MB supports VB's virtualisation.

The VB supplied package and the OmniOS pkg both work on my other hardware on OmniOS.

The problem only occurs when VB is combined with OmniOS and this MB.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.