TritonDataCenter / smartos-live

For more information, please see http://smartos.org/ For any questions that aren't answered there, please join the SmartOS discussion list: https://smartos.topicbox.com/groups/smartos-discuss

vmadm create fails - provisioning times out #261

Open robinbowes opened 11 years ago

robinbowes commented 11 years ago

I am unable to create native zones on my SmartOS install.

SunOS nas02 5.11 joyent_20130905T204057Z i86pc i386 i86pc

I am using this image:

0084dad6-05c1-11e3-9476-8f8320925eea base64 13.2.0 smartos 2013-08-15T15:30:00Z

I have also had trouble with this one:

fdea06b0-3f24-11e2-ac50-0b645575ce9d base64 1.8.4 smartos 2012-12-05T21:59:37Z

This is the json I'm using:

{
  "autoboot": true,
  "brand": "joyent",
  "image_uuid": "0084dad6-05c1-11e3-9476-8f8320925eea",
  "max_physical_memory": 1024,
  "cpu_cap": 100,
  "alias": "fifo_ext",
  "quota": "40",
  "dns_domain": "robinbowes.com",
  "resolvers": [
    "192.168.1.53"
  ],
  "nics": [
    {
      "nic_tag": "admin",
      "gateway": "192.168.1.1",
      "ip": "192.168.1.21",
      "netmask": "255.255.255.0"
    }
  ]
}

I then run vmadm create -f fifo.json

It sits there for around 5 minutes, then fails with an error message about not moving past provisioning.

While it's sitting there I can zlogin to the zone.

If I set the root password and zlogin to the console, then press Ctrl-D, I get this output:

Enter user name for system maintenance (control-d to bypass):
WARNING: svccfg apply /etc/svc/profile/generic.xml failed
svc.configd: Fatal error: Backend copy failed: opening /etc/svc/volatile/fast_repository.db: No such file or directory
svc.configd: Fatal error: Backend copy failed: remove /etc/svc/repository.db-B8aIDK: Permission denied
svc.configd: Fatal error: Backend copy failed: remove /etc/svc/volatile/fast_repository.db: No such file or directory
Requesting System Maintenance Mode
(See /lib/svc/share/README for more information.)
svc:/system/early-manifest-import:default exited with status 95

If I just zlogin to the zone, and run svcs -xv, I get this:

[root@380e9520-da80-4341-9857-93ad7ce61f8f ~]# svcs -xv
svc:/system/boot-archive:default has no "restarter" property group; ignoring.
svc:/system/device/local:default has no "restarter" property group; ignoring.
svc:/milestone/devices:default has no "restarter" property group; ignoring.
svc:/system/identity:domain has no "restarter" property group; ignoring.
svc:/system/identity:node has no "restarter" property group; ignoring.
svc:/system/filesystem/local:default has no "restarter" property group; ignoring.
svc:/system/manifest-import:default has no "restarter" property group; ignoring.
svc:/system/filesystem/minimal:default has no "restarter" property group; ignoring.
svc:/milestone/multi-user:default has no "restarter" property group; ignoring.
svc:/milestone/name-services:default has no "restarter" property group; ignoring.
svc:/network/initial:default has no "restarter" property group; ignoring.
svc:/network/loopback:default has no "restarter" property group; ignoring.
svc:/network/physical:default has no "restarter" property group; ignoring.
svc:/network/physical:nwam has no "restarter" property group; ignoring.
svc:/system/filesystem/root:default has no "restarter" property group; ignoring.
svc:/milestone/single-user:default has no "restarter" property group; ignoring.
svc:/system/filesystem/usr:default has no "restarter" property group; ignoring.
svc:/network/rpc/bind:default has no "restarter" property group; ignoring.
svc:/system/console-login:default has no "restarter" property group; ignoring.
svc:/system/console-login:ttya has no "restarter" property group; ignoring.
svc:/system/console-login:ttyb has no "restarter" property group; ignoring.
svc:/system/console-login:vt2 has no "restarter" property group; ignoring.
svc:/system/console-login:vt3 has no "restarter" property group; ignoring.
svc:/system/console-login:vt4 has no "restarter" property group; ignoring.
svc:/system/console-login:vt5 has no "restarter" property group; ignoring.
svc:/system/console-login:vt6 has no "restarter" property group; ignoring.
svc:/milestone/multi-user-server:default has no "restarter" property group; ignoring.
svc:/system/utmp:default has no "restarter" property group; ignoring.
svc:/milestone/sysconfig:default has no "restarter" property group; ignoring.
svc:/milestone/network:default has no "restarter" property group; ignoring.
svc:/network/routing/route:default has no "restarter" property group; ignoring.
svc:/network/routing/legacy-routing:ipv4 has no "restarter" property group; ignoring.
svc:/network/routing/legacy-routing:ipv6 has no "restarter" property group; ignoring.
svc:/network/shares/group:default has no "restarter" property group; ignoring.
svc:/network/shares/group:zfs has no "restarter" property group; ignoring.
svc:/network/inetd-upgrade:default has no "restarter" property group; ignoring.
svc:/network/inetd:default has no "restarter" property group; ignoring.
svc:/network/ipv4-forwarding:default has no "restarter" property group; ignoring.
svc:/network/ipv6-forwarding:default has no "restarter" property group; ignoring.
svc:/network/security/ktkt_warn:default has no "restarter" property group; ignoring.
svc:/network/nfs/rquota:default has no "restarter" property group; ignoring.
svc:/network/nfs/cbd:default has no "restarter" property group; ignoring.
svc:/network/nfs/mapid:default has no "restarter" property group; ignoring.
svc:/network/nfs/client:default has no "restarter" property group; ignoring.
svc:/network/nfs/nlockmgr:default has no "restarter" property group; ignoring.
svc:/network/nfs/status:default has no "restarter" property group; ignoring.
svc:/network/dns/client:default has no "restarter" property group; ignoring.
svc:/network/ssh:default has no "restarter" property group; ignoring.
svc:/network/nis/client:default has no "restarter" property group; ignoring.
svc:/network/routing-setup:default has no "restarter" property group; ignoring.
svc:/network/ldap/client:default has no "restarter" property group; ignoring.
svc:/network/rpc/keyserv:default has no "restarter" property group; ignoring.
svc:/network/rpc/gss:default has no "restarter" property group; ignoring.
svc:/network/rpc/rex:default has no "restarter" property group; ignoring.
svc:/network/service:default has no "restarter" property group; ignoring.
svc:/system/cryptosvc:default has no "restarter" property group; ignoring.
svc:/system/device/mpxio-upgrade:default has no "restarter" property group; ignoring.
svc:/system/auditd:default has no "restarter" property group; ignoring.
svc:/system/sac:default has no "restarter" property group; ignoring.
svc:/system/name-service-cache:default has no "restarter" property group; ignoring.
svc:/system/rcap:default has no "restarter" property group; ignoring.
svc:/system/consadm:default has no "restarter" property group; ignoring.
svc:/system/rmtmpfiles:default has no "restarter" property group; ignoring.
svc:/system/system-log:default has no "restarter" property group; ignoring.
svc:/system/sysidtool:net has no "restarter" property group; ignoring.
svc:/system/sysidtool:system has no "restarter" property group; ignoring.
svc:/system/coreadm:default has no "restarter" property group; ignoring.
svc:/system/sar:default has no "restarter" property group; ignoring.
svc:/system/cron:default has no "restarter" property group; ignoring.
svc:/system/keymap:default has no "restarter" property group; ignoring.
svc:/network/rpc-100235_1/rpc_ticotsord:default has no "restarter" property group; ignoring.
svc:/network/dns/multicast:default has no "restarter" property group; ignoring.
svc:/network/install:default has no "restarter" property group; ignoring.
svc:/network/rexec:default has no "restarter" property group; ignoring.
svc:/network/slp:default has no "restarter" property group; ignoring.
svc:/system/filesystem/reparse:default has no "restarter" property group; ignoring.
svc:/network/ipfilter:default has no "restarter" property group; ignoring.
svc:/network/netmask:default has no "restarter" property group; ignoring.
svc:/network/login:eklogin has no "restarter" property group; ignoring.
svc:/network/login:klogin has no "restarter" property group; ignoring.
svc:/network/login:rlogin has no "restarter" property group; ignoring.
svc:/network/datalink-management:default has no "restarter" property group; ignoring.
svc:/network/ipsec/manual-key:default has no "restarter" property group; ignoring.
svc:/network/ipsec/ipsecalgs:default has no "restarter" property group; ignoring.
svc:/network/ipsec/policy:default has no "restarter" property group; ignoring.
svc:/network/ipsec/ike:default has no "restarter" property group; ignoring.
svc:/network/routing/ndp:default has no "restarter" property group; ignoring.
svc:/network/routing/ripng:default has no "restarter" property group; ignoring.
svc:/network/routing/rdisc:default has no "restarter" property group; ignoring.
svc:/network/ipqos:default has no "restarter" property group; ignoring.
svc:/network/location:default has no "restarter" property group; ignoring.
svc:/network/dns/install:default has no "restarter" property group; ignoring.
svc:/network/shell:default has no "restarter" property group; ignoring.
svc:/network/shell:kshell has no "restarter" property group; ignoring.
svc:/network/ip-interface-management:default has no "restarter" property group; ignoring.
svc:/network/iptun:default has no "restarter" property group; ignoring.
svc:/system/svc/global:default has no "restarter" property group; ignoring.
svc:/system/extended-accounting:flow has no "restarter" property group; ignoring.
svc:/system/extended-accounting:net has no "restarter" property group; ignoring.
svc:/system/extended-accounting:process has no "restarter" property group; ignoring.
svc:/system/extended-accounting:task has no "restarter" property group; ignoring.
svc:/system/logadm-upgrade:default has no "restarter" property group; ignoring.
svc:/system/fmd:default has no "restarter" property group; ignoring.
svc:/system/idmap:default has no "restarter" property group; ignoring.
svc:/system/hotplug:default has no "restarter" property group; ignoring.
svc:/system/device/allocate:default has no "restarter" property group; ignoring.
svc:/system/rbac:default has no "restarter" property group; ignoring.
svc:/system/vtdaemon:default has no "restarter" property group; ignoring.
svc:/system/hostid:default has no "restarter" property group; ignoring.
svc:/system/filesystem/smartdc:default has no "restarter" property group; ignoring.
svc:/system/filesystem/autofs:default has no "restarter" property group; ignoring.
svc:/system/pfexec:default has no "restarter" property group; ignoring.
svc:/network/smb/client:default has no "restarter" property group; ignoring.
svc:/pkgsrc/postfix:default has no "restarter" property group; ignoring.
svc:/smartdc/mdata:execute has no "restarter" property group; ignoring.
svc:/smartdc/mdata:fetch has no "restarter" property group; ignoring.
svc:/system/zoneinit:default has no "restarter" property group; ignoring.
svc:/network/vrrp:default has no "restarter" property group; ignoring.
svc:/network/loadbalancer/ilb:default has no "restarter" property group; ignoring.
svc:/system/early-manifest-import:default (early manifest import)
 State: offline since Thu Sep 12 23:05:19 2013
Reason: Unknown.
   See: http://illumos.org/msg/SMF-8000-AR
   See: man -M /usr/share/man -s 1M svc.startd
   See: man -M /usr/share/man -s 5 smf_method
   See: man -M /usr/share/man -s 5 smf
   See: man -M /usr/share/man -s 5 smf_bootstrap
   See: /var/svc/log/system-early-manifest-import:default.log
Impact: This service is not running.
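For anyone reading output like the above, the repeated "restarter" notices bury the one service that is actually offline. A minimal sketch of a filter follows; the function name `svcs_problems` is made up for illustration, and redirecting stderr in the usage comment is an assumption about where `svcs` prints those notices.

```shell
# Hypothetical helper: drop the repeated "no restarter property group"
# notices from svcs output read on stdin, leaving the real diagnosis.
svcs_problems() {
    grep -v 'has no "restarter" property group; ignoring\.'
}

# Usage inside the zone (not executed here; 2>&1 is an assumption about
# which stream carries the notices):
#   svcs -xv 2>&1 | svcs_problems
```

With the output in this report, that would leave only the `early-manifest-import` block.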

The content of /var/svc/log/system-early-manifest-import:default.log is:

svccfg: Not validating instance milestone/single-user:default because EMI's state is offline
svccfg: Not validating instance network/vrrp:default because EMI's state is offline
svccfg: Not validating instance network/physical:default because EMI's state is offline
svccfg: Not validating instance network/inetd:default because EMI's state is offline
svccfg: Not validating instance network/ipfilter:default because EMI's state is offline
svccfg: Not validating instance network/ipsec/ipsecalgs:default because EMI's state is offline
svccfg: Not validating instance network/loadbalancer/ilb:default because EMI's state is offline
svccfg: Not validating instance smartdc/mdata:execute because EMI's state is offline
svccfg: Not validating instance smartdc/mdata:fetch because EMI's state is offline
svccfg: Multiple definitions for property value_F5SGK5RPMNXW443PNRSQ----_description_C in property group tm_proppat_nt_ttymon_device.
svccfg: Not validating instance system/console-login:vt6 because EMI's state is offline
svccfg: Not validating instance system/console-login:vt5 because EMI's state is offline
svccfg: Not validating instance system/console-login:vt4 because EMI's state is offline
svccfg: Not validating instance system/console-login:vt3 because EMI's state is offline
svccfg: Not validating instance system/console-login:vt2 because EMI's state is offline
svccfg: Not validating instance system/console-login:ttyb because EMI's state is offline
svccfg: Not validating instance system/console-login:ttya because EMI's state is offline
svccfg: Not validating instance system/console-login:default because EMI's state is offline
svccfg: Loaded 9 smf(5) service descriptions
svccfg: Unable to stat file /etc/svc/profile/generic.xml. No such file or directory
WARNING: svccfg apply /etc/svc/profile/generic.xml failed
svcadm: failed to switch repository: File operation error: (see console)
Repository switch back operation failed, please check the system log for the possible fatal error messages.
svccfg: Unable to stat file /etc/svc/profile/generic.xml. No such file or directory
WARNING: svccfg apply /etc/svc/profile/generic.xml failed
svcadm: failed to switch repository: File operation error: (see console)
Repository switch back operation failed, please check the system log for the possible fatal error messages.

robinbowes commented 11 years ago

Update:

I took all the disks out of my server, put in a single 500GB drive, rebooted from the same USB key, and re-installed (this is joyent_20130905T204057Z).

I then imported base64 13.2.0 and created a zone using the same json as above.

It worked, i.e. the zone was created just fine.

I did notice that the new config and my original one are slightly different:

Old:

swap=1.25x
admin_nic=0:25:90:51:f1:81
admin_ip=192.168.1.20
admin_netmask=255.255.255.0
admin_network=...
admin_gateway=192.168.1.1

external_nic=0:25:90:51:f1:80
external0_ip=192.168.1.30
external0_netmask=255.255.255.0
external0_gateway=192.168.1.1

dns_resolvers=192.168.1.53
dns_domain=robinbowes.com

ntp_hosts=ntp.robinbowes.com

compute_node_ntp_hosts=0.centos.pool.ntp.org

default_keymap=uk

config_inc_dir=config.inc
root_authorized_keys_file=authorized_keys

New:

#
# This file was auto-generated and must be source-able by bash.
#

# admin_nic is the nic admin_ip will be connected to for headnode zones.
admin_nic=0:25:90:51:f1:81
admin_ip=192.168.1.20
admin_netmask=255.255.255.0
admin_network=...
admin_gateway=192.168.1.20

headnode_default_gateway=192.168.1.1

dns_resolvers=192.168.1.53,8.8.4.4
dns_domain=robinbowes.com

ntp_hosts=pool.ntp.org
compute_node_ntp_hosts=192.168.1.20

Anything there that is likely to cause problems?

Or is there likely to be some problem with my original zpool?
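Comparing two configs like these by eye is error-prone, so here is a rough sketch of a key-by-key comparison. The function name `config_diff` and the file names in the usage comment are hypothetical; it assumes plain `key=value` lines and skips comments and blank lines.

```shell
# Hypothetical helper: print settings that differ between two key=value
# config files. "-" marks a line from the first file, "+" from the second.
config_diff() {
    awk -F= '
        /^[ \t]*(#|$)/ { next }            # skip comments and blank lines
        FNR == NR { old[$1] = $0; next }   # lines from the first file
        { new[$1] = $0 }                   # lines from the second file
        END {
            for (k in old) if (!(k in new) || old[k] != new[k]) print "- " old[k]
            for (k in new) if (!(k in old) || old[k] != new[k]) print "+ " new[k]
        }
    ' "$1" "$2" | sort -k2
}

# Usage (file names are placeholders for the two configs above):
#   config_diff config.old config.new
```

Sorting on the second field keeps the old and new values of a changed key next to each other.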

robinbowes commented 11 years ago

I put the original disks back in and modified the config to be the same as the fresh install.

On rebooting, I have the same problem creating the zone.

So, that suggests there is some problem with my zpool, doesn't it? Any idea where I might look?

robinbowes commented 11 years ago

After much debugging with folk on #smartos, it seems that the problem is caused by having nbmand=on set on the zones pool.

Turning this off fixed this issue and I can now create native zones.
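In case it helps anyone hitting the same thing, a minimal sketch of how to check for this. The helper name `nbmand_on` is made up; the commands in the comments use the pool name `zones` from this report and are not run by the snippet itself.

```shell
# Hypothetical helper: read "name<TAB>value" pairs, as produced by
# `zfs get -H -o name,value nbmand ...`, and print the datasets whose
# nbmand value is "on".
nbmand_on() {
    awk -F'\t' '$2 == "on" { print $1 }'
}

# Usage on the affected host (not executed here):
#   zfs get -H -r -o name,value nbmand zones | nbmand_on
#   zfs set nbmand=off zones
```

As far as I know, a change to `nbmand` only takes effect once the file system is unmounted and remounted, so a remount or reboot may be needed before retrying `vmadm create`.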

For completeness, here is a zfs get on zones:

[root@nas02 /opt]# zfs get all zones
NAME   PROPERTY              VALUE                                    SOURCE
zones  type                  filesystem                               -
zones  creation              Sat Oct 15 18:19 2011                    -
zones  used                  8.37T                                    -
zones  available             6.73T                                    -
zones  referenced            8.99G                                    -
zones  compressratio         1.02x                                    -
zones  mounted               yes                                      -
zones  quota                 none                                     default
zones  reservation           none                                     default
zones  recordsize            128K                                     default
zones  mountpoint            /zones                                   local
zones  sharenfs              rw=@192.168.1.0/24,root=@192.168.1.0/24  local
zones  checksum              on                                       default
zones  compression           lz4                                      local
zones  atime                 on                                       default
zones  devices               on                                       default
zones  exec                  on                                       default
zones  setuid                on                                       default
zones  readonly              off                                      default
zones  zoned                 off                                      default
zones  snapdir               hidden                                   default
zones  aclmode               discard                                  default
zones  aclinherit            restricted                               default
zones  canmount              on                                       default
zones  xattr                 on                                       default
zones  copies                1                                        default
zones  version               5                                        -
zones  utf8only              off                                      -
zones  normalization         none                                     -
zones  casesensitivity       mixed                                    -
zones  vscan                 off                                      default
zones  nbmand                off                                      local
zones  sharesmb              off                                      default
zones  refquota              none                                     default
zones  refreservation        none                                     default
zones  primarycache          all                                      default
zones  secondarycache        all                                      default
zones  usedbysnapshots       0                                        -
zones  usedbydataset         8.99G                                    -
zones  usedbychildren        8.36T                                    -
zones  usedbyrefreservation  0                                        -
zones  logbias               latency                                  default
zones  dedup                 off                                      local
zones  mlslabel              none                                     default
zones  sync                  standard                                 default
zones  refcompressratio      1.00x                                    -
zones  written               8.99G                                    -
zones  logicalused           8.48T                                    -
zones  logicalreferenced     9.00G                                    -