kata-containers / runtime

Kata Containers version 1.x runtime (for version 2.x see https://github.com/kata-containers/kata-containers).
https://katacontainers.io/
Apache License 2.0

Kata performance under overlay2 is lower than direct-lvm #717

Closed bekars closed 3 years ago

bekars commented 6 years ago

Description of problem

Kata performance under overlay2 is lower than under direct-lvm, and lower than runc under either storage driver.

Expected result

Performance should be the same under overlay2 and direct-lvm.

Actual result

Kata performs poorly in the QPS test and falls well short of runc.

Below is the test data:

runtime   storage-driver   vcpu   nginx-workers   QPS (ab -n 10000 -c 1000)
runc      overlay2         8      8               57000 - 62000
runc      direct-lvm       8      8               56000 - 64000
kata      overlay2         8      8               4800 - 5800
kata      direct-lvm       8      8               11000 - 50000
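
To put the gap in the table above into numbers, here is a minimal sketch that compares the midpoints of the reported QPS ranges. Using the midpoint as a summary is an assumption made for illustration; the kata direct-lvm range in particular is very wide, so its ratio is only a rough indication.

```python
# QPS ranges copied from the table above: (runtime, storage driver) -> (low, high)
QPS = {
    ("runc", "overlay2"): (57000, 62000),
    ("runc", "direct-lvm"): (56000, 64000),
    ("kata", "overlay2"): (4800, 5800),
    ("kata", "direct-lvm"): (11000, 50000),
}


def midpoint(qps_range):
    """Midpoint of a (low, high) range; an illustrative summary, not a measurement."""
    low, high = qps_range
    return (low + high) / 2


def slowdown(storage_driver):
    """Ratio of the runc midpoint QPS to the kata midpoint QPS for one storage driver."""
    runc = midpoint(QPS[("runc", storage_driver)])
    kata = midpoint(QPS[("kata", storage_driver)])
    return runc / kata


for driver in ("overlay2", "direct-lvm"):
    print(f"{driver}: runc is about {slowdown(driver):.1f}x faster than kata")
```

By this measure the overlay2 case is roughly an order of magnitude slower than runc, while the direct-lvm case is within a small factor.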

Test Env

Test Method


kata-collect-data.sh output

Meta details

Running kata-collect-data.sh version 1.2.0 (commit 0bcb32f) at 2018-09-11.16:10:30.331708681+0800.


Runtime is /usr/bin/kata-runtime.

kata-env

Output of "/usr/bin/kata-runtime kata-env":

[Meta]
  Version = "1.0.13"

[Runtime]
  Debug = false
  [Runtime.Version]
    Semver = "1.2.0"
    Commit = "0bcb32f"
    OCI = "1.0.1"
  [Runtime.Config]
    Path = "/usr/share/defaults/kata-containers/configuration.toml"

[Hypervisor]
  MachineType = "pc"
  Version = "QEMU emulator version 2.11.0\nCopyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers"
  Path = "/usr/bin/qemu-lite-system-x86_64"
  BlockDeviceDriver = "virtio-scsi"
  Msize9p = 8192
  Debug = false
  UseVSock = false

[Image]
  Path = "/usr/share/kata-containers/kata-containers-image_clearlinux_1.2.0_agent_fcfa054a757.img"

[Kernel]
  Path = "/usr/share/kata-containers/vmlinuz-4.14.51.10-135.container"
  Parameters = ""

[Initrd]
  Path = ""

[Proxy]
  Type = "kataProxy"
  Version = "kata-proxy version 1.2.0-1796218"
  Path = "/usr/libexec/kata-containers/kata-proxy"
  Debug = false

[Shim]
  Type = "kataShim"
  Version = "kata-shim version 1.2.0-0a37760"
  Path = "/usr/libexec/kata-containers/kata-shim"
  Debug = false

[Agent]
  Type = "kata"

[Host]
  Kernel = "4.4.0-124-generic"
  Architecture = "amd64"
  VMContainerCapable = true
  [Host.Distro]
    Name = "Ubuntu"
    Version = "16.04"
  [Host.CPU]
    Vendor = "GenuineIntel"
    Model = "Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz"

Runtime config files

Runtime default config files

/etc/kata-containers/configuration.toml
/usr/share/defaults/kata-containers/configuration.toml

Runtime config file contents

Config file /etc/kata-containers/configuration.toml not found

Output of "cat "/usr/share/defaults/kata-containers/configuration.toml"":

# Copyright (c) 2017-2018 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#

# XXX: WARNING: this file is auto-generated.
# XXX:
# XXX: Source file: "cli/config/configuration.toml.in"
# XXX: Project:
# XXX:   Name: Kata Containers
# XXX:   Type: kata

[hypervisor.qemu]
path = "/usr/bin/qemu-lite-system-x86_64"
kernel = "/usr/share/kata-containers/vmlinuz.container"
image = "/usr/share/kata-containers/kata-containers.img"
machine_type = "pc"

# Optional space-separated list of options to pass to the guest kernel.
# For example, use `kernel_params = "vsyscall=emulate"` if you are having
# trouble running pre-2.15 glibc.
#
# WARNING: - any parameter specified here will take priority over the default
# parameter value of the same name used to start the virtual machine.
# Do not set values here unless you understand the impact of doing so as you
# may stop the virtual machine from booting.
# To see the list of default parameters, enable hypervisor debug, create a
# container and look for 'default-kernel-parameters' log entries.
kernel_params = ""

# Path to the firmware.
# If you want that qemu uses the default firmware leave this option empty
firmware = ""

# Machine accelerators
# comma-separated list of machine accelerators to pass to the hypervisor.
# For example, `machine_accelerators = "nosmm,nosmbus,nosata,nopit,static-prt,nofw"`
machine_accelerators=""

# Default number of vCPUs per SB/VM:
# unspecified or 0                --> will be set to 1
# < 0                             --> will be set to the actual number of physical cores
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores      --> will be set to the actual number of physical cores
default_vcpus = 8

# Default maximum number of vCPUs per SB/VM:
# unspecified or == 0             --> will be set to the actual number of physical cores or to the maximum number
#                                     of vCPUs supported by KVM if that number is exceeded
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores      --> will be set to the actual number of physical cores or to the maximum number
#                                     of vCPUs supported by KVM if that number is exceeded
# WARNING: Depending of the architecture, the maximum number of vCPUs supported by KVM is used when
# the actual number of physical cores is greater than it.
# WARNING: Be aware that this value impacts the virtual machine's memory footprint and CPU
# the hotplug functionality. For example, `default_maxvcpus = 240` specifies that until 240 vCPUs
# can be added to a SB/VM, but the memory footprint will be big. Another example, with
# `default_maxvcpus = 8` the memory footprint will be small, but 8 will be the maximum number of
# vCPUs supported by the SB/VM. In general, we recommend that you do not edit this variable,
# unless you know what are you doing.
default_maxvcpus = 0

# Bridges can be used to hot plug devices.
# Limitations:
# * Currently only pci bridges are supported
# * Until 30 devices per bridge can be hot plugged.
# * Until 5 PCI bridges can be cold plugged per VM.
#   This limitation could be a bug in qemu or in the kernel
# Default number of bridges per SB/VM:
# unspecified or 0   --> will be set to 1
# > 1 <= 5           --> will be set to the specified number
# > 5                --> will be set to 5
default_bridges = 1

# Default memory size in MiB for SB/VM.
# If unspecified then it will be set 2048 MiB.
#default_memory = 2048

# Disable block device from being used for a container's rootfs.
# In case of a storage driver like devicemapper where a container's
# root file system is backed by a block device, the block device is passed
# directly to the hypervisor for performance reasons.
# This flag prevents the block device from being passed to the hypervisor,
# 9pfs is used instead to pass the rootfs.
disable_block_device_use = false

# Block storage driver to be used for the hypervisor in case the container
# rootfs is backed by a block device. This is either virtio-scsi or
# virtio-blk.
block_device_driver = "virtio-scsi"

# Enable iothreads (data-plane) to be used. This causes IO to be
# handled in a separate IO thread. This is currently only implemented
# for SCSI.
#
enable_iothreads = false

# Enable pre allocation of VM RAM, default false
# Enabling this will result in lower container density
# as all of the memory will be allocated and locked
# This is useful when you want to reserve all the memory
# upfront or in the cases where you want memory latencies
# to be very predictable
# Default false
#enable_mem_prealloc = true

# Enable huge pages for VM RAM, default false
# Enabling this will result in the VM memory
# being allocated using huge pages.
# This is useful when you want to use vhost-user network
# stacks within the container. This will automatically
# result in memory pre allocation
#enable_hugepages = true

# Enable swap of vm memory. Default false.
# The behaviour is undefined if mem_prealloc is also set to true
#enable_swap = true

# This option changes the default hypervisor and kernel parameters
# to enable debug output where available. This extra output is added
# to the proxy logs, but only when proxy debug is also enabled.
#
# Default false
#enable_debug = true

# Disable the customizations done in the runtime when it detects
# that it is running on top a VMM. This will result in the runtime
# behaving as it would when running on bare metal.
#
#disable_nesting_checks = true

# This is the msize used for 9p shares. It is the number of bytes
# used for 9p packet payload.
#msize_9p = 8192

# If true and vsocks are supported, use vsocks to communicate directly
# with the agent and no proxy is started, otherwise use unix
# sockets and start a proxy to communicate with the agent.
# Default false
#use_vsock = true

[factory]
# VM templating support. Once enabled, new VMs are created from template
# using vm cloning. They will share the same initial kernel, initramfs and
# agent memory by mapping it readonly. It helps speeding up new container
# creation and saves a lot of memory if there are many kata containers running
# on the same host.
#
# When disabled, new VMs are created from scratch.
#
# Default false
#enable_template = true

[proxy.kata]
path = "/usr/libexec/kata-containers/kata-proxy"

# If enabled, proxy messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[shim.kata]
path = "/usr/libexec/kata-containers/kata-shim"

# If enabled, shim messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[agent.kata]
# There is no field for this section. The goal is only to be able to
# specify which type of agent the user wants to use.

[runtime]
# If enabled, the runtime will log additional debug messages to the
# system log
# (default: disabled)
#enable_debug = true
#
# Internetworking model
# Determines how the VM should be connected to the
# the container network interface
# Options:
#
#   - bridged
#     Uses a linux bridge to interconnect the container interface to
#     the VM. Works for most cases except macvlan and ipvlan.
#
#   - macvtap
#     Used when the Container network interface can be bridged using
#     macvtap.
internetworking_model="macvtap"

Image details

---
osbuilder:
  url: "https://github.com/kata-containers/osbuilder"
  version: "unknown"
rootfs-creation-time: "2018-08-13T22:51:39.765008919+0000Z"
description: "osbuilder rootfs"
file-format-version: "0.0.2"
architecture: "x86_64"
base-distro:
  name: "Clear"
  version: "24400"
  packages:
    default:
      - "iptables-bin"
      - "libudev0-shim"
      - "systemd"
    extra:

agent:
  url: "https://github.com/kata-containers/agent"
  name: "kata-agent"
  version: "1.2.0-fcfa054a757e7c17afba47b0b4d7e91cbb8688ed"
  agent-is-init-daemon: "no"

Initrd details

No initrd


Logfiles

Runtime logs

Recent runtime problems found in system journal:

time="2018-09-11T15:00:13.366379291+08:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=aaf58bdae536456c5a8f97cb7f6574987f99413e047808d323ff79fa0073fd25 error="open /run/vc/sbs/aaf58bdae536456c5a8f97cb7f6574987f99413e047808d323ff79fa0073fd25/devices.json: no such file or directory" name=kata-runtime pid=9216 sandbox=aaf58bdae536456c5a8f97cb7f6574987f99413e047808d323ff79fa0073fd25 sandboxid=aaf58bdae536456c5a8f97cb7f6574987f99413e047808d323ff79fa0073fd25 source=virtcontainers subsystem=sandbox
time="2018-09-11T15:08:30.961450987+08:00" level=error msg="Container aaf58bdae536456c5a8f97cb7f6574987f99413e047808d323ff79fa0073fd25 not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=aaf58bdae536456c5a8f97cb7f6574987f99413e047808d323ff79fa0073fd25 name=kata-runtime pid=14101 sandbox=aaf58bdae536456c5a8f97cb7f6574987f99413e047808d323ff79fa0073fd25 source=runtime
time="2018-09-11T15:08:31.025449836+08:00" level=error msg="Container aaf58bdae536456c5a8f97cb7f6574987f99413e047808d323ff79fa0073fd25 not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=aaf58bdae536456c5a8f97cb7f6574987f99413e047808d323ff79fa0073fd25 name=kata-runtime pid=14132 sandbox=aaf58bdae536456c5a8f97cb7f6574987f99413e047808d323ff79fa0073fd25 source=runtime
time="2018-09-11T15:08:44.205616457+08:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=ec89f034b5c7890c468a782111e6c1293b5fbfb7a2a1ff914f3f6510cc90b8de error="open /run/vc/sbs/ec89f034b5c7890c468a782111e6c1293b5fbfb7a2a1ff914f3f6510cc90b8de/devices.json: no such file or directory" name=kata-runtime pid=14229 sandbox=ec89f034b5c7890c468a782111e6c1293b5fbfb7a2a1ff914f3f6510cc90b8de sandboxid=ec89f034b5c7890c468a782111e6c1293b5fbfb7a2a1ff914f3f6510cc90b8de source=virtcontainers subsystem=sandbox
time="2018-09-11T16:09:36.29581525+08:00" level=error msg="Container ec89f034b5c7890c468a782111e6c1293b5fbfb7a2a1ff914f3f6510cc90b8de not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=ec89f034b5c7890c468a782111e6c1293b5fbfb7a2a1ff914f3f6510cc90b8de name=kata-runtime pid=15474 sandbox=ec89f034b5c7890c468a782111e6c1293b5fbfb7a2a1ff914f3f6510cc90b8de source=runtime
time="2018-09-11T16:09:36.366499081+08:00" level=error msg="Container ec89f034b5c7890c468a782111e6c1293b5fbfb7a2a1ff914f3f6510cc90b8de not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=ec89f034b5c7890c468a782111e6c1293b5fbfb7a2a1ff914f3f6510cc90b8de name=kata-runtime pid=15505 sandbox=ec89f034b5c7890c468a782111e6c1293b5fbfb7a2a1ff914f3f6510cc90b8de source=runtime

Proxy logs

Recent proxy problems found in system journal:

time="2018-09-11T15:08:30.849406728+08:00" level=fatal msg="failed to handle exit signal" error="close unix @->/run/vc/vm/aaf58bdae536456c5a8f97cb7f6574987f99413e047808d323ff79fa0073fd25/kata.sock: use of closed network connection" name=kata-proxy pid=9293 sandbox=aaf58bdae536456c5a8f97cb7f6574987f99413e047808d323ff79fa0073fd25 source=proxy
time="2018-09-11T16:09:36.115282793+08:00" level=fatal msg="failed to handle exit signal" error="close unix @->/run/vc/vm/ec89f034b5c7890c468a782111e6c1293b5fbfb7a2a1ff914f3f6510cc90b8de/kata.sock: use of closed network connection" name=kata-proxy pid=14298 sandbox=ec89f034b5c7890c468a782111e6c1293b5fbfb7a2a1ff914f3f6510cc90b8de source=proxy

Shim logs

No recent shim problems found in system journal.


Container manager details

Have docker

Docker

Output of "docker version":

Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:24:56 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:23:21 2018
  OS/Arch:          linux/amd64
  Experimental:     false

Output of "docker info":

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 4
Server Version: 18.06.1-ce
Storage Driver: overlay2
 Backing Filesystem: tmpfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: kata-runtime runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-124-generic
Operating System: Ubuntu 16.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 48
Total Memory: 188.8GiB
Name: 100n21
ID: E2VS:TSBG:Z4W6:GNW6:JS4X:YY5S:AAOU:PIYP:2RT3:QN4R:2PXD:7AO6
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Registry Mirrors:
 https://registry.docker-cn.com/
Live Restore Enabled: false

Output of "systemctl show docker":

Type=notify
Restart=on-failure
NotifyAccess=main
RestartUSec=100ms
TimeoutStartUSec=infinity
TimeoutStopUSec=1min 30s
RuntimeMaxUSec=infinity
WatchdogUSec=0
WatchdogTimestamp=Tue 2018-09-11 14:59:20 CST
WatchdogTimestampMonotonic=2990659678
FailureAction=none
PermissionsStartOnly=no
RootDirectoryStartOnly=no
RemainAfterExit=no
GuessMainPID=yes
MainPID=8895
ControlPID=0
FileDescriptorStoreMax=0
NFileDescriptorStore=0
StatusErrno=0
Result=success
ExecMainStartTimestamp=Tue 2018-09-11 14:59:19 CST
ExecMainStartTimestampMonotonic=2990308201
ExecMainExitTimestampMonotonic=0
ExecMainPID=8895
ExecMainCode=0
ExecMainStatus=0
ExecStart={ path=/usr/bin/dockerd ; argv[]=/usr/bin/dockerd -H fd:// ; ignore_errors=no ; start_time=[Tue 2018-09-11 14:59:19 CST] ; stop_time=[n/a] ; pid=8895 ; code=(null) ; status=0/0 }
ExecReload={ path=/bin/kill ; argv[]=/bin/kill -s HUP $MAINPID ; ignore_errors=no ; start_time=[n/a] ; stop_time=[n/a] ; pid=0 ; code=(null) ; status=0/0 }
Slice=system.slice
ControlGroup=/system.slice/docker.service
MemoryCurrent=261222400
CPUUsageNSec=1513354066478
TasksCurrent=139
Delegate=yes
CPUAccounting=no
CPUShares=18446744073709551615
StartupCPUShares=18446744073709551615
CPUQuotaPerSecUSec=infinity
BlockIOAccounting=no
BlockIOWeight=18446744073709551615
StartupBlockIOWeight=18446744073709551615
MemoryAccounting=no
MemoryLimit=18446744073709551615
DevicePolicy=auto
TasksAccounting=no
TasksMax=18446744073709551615
EnvironmentFile=/etc/default/docker (ignore_errors=no)
UMask=0022
LimitCPU=18446744073709551615
LimitCPUSoft=18446744073709551615
LimitFSIZE=18446744073709551615
LimitFSIZESoft=18446744073709551615
LimitDATA=18446744073709551615
LimitDATASoft=18446744073709551615
LimitSTACK=18446744073709551615
LimitSTACKSoft=8388608
LimitCORE=18446744073709551615
LimitCORESoft=18446744073709551615
LimitRSS=18446744073709551615
LimitRSSSoft=18446744073709551615
LimitNOFILE=1048576
LimitNOFILESoft=1048576
LimitAS=18446744073709551615
LimitASSoft=18446744073709551615
LimitNPROC=18446744073709551615
LimitNPROCSoft=18446744073709551615
LimitMEMLOCK=65536
LimitMEMLOCKSoft=65536
LimitLOCKS=18446744073709551615
LimitLOCKSSoft=18446744073709551615
LimitSIGPENDING=771371
LimitSIGPENDINGSoft=771371
LimitMSGQUEUE=819200
LimitMSGQUEUESoft=819200
LimitNICE=0
LimitNICESoft=0
LimitRTPRIO=0
LimitRTPRIOSoft=0
LimitRTTIME=18446744073709551615
LimitRTTIMESoft=18446744073709551615
OOMScoreAdjust=0
Nice=0
IOScheduling=4
CPUSchedulingPolicy=0
CPUSchedulingPriority=0
TimerSlackNSec=50000
CPUSchedulingResetOnFork=no
NonBlocking=no
StandardInput=null
StandardOutput=journal
StandardError=inherit
TTYReset=no
TTYVHangup=no
TTYVTDisallocate=no
SyslogPriority=30
SyslogLevelPrefix=yes
SyslogLevel=6
SyslogFacility=3
SecureBits=0
CapabilityBoundingSet=18446744073709551615
AmbientCapabilities=0
MountFlags=0
PrivateTmp=no
PrivateNetwork=no
PrivateDevices=no
ProtectHome=no
ProtectSystem=no
SameProcessGroup=no
UtmpMode=init
IgnoreSIGPIPE=yes
NoNewPrivileges=no
SystemCallErrorNumber=0
RuntimeDirectoryMode=0755
KillMode=process
KillSignal=15
SendSIGKILL=yes
SendSIGHUP=no
Id=docker.service
Names=docker.service
Requires=docker.socket sysinit.target system.slice
Wants=network-online.target
ConsistsOf=docker.socket
Conflicts=shutdown.target
Before=shutdown.target
After=sysinit.target systemd-journald.socket basic.target system.slice docker.socket firewalld.service network-online.target
TriggeredBy=docker.socket
Documentation=https://docs.docker.com
Description=Docker Application Container Engine
LoadState=loaded
ActiveState=active
SubState=running
FragmentPath=/lib/systemd/system/docker.service
UnitFileState=disabled
UnitFilePreset=enabled
StateChangeTimestamp=Tue 2018-09-11 14:59:20 CST
StateChangeTimestampMonotonic=2990659680
InactiveExitTimestamp=Tue 2018-09-11 14:59:19 CST
InactiveExitTimestampMonotonic=2990308261
ActiveEnterTimestamp=Tue 2018-09-11 14:59:20 CST
ActiveEnterTimestampMonotonic=2990659680
ActiveExitTimestamp=Tue 2018-09-11 14:59:18 CST
ActiveExitTimestampMonotonic=2989274413
InactiveEnterTimestamp=Tue 2018-09-11 14:59:19 CST
InactiveEnterTimestampMonotonic=2990284598
CanStart=yes
CanStop=yes
CanReload=yes
CanIsolate=no
StopWhenUnneeded=no
RefuseManualStart=no
RefuseManualStop=no
AllowIsolate=no
DefaultDependencies=yes
OnFailureJobMode=replace
IgnoreOnIsolate=no
NeedDaemonReload=no
JobTimeoutUSec=infinity
JobTimeoutAction=none
ConditionResult=yes
AssertResult=yes
ConditionTimestamp=Tue 2018-09-11 14:59:19 CST
ConditionTimestampMonotonic=2990290749
AssertTimestamp=Tue 2018-09-11 14:59:19 CST
AssertTimestampMonotonic=2990290749
Transient=no
StartLimitInterval=60000000
StartLimitBurst=3
StartLimitAction=none

No kubectl


Packages

Have dpkg

Output of "dpkg -l|egrep "(cc-oci-runtime|cc-runtime|runv|kata-proxy|kata-runtime|kata-shim|kata-containers-image|linux-container|qemu-)"":

ii  kata-containers-image               1.2.0-32                              amd64        Kata containers image
ii  kata-linux-container                4.14.51.10-135                        amd64        linux kernel optimised for container-like workloads.
ii  kata-proxy                          1.2.0+git.1796218-32                  amd64
ii  kata-runtime                        1.2.0+git.0bcb32f-47                  amd64
ii  kata-shim                           1.2.0+git.0a37760-33                  amd64
ii  qemu-lite                           2.11.0+git.a39e0b3e82-48              amd64        linux kernel optimised for container-like workloads.
ii  qemu-vanilla                        2.11.2+git.0982a56a55-46              amd64        linux kernel optimised for container-like workloads.

No rpm

grahamwhaley commented 6 years ago

Hi @bekars. I think this is something @egernst looked into very recently. The interesting conclusion was that overlay2 affects the nginx test (which is, after all, a network test) because the file served by nginx (index.html?) is stored on the overlaid container filesystem. With overlay2 that filesystem is backed by 9pfs, so index.html is fetched over 9pfs for every request, and 9pfs does not cache by default. With runc or devicemapper the file is cached in the page cache, so that overhead goes away.

I'm not sure there is an easy fix for 9pfs itself. But to make the nginx test neutral with respect to the backing filesystem, maybe we can set up the nginx container so that index.html is moved into a tmpfs/ramfs?
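
One way to check this from inside a container is to time repeated open/read/close cycles on the same small file in different directories. The sketch below is only an illustration: the directory names are assumptions (`/tmp` is assumed to sit on the container rootfs and `/dev/shm` is assumed to be a tmpfs), and the loop only roughly imitates what a web server does per request.

```python
import os
import tempfile
import time


def reads_per_second(path, iterations=2000):
    """Open, read and close `path` repeatedly and return the achieved rate."""
    start = time.perf_counter()
    for _ in range(iterations):
        with open(path, "rb") as f:
            f.read()
    elapsed = time.perf_counter() - start
    return iterations / elapsed


def compare(directories, payload=b"<html>hello</html>\n"):
    """Write the same small file into each directory and time repeated reads."""
    results = {}
    for directory in directories:
        fd, path = tempfile.mkstemp(dir=directory, suffix=".html")
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(payload)
            results[directory] = reads_per_second(path)
        finally:
            os.remove(path)
    return results


if __name__ == "__main__":
    # Hypothetical candidates: adjust to a rootfs path and a tmpfs path
    # that actually exist in the container under test.
    candidates = ["/tmp", "/dev/shm"]
    usable = [d for d in candidates if os.path.isdir(d) and os.access(d, os.W_OK)]
    for directory, rate in compare(usable).items():
        print(f"{directory}: {rate:.0f} reads/s")
```

If the analysis in this comment holds, the rootfs path should show a much lower rate than the tmpfs path under kata with overlay2, and a much smaller difference under runc.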

I think @egernst may have had some other info around configuring nginx and ab (or alternatively maybe hey) to better measure performance under Kata. I'll leave that for @egernst now...

/cc @amshinde as well I think.

egernst commented 6 years ago

Hey @bekars - as @grahamwhaley summarized, index.html lives in the rootfs itself and isn't cached per se. So each time nginx serves it, it accesses it through the file system. If the rootfs is backed by virtio-scsi (as is the case when you use direct-lvm), there is limited overhead. If it is backed by 9pfs (as is the case with overlay), this introduces considerable overhead and much worse performance. None of this has to do with the network, as the bottleneck ends up being 9p.
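
The relationship described here can be written down as a small decision function. This is a simplified model for illustration, not the runtime's actual code: the function name is hypothetical, only the two storage drivers discussed in this issue are covered (direct-lvm being a mode of Docker's devicemapper driver), and the `disable_block_device_use` behaviour is taken from the comment in the configuration file quoted earlier.

```python
def rootfs_transport(storage_driver, disable_block_device_use=False):
    """Return how the container rootfs is assumed to reach the guest.

    Simplified model: a block-backed rootfs is passed to the hypervisor as a
    block device unless disable_block_device_use is set; everything else is
    shared over 9pfs.
    """
    # Assumption: only the drivers discussed in this issue are modelled.
    block_backed = {"devicemapper": True, "overlay2": False}
    if storage_driver not in block_backed:
        raise ValueError(f"storage driver not modelled: {storage_driver}")
    if block_backed[storage_driver] and not disable_block_device_use:
        return "block device (virtio-scsi or virtio-blk)"
    return "9pfs"


for driver in ("overlay2", "devicemapper"):
    print(f"{driver}: {rootfs_transport(driver)}")
```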

I provided some of this feedback on threads to Baidu in the past -- does the analysis above make sense? I think this is ultimately a duplicate of issue #551

See the following issue for more details: https://github.com/kata-containers/runtime/issues/551#issuecomment-411121324

bekars commented 6 years ago

Thanks @egernst. overlay2 is the default storage driver on Ubuntu. Is there any plan to improve 9p performance here, for example by using one of the cache modes?

https://www.kernel.org/doc/Documentation/filesystems/9p.txt

  cache=mode    specifies a caching policy.  By default, no caches are used.
                        none = default no cache policy, metadata and data
                                alike are synchronous.
                        loose = no attempts are made at consistency,
                                intended for exclusive, read-only mounts
                        fscache = use FS-Cache for a persistent, read-only
                                cache backend.
                        mmap = minimal cache that is only used for read-write
                                mmap.  Nothing else is cached, like cache=none
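
To see which cache mode a 9p mount is actually using inside the guest, one option is to read the mount options from `/proc/mounts`. The sketch below assumes that a `cache=` entry appears in the options string when a cache mode was set, and falls back to `none` otherwise, matching the default in the documentation quoted above. The function name is hypothetical.

```python
def ninep_cache_modes(mounts_text):
    """Map each 9p mount point in /proc/mounts-style text to its cache mode."""
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "9p":
            continue
        mode = "none"  # assumed default when no cache= option is listed
        for option in fields[3].split(","):
            if option.startswith("cache="):
                mode = option.split("=", 1)[1]
        modes[fields[1]] = mode
    return modes


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as f:
            print(ninep_cache_modes(f.read()))
    except FileNotFoundError:
        print("no /proc/mounts on this system")
```

An empty result means no 9p mounts were found, which would be expected when the rootfs is passed in as a block device.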

zhiminghufighting commented 5 years ago

@clarklee92 We discussed this issue on webchat. The community is currently considering using virtio-fs to replace 9p for sharing files between host and guest. It is under evaluation now.