Open Toshbrown opened 5 years ago
The ARM build uses a base ocaml image across all the ocaml components which is large but this should not change often so could be included into the linux image to help reduce the pull time on this.
It is not pulling the build image, just the final build artefact. Which are bigger than the x86 ones by as much as 3x.
your using resin/aarch64-alpine:3.5 as the base which is 30Mb compared to the 4Mb of arm64v8/alpine:3.5
That change alone will save a significant amount of time!!
You could release an image with the core components already pulled but then you would need to keep it up-to-date, for each release.
I converted the container manager over to use ECC rather than RSA but it looks like the core-network only supports RSA 👎 it uses this (https://github.com/mirleft/ocaml-tls/blob/master/lwt/x509_lwt.ml#L49) to parse the certs
The error I'm getting is this:
2018-11-09 12:54:40 +00:00: WRN [service] while initializing tls: (Failure
"Private key (/run/secrets/DATABOX_NETWORK.pem): X509: failed to parse private key"), roll with http
I also think it may be a wider issue with OCaml support for ECC https://github.com/mirleft/ocaml-tls/issues/362 but this issue is old so things could have moved on.
It is not pulling the build image, just the final build artefact. Which are bigger than the x86 ones by as much as 3x.
your using resin/aarch64-alpine:3.5 as the base which is 30Mb compared to the 4Mb of arm64v8/alpine:3.5
That change alone will save a significant amount of time!!
You could release an image with the core components already pulled but then you would need to keep it up-to-date, for each release.
Yes, I have noticed this also but not sure the reason behind the larger images on ARM. We should probably look at building our own longer term if we have suitable build machines.
Also, I would like to look at simplifying the builds by getting rid of the dependency on older ocaml packages required by ezirmin as currently this is PITA to support across a base ocaml image.
I converted the container manager over to use ECC rather than RSA but it looks like the core-network only supports RSA 👎 it uses this (https://github.com/mirleft/ocaml-tls/blob/master/lwt/x509_lwt.ml#L49) to parse the certs
The error I'm getting is this:
2018-11-09 12:54:40 +00:00: WRN [service] while initializing tls: (Failure "Private key (/run/secrets/DATABOX_NETWORK.pem): X509: failed to parse private key"), roll with http
I also think it may be a wider issue with OCaml support for ECC mirleft/ocaml-tls#362 but this issue is old so things could have moved on.
If we decide to continue with the current design of having a dedicated Export Service it might be worth moving it to Zest which does not have this issue.
Comparison of databox startup time and some benchmarks. The amd64 platform was docker for mac on a MBP (2.7 GHz Intel Core i7) vs arm64v8 a raspberry Pi model 3b+, using @jptmoore 's Alpine image with swap and havaged installed. These are not scientific but I hope it will show us where to focus our efforts to improve performance on Arm64v8.
TLDR: First install of databox on a Pi os 6x slower than on docker for mac! subsequent restarts are 3x slower.
1) We should move to EllipticP256 to improve https certificate generation time 2) Image size and layer count are very important on Pi's all efforts should be made to reduce the size of arm images.
0.5.1 startup time fresh install
test setup:
amd64
Click to see command output
``` [INFO]2018/11/05 10:41:38 Starting Databox 0.5.1 [INFO]2018/11/05 10:41:38 Pulling Image databoxsystems/container-manager-amd64:0.5.1 [INFO]2018/11/05 10:41:41 Done pulling Image databoxsystems/container-manager-amd64:0.5.1 [INFO]2018/11/05 10:41:41 Outputting logs container-manager : [INFO]2018/11/05 10:41:43 GenRootCA called container-manager : [INFO]2018/11/05 10:41:44 Making cert ./certs/databox-network.pem container-manager : [INFO]2018/11/05 10:41:44 Making cert ./certs/export-service.pem container-manager : [INFO]2018/11/05 10:41:44 Making cert ./certs/arbiter.pem container-manager : [INFO]2018/11/05 10:41:44 Making cert ./certs/app-server.pem container-manager : [INFO]2018/11/05 10:41:44 ContainerManager Started container-manager : [INFO]2018/11/05 10:41:44 Starting databox-network container-manager : [INFO]2018/11/05 10:41:44 Pulling Image databoxsystems/core-network-amd64:0.5.1 container-manager : [INFO]2018/11/05 10:41:49 Done pulling Image databoxsystems/core-network-amd64:0.5.1 container-manager : [INFO]2018/11/05 10:41:51 [startCoreNetwork] done starting databox-network container-manager : [INFO]2018/11/05 10:41:51 Pulling Image databoxsystems/core-network-relay-amd64:0.5.1 container-manager : [INFO]2018/11/05 10:41:54 Done pulling Image databoxsystems/core-network-relay-amd64:0.5.1 container-manager : [INFO]2018/11/05 10:41:55 [updateContainerManager] called container-manager : [INFO]2018/11/05 10:41:55 [updateContainerManager] updating dns server container-manager : [INFO]2018/11/05 10:41:55 Restarting the Container Manager container-manager : [INFO]2018/11/05 10:41:59 ContainerManager Started container-manager : [INFO]2018/11/05 10:41:59 [updateContainerManager] called container-manager : [INFO]2018/11/05 10:41:59 [updateContainerManager] updating dns server container-manager : [INFO]2018/11/05 10:41:59 Pulling Image databoxsystems/arbiter-amd64:0.5.1 container-manager : [INFO]2018/11/05 10:42:03 Done pulling Image databoxsystems/arbiter-amd64:0.5.1 container-manager : [INFO]2018/11/05 10:42:12 Pulling Image databoxsystems/export-service-amd64:0.5.1 container-manager : [INFO]2018/11/05 10:42:12 Pulling Image databoxsystems/core-store-amd64:0.5.1 container-manager : [INFO]2018/11/05 10:42:15 Done pulling Image databoxsystems/export-service-amd64:0.5.1 container-manager : [INFO]2018/11/05 10:42:16 Done pulling Image databoxsystems/core-store-amd64:0.5.1 container-manager : [INFO]2018/11/05 10:42:22 Password=ihNO1B9QR1v9s2ojoDxvU48DdpXFs/Mc container-manager : [INFO]2018/11/05 10:42:22 Container Manager Ready and waiting container-manager : [INFO]2018/11/05 10:42:22 Restarting saved apps and drivers container-manager : [INFO]2018/11/05 10:42:22 Installing app-store container-manager : [INFO]2018/11/05 10:42:22 API: registering ServiceStatus container-manager : [INFO]2018/11/05 10:42:22 Installing core-ui container-manager : [INFO]2018/11/05 10:42:29 Pulling Image databoxsystems/driver-app-store-amd64:0.5.1 container-manager : [INFO]2018/11/05 10:42:30 Pulling Image databoxsystems/core-ui-amd64:0.5.1 container-manager : [INFO]2018/11/05 10:42:32 Done pulling Image databoxsystems/driver-app-store-amd64:0.5.1 container-manager : [INFO]2018/11/05 10:42:32 Done pulling Image databoxsystems/core-ui-amd64:0.5.1 container-manager : [INFO]2018/11/05 10:42:33 Successfully installed core-ui container-manager : [INFO]2018/11/05 10:42:39 Successfully installed app-store ```Total time 56 seconds
arm64
Click to see command output
``` container-manager : [INFO]2018/11/05 10:49:15 GenRootCA called container-manager : [INFO]2018/11/05 10:49:59 Making cert ./certs/databox-network.pem container-manager : [INFO]2018/11/05 10:50:13 Making cert ./certs/export-service.pem container-manager : [INFO]2018/11/05 10:50:37 Making cert ./certs/arbiter.pem container-manager : [INFO]2018/11/05 10:51:37 Making cert ./certs/app-server.pem container-manager : [INFO]2018/11/05 10:52:02 ContainerManager Started container-manager : [INFO]2018/11/05 10:52:02 Starting databox-network container-manager : [INFO]2018/11/05 10:52:08 [startCoreNetwork] done starting databox-network container-manager : [INFO]2018/11/05 10:52:11 [updateContainerManager] called container-manager : [INFO]2018/11/05 10:52:11 [updateContainerManager] updating dns server container-manager : [INFO]2018/11/05 10:52:11 Restarting the Container Manager container-manager : [INFO]2018/11/05 10:52:18 ContainerManager Started container-manager : [INFO]2018/11/05 10:52:18 [updateContainerManager] called container-manager : [INFO]2018/11/05 10:52:18 [updateContainerManager] updating dns server container-manager : [INFO]2018/11/05 10:52:43 Pulling Image databoxsystems/export-service-arm64v8:0.5.1 container-manager : [INFO]2018/11/05 10:53:06 Done pulling Image databoxsystems/export-service-arm64v8:0.5.1 container-manager : [INFO]2018/11/05 10:53:08 Pulling Image databoxsystems/core-store-arm64v8:0.5.1 container-manager : [INFO]2018/11/05 10:53:50 Done pulling Image databoxsystems/core-store-arm64v8:0.5.1 container-manager : [INFO]2018/11/05 10:54:01 Password=4pAfzD57q3Uzzy0nGzBfrEz01JGGzjc4 container-manager : [INFO]2018/11/05 10:54:01 API: registering ServiceStatus container-manager : [INFO]2018/11/05 10:54:01 Installing core-ui container-manager : [INFO]2018/11/05 10:54:01 Container Manager Ready and waiting container-manager : [INFO]2018/11/05 10:54:01 Restarting saved apps and drivers container-manager : [INFO]2018/11/05 10:54:01 Installing app-store container-manager : [INFO]2018/11/05 10:54:09 Pulling Image databoxsystems/driver-app-store-arm64v8:0.5.1 container-manager : [INFO]2018/11/05 10:54:12 Pulling Image databoxsystems/core-ui-arm64v8:0.5.1 container-manager : [INFO]2018/11/05 10:54:22 Done pulling Image databoxsystems/core-ui-arm64v8:0.5.1 container-manager : [INFO]2018/11/05 10:54:24 Done pulling Image databoxsystems/driver-app-store-arm64v8:0.5.1 container-manager : [INFO]2018/11/05 10:54:57 Successfully installed core-ui container-manager : [INFO]2018/11/05 10:55:28 Successfully installed app-store ```Total time 373 seconds (arbiter and core network were pre-pulled to avoid race condition)
Creating the 4 HTTPS certs for the core components takes 83 seconds on arm64v8 vs under 1 second on amd64. HTTPS certs are also generated for each driver and app installed so improving this would be a big win.
0.5.1 startup time after first setup
With driver-os-monitor and app-os-monitor started at startup.
databox started once to populate images how long does it take
amd64
Click to see command output
``` container-manager : [INFO]2018/11/05 10:03:52 ContainerManager Started container-manager : [INFO]2018/11/05 10:03:52 Starting databox-network container-manager : [INFO]2018/11/05 10:03:55 [startCoreNetwork] done starting databox-network container-manager : [INFO]2018/11/05 10:03:55 [updateContainerManager] called container-manager : [INFO]2018/11/05 10:03:55 [updateContainerManager] updating dns server container-manager : [INFO]2018/11/05 10:03:55 Restarting the Container Manager container-manager : [INFO]2018/11/05 10:04:00 ContainerManager Started container-manager : [INFO]2018/11/05 10:04:00 [updateContainerManager] called container-manager : [INFO]2018/11/05 10:04:00 [updateContainerManager] updating dns server container-manager : [INFO]2018/11/05 10:04:15 Password=6k+U0A4xebuQaiQPu1oiU8iT60rpJ1jd container-manager : [INFO]2018/11/05 10:04:15 Container Manager Ready and waiting container-manager : [INFO]2018/11/05 10:04:15 Restarting saved apps and drivers container-manager : [INFO]2018/11/05 10:04:15 Installing app-store container-manager : [INFO]2018/11/05 10:04:15 API: registering ServiceStatus container-manager : [INFO]2018/11/05 10:04:15 Installing core-ui container-manager : [INFO]2018/11/05 10:04:15 Installing driver-twitter container-manager : [INFO]2018/11/05 10:04:15 Installing driver-os-monitor container-manager : [INFO]2018/11/05 10:04:22 Successfully installed core-ui container-manager : [INFO]2018/11/05 10:04:29 Successfully installed app-store container-manager : [INFO]2018/11/05 10:04:30 Successfully installed driver-os-monitor container-manager : [INFO]2018/11/05 10:04:31 Successfully installed driver-twitter container-manager : [INFO]2018/11/05 10:04:31 Installing app-os-monitor container-manager : [INFO]2018/11/05 10:04:38 Successfully installed app-os-monitor ```Total time 46 seconds
arm64v8
Click to see command output
``` container-manager : [INFO]2018/11/05 10:03:49 ContainerManager Started container-manager : [INFO]2018/11/05 10:03:49 Starting databox-network container-manager : [INFO]2018/11/05 10:03:56 [startCoreNetwork] done starting databox-network container-manager : [INFO]2018/11/05 10:04:01 [updateContainerManager] called container-manager : [INFO]2018/11/05 10:04:01 [updateContainerManager] updating dns server container-manager : [INFO]2018/11/05 10:04:01 Restarting the Container Manager container-manager : [INFO]2018/11/05 10:04:08 ContainerManager Started container-manager : [INFO]2018/11/05 10:04:08 [updateContainerManager] called container-manager : [INFO]2018/11/05 10:04:08 [updateContainerManager] updating dns server container-manager : [INFO]2018/11/05 10:04:36 Password=3ajK8qiLkLAyRa1UBMruwHG5KUB1BdAM container-manager : [INFO]2018/11/05 10:04:36 Container Manager Ready and waiting container-manager : [INFO]2018/11/05 10:04:36 Restarting saved apps and drivers container-manager : [INFO]2018/11/05 10:04:36 Installing app-store container-manager : [INFO]2018/11/05 10:04:36 Installing core-ui container-manager : [INFO]2018/11/05 10:04:36 API: registering ServiceStatus container-manager : [INFO]2018/11/05 10:04:36 Installing driver-twitter container-manager : [INFO]2018/11/05 10:04:36 Installing driver-os-monitor container-manager : [INFO]2018/11/05 10:05:07 Successfully installed core-ui container-manager : [INFO]2018/11/05 10:05:51 Successfully installed driver-twitter container-manager : [INFO]2018/11/05 10:05:55 Successfully installed driver-os-monitor container-manager : [INFO]2018/11/05 10:05:55 Installing app-os-monitor container-manager : [INFO]2018/11/05 10:06:20 Successfully installed app-store container-manager : [INFO]2018/11/05 10:06:40 Successfully installed app-os-monitor ```Total time 171 seconds
A few benchmarks
A couple of Benchmark results (using golang benchmarking support) looking at https cert generation and a Listing and scanning images.
amd64
arm64v8
GenerateKeyRSA is the slow part of the HTTPs cert generation. This is done for every installed app and driver. Arm platforms are heavily affected by the slow performance of math/bignatmontgomery and math/bigaddMulVVW used in creating RSA keys. GenerateKeyRSA takes 0.18 seconds on amd64 and 9.42 seconds on arm64v8. Moving to EllipticP256 for HTTPS keys would offer a large time-saving.
Pulling images
Using gnu time to look at pulling image form docker hub. A more detailed analysis is tricky as most of the work so done by dockerd and not the docker command.
Images pulled:
Observations: reported download speed by the docker cmd was approximately equal on both platforms at 17 Mbs
darwin amd64
linux arm64
The image pull test on arm64v8 is much slower. These figures don't show the CPU usage of dockerd, but the docker client reports slower layer extraction throughput. If this is CPU or IO limited is difficult to observe, but it is clear that reducing image size and layer count will have a significant impact on startup and app/driver install times on arm64v8.
docker images on the amd64 gives
docker images on the arm64v8 gives
From this its clear that some of the arm64v8 images are larger than necessary. This has a negative impact on startup time due to the slow image extraction step on arm64v8. For example:
The difference between pulling the export-service-arm64v8 (53.2MB) and databoxsystems/export-service-amd64 (18.1MB) on a raspberry Pi is 39 seconds!! As far as I am aware there is no tecnical reason for the differance in image size.