robur-coop / miragevpn

An opinionated implementation of the OpenVPN protocol
BSD 2-Clause "Simplified" License

Download speeds vs OpenVPN tracking issue #206

Closed jwijenbergh closed 6 months ago

jwijenbergh commented 8 months ago

Setup

OS: Linux, NixOS unstable (24.05), Kernel: 6.1.72

MirageVPN version:

master commit 22d0104b83a716192f4d9737bf3a0615fec5b384

Dune version: 3.12.1

ocaml version: The OCaml toplevel, version 4.14.1

Note: I will try to build with ocaml 5

steps to build:

eval (opam env)
dune build --release

run:

sudo _build/install/default/bin/miragevpn-client-lwt ~/Downloads/tuxed.ovpn

ensure the routes exist (these routes are added by the openvpn cli for me):

sudo ip route add 0.0.0.0/1 via gatewayipvpn dev tun0
sudo ip route add 128.0.0.0/1 via gatewayipvpn dev tun0
sudo ip route add default via gatewayipvpn dev tun0
sudo ip route add 145.220.52.76 via defaultgateway dev eth0

openvpn cli:

OpenVPN 2.6.8 x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [MH/PKTINFO] [AEAD] [DCO]
library versions: OpenSSL 3.0.12 24 Oct 2023, LZO 2.10
DCO version: N/A
Originally developed by James Yonan
Copyright (C) 2002-2023 OpenVPN Inc <sales@openvpn.net>
Compile time defines: enable_async_push=no enable_comp_stub=no enable_crypto_ofb_cfb=yes enable_dco=auto enable_dco_arg=auto enable_debug=yes enable_dependency_tracking=no enable_dlopen=unknown enable_dlopen_self=unknown enable_dlopen_self_static=unknown enable_fast_install=yes enable_fragment=yes enable_iproute2=no enable_libtool_lock=yes enable_lz4=yes enable_lzo=yes enable_management=yes enable_pam_dlopen=no enable_pedantic=no enable_pkcs11=no enable_plugin_auth_pam=yes enable_plugin_down_root=yes enable_plugins=yes enable_port_share=yes enable_selinux=no enable_shared=yes enable_shared_with_static_runtimes=no enable_small=no enable_static=no enable_strict=no enable_strict_options=no enable_systemd=yes enable_werror=no enable_win32_dll=yes enable_wolfssl_options_h=yes enable_x509_alt_username=no with_aix_soname=aix with_crypto_library=openssl with_gnu_ld=yes with_mem_check=no with_openssl_engine=auto with_sysroot=no

tests

Take these with a grain of salt of course (these benchmarks are not very scientific), but I do not expect such a large difference. I ran this multiple times and got similar results.

no VPN

image

VPN

server: vpn.tuxed.net default profile

connection type: wired

udp tests:

miragevpn:

image

openvpn:

image

tcp tests:

miragevpn:

image

openvpn:

image

quick tests with wireless gave me similar results for miragevpn and 50 for openvpn, which seems to indicate that there is a bottleneck in miragevpn somewhere (?)

hannesm commented 8 months ago

Thanks for your report. Would you mind checking briefly whether it makes a difference if you specify "data-ciphers AES-256-GCM" explicitly? It may be that MirageVPN uses chacha20-poly1305 (in a non-accelerated version), while OpenVPN uses AES-GCM.

You can also check the server log output to see which cipher(s) are used.

hannesm commented 8 months ago

> Note: I will try to build with ocaml 5

My expectation is that it will be slightly slower, and not faster.

jwijenbergh commented 8 months ago

> Thanks for your report. Would you mind checking briefly whether it makes a difference if you specify "data-ciphers AES-256-GCM" explicitly? It may be that MirageVPN uses chacha20-poly1305 (in a non-accelerated version), while OpenVPN uses AES-GCM.

Unfortunately, this doesn't make a difference for me

jwijenbergh commented 8 months ago

is there a way to obtain a flamegraph maybe?

hannesm commented 8 months ago

> is there a way to obtain a flamegraph maybe?

I get you're on macOS, any common profiling utility (Instruments?) should work. Would be great if you could post a screenshot if you manage to gather profiling information. On Linux, perf should work just fine.

jwijenbergh commented 8 months ago

I am using Linux and perf just segfaults for me. I am trying landmarks (https://tsong.co/blog/profiling-ocaml-quick-dirty/), but that only generates a report after the binary exits, and ctrl+c doesn't work. Is there any way to gracefully close miragevpn after e.g. x seconds?

hannesm commented 8 months ago

> I am using Linux and perf just segfaults for me. I am trying landmarks (https://tsong.co/blog/profiling-ocaml-quick-dirty/), but that only generates a report after the binary exits, and ctrl+c doesn't work. Is there any way to gracefully close miragevpn after e.g. x seconds?

Thanks for trying. You can apply the following patch to exit after 10 seconds (or anything else if you modify the 10.):

diff --git a/app/miragevpn_client_lwt.ml b/app/miragevpn_client_lwt.ml
index fc249f9..8b399b2 100644
--- a/app/miragevpn_client_lwt.ml
+++ b/app/miragevpn_client_lwt.ml
@@ -391,7 +391,9 @@ let jump _ filename pkcs12 =
   Lwt_main.run
     (parse_config filename >>= function
      | Error (`Msg s) -> failwith ("config parser: " ^ s)
-     | Ok config -> establish_tunnel config pkcs12)
+     | Ok config ->
+       Lwt.pick [ (Lwt_unix.sleep 10. >|= fun () -> Ok ()) ;
+                  establish_tunnel config pkcs12 ])

 let reporter_with_ts ~dst () =
   let pp_tags f tags =

palainp commented 8 months ago

Hi @jwijenbergh, just in case this is similar to an issue I had with another unikernel, would you mind trying the usual openvpn client with TCP Segmentation Offload deactivated (sudo ethtool -K eth0 tso off)?

jwijenbergh commented 8 months ago

Hi, sorry for taking so long, I was busy with work.

> > I am using Linux and perf just segfaults for me. I am trying landmarks (https://tsong.co/blog/profiling-ocaml-quick-dirty/), but that only generates a report after the binary exits, and ctrl+c doesn't work. Is there any way to gracefully close miragevpn after e.g. x seconds?
>
> Thanks for trying. You can apply the following patch to exit after 10 seconds (or anything else if you modify the 10.):
>
> diff --git a/app/miragevpn_client_lwt.ml b/app/miragevpn_client_lwt.ml
> index fc249f9..8b399b2 100644
> --- a/app/miragevpn_client_lwt.ml
> +++ b/app/miragevpn_client_lwt.ml
> @@ -391,7 +391,9 @@ let jump _ filename pkcs12 =
>    Lwt_main.run
>      (parse_config filename >>= function
>       | Error (`Msg s) -> failwith ("config parser: " ^ s)
> -     | Ok config -> establish_tunnel config pkcs12)
> +     | Ok config ->
> +       Lwt.pick [ (Lwt_unix.sleep 10. >|= fun () -> Ok ()) ;
> +                  establish_tunnel config pkcs12 ])
>
>  let reporter_with_ts ~dst () =
>    let pp_tags f tags =

Thanks, that worked. Not sure how useful this profile is though:

miragevpnprofile.json

Upload the file at: https://lexifi.github.io/landmarks/viewer.html

> Hi @jwijenbergh, just in case this is similar to an issue I had with another unikernel, would you mind trying the usual openvpn client with TCP Segmentation Offload deactivated (sudo ethtool -K eth0 tso off)?

Thanks for the suggestion. Unfortunately, that gives the same speeds with the official openvpn CLI. I think the setting was already 'off', as turning it 'on' again doesn't work.

palainp commented 8 months ago

Thanks for the interesting landmarks file. Sadly, there is no data below the miragevpn code, and it would be useful to have some insight into the mirage-crypto part (as it seems that most of the time is spent there, which makes sense :smiley: ). I'm not sure whether there is a better or easier path, but if you don't mind, you can probably do something like:

git clone https://github.com/mirage/mirage-crypto.git
cd mirage-crypto
# edit src/dune to add the landmarks reference, as in your quick & dirty link
opam pin mirage-crypto . && opam upgrade mirage-crypto
cd ../miragevpn
dune clean && dune build --release

Unfortunately I don't have an openvpn server to get the .json file, but if you manage to have more details on the crypto part it'd be interesting. After that test you can freely opam unpin mirage-crypto to avoid issues with opam updates.

hannesm commented 8 months ago

Thanks for the output @jwijenbergh.

FWIW @palainp here we're testing the lwt+unix application (not the standalone unikernel), thus the OCaml TCP stack is not used. EDIT: this means the TCP checksum offloading is expected to not have any effect.

jwijenbergh commented 7 months ago

Note: my distro updated to ocaml 5 but the performance numbers are still low, maybe slightly better.

I managed to get perf to work, see the following flamegraph (again, not sure how useful it is):

perf-flamegraph

I will follow up later with the updated landmark one

hannesm commented 7 months ago

@jwijenbergh great to see that perf works for you. The flamegraph shows hexdump_pp, and I'm curious whether you're at the latest commit of this repository (since https://github.com/robur-coop/miragevpn/commit/10387eb9d97b5960c33d698346e4f18ef554e6cf should avoid that hexdump being present)?

jwijenbergh commented 7 months ago

> @jwijenbergh great to see that perf works for you. The flamegraph shows hexdump_pp, and I'm curious whether you're at the latest commit of this repository (since 10387eb should avoid that hexdump being present)?

Ah, apologies, I am missing only that commit. I will re-test when I get back to the same setup on Monday.

hannesm commented 7 months ago

I looked a bit deeper into the code and found that our AES-GCM implementation (using AES-NI CPU instructions) is ~10x slower than OpenSSL. I could get it to "only" 3x slower [by using a different kind of byte array, which may move (due to GC) but doesn't need to be externally (malloc) allocated]. I put that endeavor aside for now, since (a) getting it upstreamed will break quite some API [I'm not concerned about the API breaking, more about the work required to push that change through], and more importantly (b) how do we recover the last factor (of 3)?
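To illustrate the byte-array distinction mentioned above, here is a minimal sketch using only the OCaml standard library. It assumes the two kinds of buffer are a Bigarray (payload allocated outside the OCaml heap) and Bytes (allocated on the OCaml heap, so the GC may move it); the thread does not name the exact types, and the names heap_block, external_block, xor_bytes and xor_bigarray are made up for this example.

```ocaml
(* Heap-allocated block: managed by the GC, which may move it. *)
let heap_block : Bytes.t = Bytes.make 16 '\x00'

(* Externally allocated block: the payload lives outside the OCaml heap
   and stays at a fixed address. [create] does not initialise it. *)
let external_block =
  Bigarray.Array1.create Bigarray.char Bigarray.c_layout 16

let () = Bigarray.Array1.fill external_block '\x00'

(* XOR [src] into [dst] in place, Bytes version. *)
let xor_bytes ~src ~dst =
  for i = 0 to Bytes.length dst - 1 do
    let x = Char.code (Bytes.get src i) lxor Char.code (Bytes.get dst i) in
    Bytes.set dst i (Char.chr x)
  done

(* The same operation on Bigarray-backed buffers. *)
let xor_bigarray ~src ~dst =
  for i = 0 to Bigarray.Array1.dim dst - 1 do
    let x = Char.code src.{i} lxor Char.code dst.{i} in
    dst.{i} <- Char.chr x
  done
```

Both versions compute the same result; the difference is only where the buffer lives and how it is allocated, which is the kind of change that would ripple through a library's API.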

I installed Ubuntu on a spare machine to gather some statistics with perf. I wonder whether I should dive deeper into assembly to figure out what the bottleneck is. I do not quite understand the flamegraph I generate, since there's (similar to the space in yours on the left above __printf_buffer) quite a large space without any symbol -- but the function below doesn't do any hard work (sets up something and calls into C)... I guess understanding that gap is required to move forward.

hannesm commented 7 months ago

Further investigation shows that computing the hash (GHASH) slows it down significantly. Also, removing the functors and putting the code at top level makes AES-GCM faster (with 16-byte blocks).

reynir commented 6 months ago

We have spent some time optimizing MirageVPN (there are still a few things to do), and I have observed great improvements. Things done include reducing the number of allocations and the amount of copying done in MirageVPN, as well as computing HMACs directly on the packet buffer instead of reconstructing the relevant parts of the packet in a separate buffer when encoding packets (the same change for decoding packets is still TODO). In the end, what improved performance the most was delaying error message formatting, similar to the hexdump_pp issue earlier.

The pad linked below has some data I collected on a host with (mostly idle) dedicated hardware. I set up a virtual machine that runs an OpenVPN server and an iperf3 server. Then, on the host, I connected to the VM's VPN with OpenVPN and various MirageVPN versions and ran iperf3 tests against the VM through the VPN.
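As a rough illustration of why delaying error message formatting can matter, the sketch below counts how often an expensive formatter runs when the message is built eagerly versus lazily. This is not the MirageVPN code: hexdump, valid, check_eager and check_delayed are hypothetical names, and the validity check is a stand-in.

```ocaml
(* Counts how often the (expensive) formatter actually runs. *)
let hexdump_calls = ref 0

let hexdump (buf : string) : string =
  incr hexdump_calls;
  String.concat " "
    (List.init (String.length buf) (fun i ->
         Printf.sprintf "%02x" (Char.code buf.[i])))

(* Stand-in check: a packet is "valid" if it has at least 4 bytes. *)
let valid pkt = String.length pkt >= 4

(* Eager: the message is formatted for every packet, even valid ones. *)
let check_eager pkt =
  let msg = "short packet: " ^ hexdump pkt in
  if valid pkt then Ok () else Error msg

(* Delayed: formatting only happens if someone forces the message. *)
let check_delayed pkt =
  if valid pkt then Ok ()
  else Error (lazy ("short packet: " ^ hexdump pkt))

let () =
  let good = "\x01\x02\x03\x04" in
  for _i = 1 to 1000 do ignore (check_eager good) done;
  Printf.printf "eager:   %d hexdumps\n" !hexdump_calls;
  hexdump_calls := 0;
  for _i = 1 to 1000 do ignore (check_delayed good) done;
  Printf.printf "delayed: %d hexdumps\n" !hexdump_calls
```

On the hot path (valid packets) the eager variant formats 1000 messages that are thrown away, while the delayed variant formats none.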

Key numbers are:

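| Direction | 22d0104 (originally tested) | 45ccfae (main) | OpenVPN  |
|-----------|-----------------------------|----------------|----------|
| Upload    | 211 Mb/s                    | 378 Mb/s       | 622 Mb/s |
| Download  | 27.9 Mb/s                   | 535 Mb/s       | 646 Mb/s |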

This means the upload speed of 22d0104 is ~33% of that of OpenVPN and the download speed is <5% that of OpenVPN. With the recent changes in 45ccfae the upload speed is now ~60% of that of OpenVPN and the download speed is now ~80% of that of OpenVPN.
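For anyone who wants to re-derive those percentages, here is a small sketch that computes them from the key numbers above. The helper name percent_of is made up for this example; the throughput values are copied from the comment.

```ocaml
(* Throughput in Mb/s, copied from the key numbers above. *)
let openvpn_up, openvpn_down = (622., 646.)
let old_up, old_down = (211., 27.9) (* 22d0104 *)
let new_up, new_down = (378., 535.) (* 45ccfae *)

(* [x] expressed as a percentage of [baseline]. *)
let percent_of ~baseline x = 100. *. x /. baseline

let () =
  Printf.printf "22d0104 upload:   %.1f%% of OpenVPN\n"
    (percent_of ~baseline:openvpn_up old_up);
  Printf.printf "22d0104 download: %.1f%% of OpenVPN\n"
    (percent_of ~baseline:openvpn_down old_down);
  Printf.printf "45ccfae upload:   %.1f%% of OpenVPN\n"
    (percent_of ~baseline:openvpn_up new_up);
  Printf.printf "45ccfae download: %.1f%% of OpenVPN\n"
    (percent_of ~baseline:openvpn_down new_down)
```

This prints 33.9%, 4.3%, 60.8% and 82.8% respectively, in line with the rounded figures quoted in the comment.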

https://pad.data.coop/HKf888h5Rm-oxvr0lj_tJw?view#OpenVPN-running-in-a-VM-on-the-same-machine

@jwijenbergh if you could please test again against main and report whether the numbers are close to OpenVPN or not that would be great :-)

jwijenbergh commented 6 months ago

Sorry for my radio silence on this. I will be at the exact same setup again next week, so I will test the latest commits of everything when I get the chance.

jwijenbergh commented 6 months ago

Some quick tests at commit 45ccfaece30421a80b8555ca2eab5ee38a7ba897:

UDP:

OpenVPN: image

MirageVPN: image

TCP:

OpenVPN: image

MirageVPN: image

This is a HUGE improvement compared to the initial test I did. Great work! I feel like we should close this, as it's within reasonable margins and my way of testing is not very scientific, to say the least. Feel free to re-open to track this further.