Closed vitcozzolino closed 2 years ago
Same issue on a Dell OptiPlex 7050; this time I tried the network.ml unikernel. Log follows:
Parsing config from network.xl
MirageOS booting...
Initialising timer interface
Initialising console ... done.
Attempt to open(/dev/urandom)!
Unsupported function getpid called in Mini-OS kernel
Unsupported function getppid called in Mini-OS kernel
2018-01-26 11:41:27 -00:00: INF [net-xen:frontend] connect 0
2018-01-26 11:41:27 -00:00: INF [net-xen:frontend] create: id=0 domid=0
2018-01-26 11:41:27 -00:00: INF [net-xen:frontend] sg:true gso_tcpv4:true rx_copy:true rx_flip:false smart_poll:false
2018-01-26 11:41:27 -00:00: INF [net-xen:frontend] MAC: 00:16:3e:0b:ff:ed
2018-01-26 11:41:27 -00:00: INF [ethif] Connected Ethernet interface 00:16:3e:0b:ff:ed
2018-01-26 11:41:27 -00:00: INF [arpv4] Connected arpv4 device on 00:16:3e:0b:ff:ed
2018-01-26 11:41:27 -00:00: INF [udp] UDP interface connected on 131.159.24.190
2018-01-26 11:41:27 -00:00: INF [tcpip-stack-direct] stack assembled: mac=00:16:3e:0b:ff:ed,ip=131.159.24.190
Page fault at linear address 28, rip 101667, regs 00000000002af898, sp 2af940, our_sp 00000000002af860, code 0
RIP: e030:[<0000000000101667>]
RSP: e02b:00000000002af940 EFLAGS: 00010002
RAX: 0000000000101660 RBX: 000000000083b208 RCX: 0000000000002001
RDX: 000000000000001d RSI: 000000000083b150 RDI: 000000000083b098
RBP: 000000000083b528 R08: 000000000083b208 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000257668 R12: 0000000000056e00
R13: 0000000000000001 R14: 00000000002afba0 R15: 000000000083b090
base is 0x83b528 caller is 0x83b540
base is 0x83b900 caller is 0x400
base is 0xb69 Page fault in pagetable walk (access to invalid memory?).
On this machine I've tried OCaml v4.04.2 and v4.05.0. This is the output of opam list:
# Installed packages for 4.05.0:
astring 0.8.3 Alternative String module for OCaml
base v0.10.0 Full standard library replacement for OCaml
base-bigarray base Bigarray library distributed with the OCaml compiler
base-bytes base Bytes library distributed with the OCaml compiler
base-num base Num library distributed with the OCaml compiler
base-threads base Threads library distributed with the OCaml compiler
base-unix base Unix library distributed with the OCaml compiler
bos 0.2.0 Basic OS interaction for OCaml
cmdliner 1.0.2 Declarative definition of command line interfaces for OCaml
conf-m4 1 Virtual package relying on m4
conf-perl 1 Virtual package relying on perl
conf-pkg-config 1.0 Virtual package relying on pkg-config installation.
configurator v0.10.0 Helper library for gathering system configuration
cppo 1.6.0 Equivalent of the C preprocessor for OCaml programs
cstruct 3.2.1 Access C-like structures directly from OCaml
cstruct-lwt 3.2.1 Access C-like structures directly from OCaml
depext 1.0.5 Query and install external dependencies of OPAM packages
duration 0.1.1 Conversions to various time units
fmt 0.8.5 OCaml Format pretty-printer combinators
fpath 0.7.2 File system paths for OCaml
functoria 2.2.0 A DSL to organize functor applications
functoria-runtime 2.2.0 A DSL to organize functor applications
io-page 2.0.1 Allocate memory pages suitable for aligned I/O
io-page-unix 2.0.1 Allocate memory pages suitable for aligned I/O
io-page-xen 2.0.1 Allocate memory pages suitable for aligned I/O
ipaddr 2.8.0 IP (and MAC) address manipulation
jbuilder 1.0+beta16 Fast, portable and opinionated build system
logs 0.6.2 Logging infrastructure for OCaml
lwt 3.2.1 Promises, concurrency, and parallelized I/O
minios-xen 0.9 A minimal OS for running under the Xen hypervisor
mirage 3.0.8 The MirageOS library operating system
mirage-block 1.1.0 Utilities and module definitions for dealing with block devices.
mirage-block-lwt 1.1.0 Utilities and module definitions for dealing with block devices.
mirage-bootvar-xen 0.5.0 Library for reading MirageOS unikernel boot parameters in Xen
mirage-channel 3.1.0 Buffered channels for MirageOS FLOW types
mirage-channel-lwt 3.1.0 Buffered channels for MirageOS FLOW types
mirage-clock 1.3.0 Libraries and module types for portable clocks
mirage-clock-freestanding 1.3.0 Libraries and module types for portable clocks
mirage-clock-lwt 1.3.0 Libraries and module types for portable clocks
mirage-console 2.3.5 Implementations of Mirage consoles, for Unix and Xen
mirage-console-lwt 2.3.5 Implementations of Mirage consoles, for Unix and Xen
mirage-device 1.1.0 Foundational module types for devices.
mirage-flow 1.3.0 Flow implementations and combinators for MirageOS
mirage-flow-lwt 1.4.0 Flow implementations and combinators for MirageOS using Lwt
mirage-fs 1.1.1 MirageOS signatures for filesystem devices
mirage-fs-lwt 1.1.1 MirageOS signatures for filesystem devices
mirage-kv 1.1.1 MirageOS signatures for key/value devices
mirage-kv-lwt 1.1.0 MirageOS utilities for interfacing with key-value stores.
mirage-logs 0.3.0 A reporter for the Logs library that writes log messages to stderr, using a Mirage `CLOCK` to add timestamps.
mirage-net 1.1.1 Network signatures for MirageOS
mirage-net-lwt 1.1.0 MirageOS TCP/IP networking library
mirage-net-xen 1.7.1 Ethernet network device driver for MirageOS/Xen
mirage-profile 0.8.2 Collect runtime profiling information in CTF format
mirage-protocols 1.2.0 MirageOS signatures for network protocols
mirage-protocols-lwt 1.2.0 MirageOS signatures for network protocols
mirage-random 1.1.0 Random signatures for MirageOS, and an implementation using stdlib
mirage-runtime 3.0.7 A bundle of useful runtime functions for applications built with Mirage
mirage-stack 1.1.0 MirageOS signatures for network stacks
mirage-stack-lwt 1.1.0 MirageOS signatures for network stacks
mirage-time 1.1.0 Time operations for MirageOS
mirage-time-lwt 1.1.0 Time operations for MirageOS
mirage-types 3.0.7 Module type definitions for Mirage-compatible applications
mirage-types-lwt 3.0.7 Lwt module type definitions for Mirage-compatible applications
mirage-xen 3.0.4 MirageOS library for Xen compilation
mirage-xen-minios 0.9.3 Xen MiniOS guest operating system library
mirage-xen-ocaml 3.0.5 MirageOS headers for the OCaml runtime
mirage-xen-posix 3.0.4 MirageOS library for posix headers
netchannel 1.8.0 Ethernet network device driver for MirageOS/Xen
num 0 The Num library for arbitrary-precision integer and rational arithmetic
ocaml-compiler-libs v0.10.0 OCaml compiler libraries repackaged
ocaml-migrate-parsetree 1.0.7 Convert OCaml parsetrees between different versions
ocaml-src 4.05.0 Compiler sources
ocamlbuild 0.12.0 OCamlbuild is a build system with builtin rules to easily build most OCaml projects.
ocamlfind 1.7.3-1 A library manager for OCaml
ocamlgraph 1.8.8 A generic graph library for OCaml
ocplib-endian 1.0 Optimised functions to read and write int16/32/64 from strings and bigarrays, based on new primitives added in
parse-argv 0.1.0 Process strings into sets of command-line arguments
ppx_ast v0.10.0 OCaml AST used by Jane Street ppx rewriters
ppx_core v0.10.0 Standard library for ppx rewriters
ppx_cstruct 3.2.1 Access C-like structures directly from OCaml
ppx_derivers 1.0 Shared [@@deriving] plugin registry
ppx_driver v0.10.2 Feature-full driver for OCaml AST transformers
ppx_metaquot v0.10.0 Write OCaml AST fragment using OCaml syntax
ppx_optcomp v0.10.0 Optional compilation for OCaml
ppx_sexp_conv v0.10.0 Generation of S-expression conversion functions from type definitions
ppx_tools 5.0+4.05.0 Tools for authors of ppx rewriters and other syntactic tools
ppx_tools_versioned 5.1 A variant of ppx_tools based on ocaml-migrate-parsetree
ppx_traverse_builtins v0.10.0 Builtins for Ppx_traverse
ppx_type_conv v0.10.0 Support Library for type-driven code generators
ptime 0.8.3 POSIX time for OCaml
randomconv 0.1.0 Convert from random bytes to random native numbers
result 1.2 Compatibility Result module
rresult 0.5.0 Result value combinators for OCaml
sexplib v0.10.0 Library for serializing OCaml values to and from S-expressions
shared-memory-ring 3.0.0 Shared memory rings for RPC and bytestream communications.
shared-memory-ring-lwt 3.0.0 Shared memory rings for RPC and bytestream communications.
stdio v0.10.0 Standard IO library for OCaml
tcpip 3.3.1 An OCaml TCP/IP networking stack
topkg 0.9.1 The transitory OCaml software packager
uchar 0.0.2 Compatibility library for OCaml's Uchar module
xen-evtchn 2.0.0 Xen event channel bindings.
xen-gnt 3.0.1 Grant table bindings for OCaml.
xenstore 2.0.0 Xenstore protocol clients and server
Output of sudo xl info:
host : opti02
release : 4.4.0-112-generic
version : #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018
machine : x86_64
nr_cpus : 4
max_cpu_id : 3
nr_nodes : 1
cores_per_socket : 4
threads_per_core : 1
cpu_mhz : 2712
hw_caps : bfebfbff:2c100800:00000000:00007f00:77fafbff:00000000:00000121:029c6fbf
virt_caps : hvm hvm_directio
total_memory : 8076
free_memory : 34
sharing_freed_memory : 0
sharing_used_memory : 0
outstanding_claims : 0
free_cpus : 0
xen_major : 4
xen_minor : 6
xen_extra : .5
xen_version : 4.6.5
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset :
xen_commandline : placeholder no-real-mode edd=off
cc_compiler : gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
cc_compile_by : stefan.bader
cc_compile_domain : canonical.com
cc_compile_date : Fri Oct 13 15:42:52 UTC 2017
xend_config_format : 4
Might it be related to some Ubuntu Linux security feature?
I guess I have the same issue with device-usage/network. I can run a Xen VM and arping it, but a single ICMP echo request (or TCP SYN) crashes it. Output from ping is below.
mirage configure -t xen && make depend && make -j4
xl create network.xl -c
Parsing config from network.xl
MirageOS booting...
Initialising timer interface
Initialising console ... done.
Attempt to open(/dev/urandom)!
Unsupported function getpid called in Mini-OS kernel
Unsupported function getppid called in Mini-OS kernel
2018-03-07 16:08:22 -00:00: INF [net-xen:frontend] connect 0
2018-03-07 16:08:22 -00:00: INF [net-xen:frontend] create: id=0 domid=0
2018-03-07 16:08:22 -00:00: INF [net-xen:frontend] sg:true gso_tcpv4:true rx_copy:true rx_flip:false smart_poll:false
2018-03-07 16:08:22 -00:00: INF [net-xen:frontend] MAC: 00:16:3e:6f:8e:7e
2018-03-07 16:08:22 -00:00: INF [ethif] Connected Ethernet interface 00:16:3e:6f:8e:7e
2018-03-07 16:08:22 -00:00: INF [arpv4] Connected arpv4 device on 00:16:3e:6f:8e:7e
2018-03-07 16:08:22 -00:00: INF [udp] UDP interface connected on 10.0.0.2
2018-03-07 16:08:22 -00:00: INF [tcpip-stack-direct] stack assembled: mac=00:16:3e:6f:8e:7e,ip=10.0.0.2
Page fault at linear address 28, rip fcca1, regs 00000000002af988, sp 2afa30, our_sp 00000000002af950, code 0
RIP: e030:[<00000000000fcca1>]
RSP: e02b:00000000002afa30 EFLAGS: 00010006
RAX: 00000000000fcc90 RBX: 00000000008b6508 RCX: 00000000008b6528
RDX: 0000000000000000 RSI: 00000000008b64d0 RDI: 00000000008b64d0
RBP: 00000000008b8038 R08: 00000000002579b0 R09: 000000000046ab38
R10: 0000000000000008 R11: 0000000000000000 R12: 000000000005479b
R13: 00000000008b8028 R14: 00000000002afb80 R15: 00000000008b64b0
base is 0x8b8038 caller is 0x400
base is 0x8b6f08 caller is 0x400
base is 0x8000000000000019 GPF rip: 13d21f, error_code=0
RIP: e030:[<000000000013d21f>]
RSP: e02b:00000000002af948 EFLAGS: 00010012
RAX: 000000000000001b RBX: 8000000000000019 RCX: 0000000000000020
RDX: 0000000000000602 RSI: 00000000002af798 RDI: 0000000000000004
RBP: 00000000002af948 R08: 000000000000000a R09: 00000000002af7e8
R10: 0000000000000001 R11: 0000000000000010 R12: 00000000002af988
R13: 00000000008b8028 R14: 00000000002afb80 R15: 00000000008b64b0
base is 0x2af948 caller is 0x28
base is 0x3 Page fault in pagetable walk (access to invalid memory?).
And from the ping VM:
alpine:~# arping -I eth0 10.0.0.2
ARPING to 10.0.0.2 from 10.0.0.3 via eth0
Unicast reply from 10.0.0.2 [00:16:3e:6f:8e:7e] 0.308ms
Unicast reply from 10.0.0.2 [00:16:3e:6f:8e:7e] 0.329ms
^CSent 2 probe(s) (1 broadcast(s))
Received 2 reply (0 request(s), 0 broadcast(s))
alpine:~# ping 10.0.0.2 -c1
PING 10.0.0.2 (10.0.0.2): 56 data bytes
--- 10.0.0.2 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
alpine:~#
Both were connected to br0:
root@or-xen-d0:~# tcpdump -nni br0
11:08:25.302691 ARP, Request who-has 10.0.0.2 (ff:ff:ff:ff:ff:ff) tell 10.0.0.3, length 28
11:08:25.302830 ARP, Reply 10.0.0.2 is-at 00:16:3e:6f:8e:7e, length 28
11:08:26.302791 ARP, Request who-has 10.0.0.2 (00:16:3e:6f:8e:7e) tell 10.0.0.3, length 28
11:08:26.302933 ARP, Reply 10.0.0.2 is-at 00:16:3e:6f:8e:7e, length 28
11:08:30.896785 IP 10.0.0.3 > 10.0.0.2: ICMP echo request, id 9223, seq 0, length 64
Attempt to look at RIP:
I guess only two RIP values are interesting:
RIP fcca1
RIP 13d21f
objdump -S network.xen |less
00000000000fcc90 <caml_tcpip_ones_complement_checksum_list>:
fcc90: 48 81 ec c8 00 00 00 sub $0xc8,%rsp
fcc97: 4c 8d 05 12 ad 15 00 lea 0x15ad12(%rip),%r8 # 2579b0 <caml_local_roots>
fcc9e: 48 89 fe mov %rdi,%rsi
fcca1: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
fcca8: 00 00
000000000013d180 <do_page_fault>:
...
13d1fc: e8 8f fb ff ff callq 13cd90 <dump_regs>
13d201: 58 pop %rax
13d202: 5a pop %rdx
13d203: 49 8b 5c 24 20 mov 0x20(%r12),%rbx
13d208: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
13d20f: 00
13d210: 48 89 de mov %rbx,%rsi
13d213: 31 c0 xor %eax,%eax
13d215: bf 70 8c 14 00 mov $0x148c70,%edi
13d21a: e8 71 d5 ff ff callq 13a790 <printk>
13d21f: 48 8b 73 08 mov 0x8(%rbx),%rsi
00000000000fcc90 <caml_tcpip_ones_complement_checksum_list>:
I guess that in the instruction
fcca1: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
FS is the problem: the crash is reported as "Page fault at linear address 28", and RAX contains a valid function address.
But I don't see the FS register value in the page-fault crash dump, and I don't know how to attach gdb to the VM (I'm thinking of something like what KVM offers: "qemu-system-xy -s ...", and then "gdb -ex 'target remote host:1234' my-vm-kernel").
Is there anything else I could add, try, or help with?
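For what it's worth, on x86-64 Linux targets GCC's stack protector reads its canary from %fs:0x28, so a load through FS with a base of zero would land on linear address 0x28, which is what the fault line reports. Below is a minimal sketch of that arithmetic. It assumes the Mini-OS message prints both numbers in hex without a prefix, and `parse_fault` and `implied_fs_base` are hypothetical helper names, not part of Mini-OS.

```c
#include <stdio.h>

/* Offset of the stack-protector canary from the FS base on x86-64 Linux targets. */
#define CANARY_FS_OFFSET 0x28UL

struct fault {
    unsigned long addr; /* faulting linear address */
    unsigned long rip;  /* instruction pointer at the fault */
};

/* Parse a line such as "Page fault at linear address 28, rip fcca1, ...".
 * Assumption: both numbers are hex without a 0x prefix.
 * Returns 1 on success, 0 otherwise. */
static int parse_fault(const char *line, struct fault *out)
{
    return sscanf(line, "Page fault at linear address %lx, rip %lx",
                  &out->addr, &out->rip) == 2;
}

/* If the faulting access was the canary load, this is the FS base in effect. */
static unsigned long implied_fs_base(const struct fault *f)
{
    return f->addr - CANARY_FS_OFFSET;
}
```

Fed the fault line from the log above, this gives an address of 0x28, a RIP of 0xfcca1 and an implied FS base of 0, which would be consistent with FS never having been set up for thread-local storage in the unikernel.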
Adding gdb output:
Breakpoint 1 at 0x100ca0: file src/tcpip_checksum/checksum_stubs.c, line 90.
(gdb) c
Continuing.
Breakpoint 1, caml_tcpip_ones_complement_checksum_list (v_cstruct_list=9088488) at src/tcpip_checksum/checksum_stubs.c:90
90 src/tcpip_checksum/checksum_stubs.c: No such file or directory.
(gdb) bt
#0 caml_tcpip_ones_complement_checksum_list (v_cstruct_list=9088488) at src/tcpip_checksum/checksum_stubs.c:90
#1 0x000000000005658d in camlIcmpv4_packet__unsafe_fill_1855 () at src/icmp/icmpv4_packet.ml:98
#2 0x00000000000566e6 in camlIcmpv4_packet__make_cstruct_1868 () at src/icmp/icmpv4_packet.ml:114
#3 0x0000000000056f3a in camlIcmpv4__input_2124 () at src/icmp/icmpv4.ml:68
#4 0x000000000007cb1c in camlLwt__catch_24897 () at src/core/lwt.ml:2036
#5 0x000000000007f375 in camlLwt__async_51102 () at src/core/lwt.ml:2459
#6 0x00000000000baa51 in camlList__iter_1252 () at list.ml:77
#7 0x0000000000018274 in camlNetchannel__Frontend__loop_2456 () at lib/frontend.ml:233
#8 0x000000000007bd62 in camlLwt__callback_13843 () at src/core/lwt.ml:1896
#9 0x000000000007a888 in camlLwt__iter_callback_list_4508 () at src/core/lwt.ml:1230
#10 0x000000000007a9f9 in camlLwt__run_in_resolution_loop_4551 () at src/core/lwt.ml:1296
#11 0x000000000007aba7 in camlLwt__resolve_4567 () at src/core/lwt.ml:1332
#12 0x000000000007bd75 in camlLwt__callback_13843 () at src/core/lwt.ml:1901
#13 0x000000000007a888 in camlLwt__iter_callback_list_4508 () at src/core/lwt.ml:1230
#14 0x000000000007a9f9 in camlLwt__run_in_resolution_loop_4551 () at src/core/lwt.ml:1296
#15 0x000000000007aba7 in camlLwt__resolve_4567 () at src/core/lwt.ml:1332
#16 0x000000000007bd75 in camlLwt__callback_13843 () at src/core/lwt.ml:1901
#17 0x000000000007a888 in camlLwt__iter_callback_list_4508 () at src/core/lwt.ml:1230
#18 0x000000000007a9f9 in camlLwt__run_in_resolution_loop_4551 () at src/core/lwt.ml:1296
#19 0x000000000007aba7 in camlLwt__resolve_4567 () at src/core/lwt.ml:1332
#20 0x000000000007d6e5 in camlLwt__callback_32955 () at src/core/lwt.ml:2161
#21 0x000000000007a888 in camlLwt__iter_callback_list_4508 () at src/core/lwt.ml:1230
#22 0x000000000007a9f9 in camlLwt__run_in_resolution_loop_4551 () at src/core/lwt.ml:1296
#23 0x000000000007aba7 in camlLwt__resolve_4567 () at src/core/lwt.ml:1332
#24 0x000000000007af8c in camlLwt__wakeup_later_general_5640 () at src/core/lwt.ml:1428
#25 0x00000000000baa51 in camlList__iter_1252 () at list.ml:77
#26 0x0000000000041140 in camlOS__Activations__run_1389 () at lib/activations.ml:100
#27 0x0000000000042f66 in camlOS__Main__aux_1405 () at lib/main.ml:69
#28 0x0000000000008de6 in camlMain__entry () at main.ml:177
#29 0x0000000000003b59 in caml_program ()
#30 0x0000000000123e56 in caml_start_program ()
#31 0x0000000000103c18 in caml_main (argv=0x25e5c0 <argv>) at startup.c:145
#32 0x0000000000103c54 in caml_startup (argv=0x25e5c0 <argv>) at startup.c:152
#33 0x0000000000103029 in app_main_thread ()
#34 0x00000000001030b1 in start_kernel ()
#35 0x00000000001409af in arch_init ()
#36 0x0000000000001f80 in shared_info ()
#37 0x0000000000000000 in ?? ()
(gdb) info loc
caml__frame = <optimized out>
caml__roots_v_cstruct_list = {next = 0x8 <_text+8>, ntables = 16502792, nitems = 9715240, tables = {0x1 <_text+1>, 0x8af110, 0x139850 <memmove+848>, 0x1 <_text+1>, 0x8ab230}}
v_hd = 9095504
v_ba = 747416
v_ofs = 9095488
v_len = 1205424
caml__roots_v_hd = {next = 0x2bfae0 <stack+125088>, ntables = 1112677, nitems = 9726592, tables = {0xff0000000d, 0xff <stack_start+232>, 0x10178f <caml_fill_bigstring+77>, 0x1 <_text+1>, 0x5641a <camlIcmpv4_packet__subheader_into_cstruct_1847+90>}}
checksum = <optimized out>
overflow_val = <optimized out>
overflow = <optimized out>
count = <optimized out>
a = <optimized out>
addr = <optimized out>
data64 = <optimized out>
sum64 = <optimized out>
(gdb) info reg
rax 0x100ca0 1051808
rbx 0x8aae20 9088544
rcx 0x11 17
rdx 0x7 7
rsi 0x7 7
rdi 0x8aade8 9088488
rbp 0x8ac950 0x8ac950
rsp 0x2bfaf8 0x2bfaf8 <stack+125112>
r8 0xffffffffffffffff -1
r9 0x453bd0 4537296
r10 0x8 8
r11 0x0 0
r12 0x5658d 353677
r13 0x8ac940 9095488
r14 0x2bfb80 2882432
r15 0x8aadc8 9088456
rip 0x100ca0 0x100ca0 <caml_tcpip_ones_complement_checksum_list>
eflags 0x202 [ IF ]
cs 0x0 0
ss 0xe033 57395
ds 0x0 0
es 0xe02b 57387
fs 0x0 0
gs 0x0 0
(gdb)
(gdb) fr 1
#1 0x000000000005658d in camlIcmpv4_packet__unsafe_fill_1855 () at src/icmp/icmpv4_packet.ml:98
98 src/icmp/icmpv4_packet.ml: No such file or directory.
(gdb) info loc
No locals.
(gdb) info reg
rax 0x100ca0 1051808
rbx 0x8aae20 9088544
rcx 0x11 17
rdx 0x7 7
rsi 0x7 7
rdi 0x8aade8 9088488
rbp 0x8ac950 0x8ac950
rsp 0x2bfb00 0x2bfb00 <stack+125120>
r8 0xffffffffffffffff -1
r9 0x453bd0 4537296
r10 0x8 8
r11 0x0 0
r12 0x5658d 353677
r13 0x8ac940 9095488
r14 0x2bfb80 2882432
r15 0x8aadc8 9088456
rip 0x5658d 0x5658d <camlIcmpv4_packet__unsafe_fill_1855+205>
eflags 0x202 [ IF ]
cs 0x0 0
ss 0xe033 57395
ds 0x0 0
es 0xe02b 57387
fs 0x0 0
gs 0x0 0
(gdb)
(gdb) fr 2
#2 0x00000000000566e6 in camlIcmpv4_packet__make_cstruct_1868 () at src/icmp/icmpv4_packet.ml:114
114 in src/icmp/icmpv4_packet.ml
(gdb) info loc
No locals.
(gdb) info reg
rax 0x100ca0 1051808
rbx 0x8aae20 9088544
rcx 0x11 17
rdx 0x7 7
rsi 0x7 7
rdi 0x8aade8 9088488
rbp 0x8ac950 0x8ac950
rsp 0x2bfb20 0x2bfb20 <stack+125152>
r8 0xffffffffffffffff -1
r9 0x453bd0 4537296
r10 0x8 8
r11 0x0 0
r12 0x5658d 353677
r13 0x8ac940 9095488
r14 0x2bfb80 2882432
r15 0x8aadc8 9088456
rip 0x566e6 0x566e6 <camlIcmpv4_packet__make_cstruct_1868+70>
eflags 0x202 [ IF ]
cs 0x0 0
ss 0xe033 57395
ds 0x0 0
es 0xe02b 57387
fs 0x0 0
gs 0x0 0
(gdb)
(gdb) fr 3
#3 0x0000000000056f3a in camlIcmpv4__input_2124 () at src/icmp/icmpv4.ml:68
68 src/icmp/icmpv4.ml: No such file or directory.
(gdb) info loc
No locals.
(gdb) info reg
rax 0x100ca0 1051808
rbx 0x8aae20 9088544
rcx 0x11 17
rdx 0x7 7
rsi 0x7 7
rdi 0x8aade8 9088488
rbp 0x8ac950 0x8ac950
rsp 0x2bfb40 0x2bfb40 <stack+125184>
r8 0xffffffffffffffff -1
r9 0x453bd0 4537296
r10 0x8 8
r11 0x0 0
r12 0x5658d 353677
r13 0x8ac940 9095488
r14 0x2bfb80 2882432
r15 0x8aadc8 9088456
rip 0x56f3a 0x56f3a <camlIcmpv4__input_2124+858>
eflags 0x202 [ IF ]
cs 0x0 0
ss 0xe033 57395
ds 0x0 0
es 0xe02b 57387
fs 0x0 0
gs 0x0 0
(gdb)
It seems OCaml has "its own" stack, as "info loc" keeps saying "No locals". Per the backtrace, the ICMP reply is assembled (https://github.com/mirage/mirage-tcpip/blob/master/src/icmp/icmpv4.ml#L68, frame 3) and is about to be sent with writev, and at that point the code goes south. I know 0% of OCaml, so I'm pretty lost now :)
Which variable is being dereferenced? I guess some struct with address 0x0 at offset 0x28.
I added a minimal (I believe it is minimal) example to reproduce the crash, based on tutorial/hello/. It is at https://github.com/justinc1/mirage-skeleton/tree/jc-crash-issue-251, commit https://github.com/justinc1/mirage-skeleton/commit/1da971caae5295d47588b6dc9637e68e2b0c7f89
unikernel.ml part:
let aa = 11 in
(* let bb = (times2 aa) in *)
let bb = (read_int aa; 33) in
(* let bb = return_int() in *)
let oc = stderr in
output_string oc "bb = 2*aa = ";
output_string oc (string_of_int bb);
output_string oc " = 2*";
output_string oc (string_of_int aa);
output_string oc ";\n";
And relevant part in times2.c:
void read_int_real(int aa)
{
asm(""); // will prevent optimization (e.g. removing whole function)?
aa = aa*2;
}
CAMLprim value read_int(value aa)
{
CAMLparam1(aa);
read_int_real(Int_val(aa));
CAMLreturn(Val_unit);
}
If I remember correctly, just doing Int_val(aa) is enough to trigger the crash. What now?
(And I apologize for the jc-build-xen.sh script; I don't know how to include C code in OCaml :/ )
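A note on the stub: OCaml represents an int n as the machine word 2n+1, and Int_val undoes that with an arithmetic right shift, so the macro by itself touches no memory. That suggests the faulting access comes from somewhere else in the stub, for example from code the compiler adds around the function body. Here is a minimal sketch of the tagging scheme. `value_t`, `tag_int` and `untag_int` are hypothetical stand-ins that mirror `value`, `Val_int` and `Int_val` from caml/mlvalues.h; they are not the real macros.

```c
#include <stdint.h>

/* Hypothetical stand-in for OCaml's `value` type (one machine word). */
typedef intptr_t value_t;

/* Tag a C integer as an OCaml immediate: shift left by one, set the low bit. */
static value_t tag_int(intptr_t n)
{
    return (value_t)(((uintptr_t)n << 1) + 1);
}

/* Untag an OCaml immediate. This relies on >> being an arithmetic shift
 * for signed operands, which holds for GCC and Clang. */
static intptr_t untag_int(value_t v)
{
    return v >> 1;
}
```

With this scheme tag_int(33) is 0x43 and tag_int(0), which is also how unit is represented, is 1.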
Building the same example code on Ubuntu 14.04.5: no crash. Is it a problem with the gcc used on the build platform (gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.4) vs gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9))? Or maybe with some other tool (I know nothing about what is used to build Mirage apps)?
@justinc1 Could you post a disassembly of the same function built with the working toolchain (Ubuntu 14.04.5, I presume from the above)? A wild guess is that the newer toolchain is using TLS/FS-relative addressing when it shouldn't be, but I would like to confirm that.
Ubuntu 14.04.5 (opam switch 4.04.2, gcc 4.8.4-2ubuntu1~14.04.4):
objdump -S hello_cstruct.xen1
...
0000000000006e20 <camlUnikernel__start_1212>:
...
let aa = 11 in
(* let bb = (times2 aa) in *)
let bb = (read_int aa; 33) in
6e76: 48 8d 05 13 2b 10 00 lea 0x102b13(%rip),%rax # 109990 <read_int>
6e7d: e8 0e 5e 0e 00 callq ecc90 <caml_c_call>
6e82: 4c 8d 1d 9f 2f 1e 00 lea 0x1e2f9f(%rip),%r11 # 1e9e28 <caml_young_ptr>
6e89: 4d 8b 3b mov (%r11),%r15
6e8c: 48 8d 05 c5 88 1a 00 lea 0x1a88c5(%rip),%rax # 1af758 <camlPervasives>
6e93: 48 8b 80 c0 00 00 00 mov 0xc0(%rax),%rax
6e9a: 48 89 04 24 mov %rax,(%rsp)
6e9e: 48 8d 1d 33 bb 13 00 lea 0x13bb33(%rip),%rbx # 1429d8 <camlUnikernel__16>
(* let bb = return_int() in *)
let oc = stderr in
output_string oc "bb = 2*aa = ";
6ea5: e8 e6 b7 07 00 callq 82690 <camlPervasives__output_string_1203>
6eaa: 48 c7 c0 43 00 00 00 mov $0x43,%rax
output_string oc (string_of_int bb);
...
0000000000083200 <camlPervasives__read_int_1319>:
83200: 48 83 ec 08 sub $0x8,%rsp
83204: 48 c7 c0 01 00 00 00 mov $0x1,%rax
8320b: e8 b0 ff ff ff callq 831c0 <camlPervasives__read_line_1317>
83210: 48 89 c7 mov %rax,%rdi
83213: 48 8d 05 34 1f 05 00 lea 0x51f34(%rip),%rax # d514e <caml_int_of_string>
8321a: e8 71 9a 06 00 callq ecc90 <caml_c_call>
8321f: 4c 8d 1d 02 6c 16 00 lea 0x166c02(%rip),%r11 # 1e9e28 <caml_young_ptr>
83226: 4d 8b 3b mov (%r11),%r15
83229: 48 83 c4 08 add $0x8,%rsp
8322d: c3 retq
8322e: 66 90 xchg %ax,%ax
...
0000000000109980 <read_int_real>:
109980: c3 retq
109981: 66 66 66 66 66 66 2e data16 data16 data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)
109988: 0f 1f 84 00 00 00 00
10998f: 00
0000000000109990 <read_int>:
109990: 55 push %rbp
109991: 53 push %rbx
109992: 48 83 ec 58 sub $0x58,%rsp
109996: 48 8d 1d 73 ff 0d 00 lea 0xdff73(%rip),%rbx # 1e9910 <caml_local_roots>
10999d: 48 8d 44 24 10 lea 0x10(%rsp),%rax
1099a2: 48 89 7c 24 08 mov %rdi,0x8(%rsp)
1099a7: 48 d1 ff sar %rdi
1099aa: 48 c7 44 24 20 01 00 movq $0x1,0x20(%rsp)
1099b1: 00 00
1099b3: 48 c7 44 24 18 01 00 movq $0x1,0x18(%rsp)
1099ba: 00 00
1099bc: 48 8b 2b mov (%rbx),%rbp
1099bf: 48 89 03 mov %rax,(%rbx)
1099c2: 48 8d 44 24 08 lea 0x8(%rsp),%rax
1099c7: 48 89 44 24 28 mov %rax,0x28(%rsp)
1099cc: 48 89 6c 24 10 mov %rbp,0x10(%rsp)
1099d1: e8 aa ff ff ff callq 109980 <read_int_real>
1099d6: 48 89 2b mov %rbp,(%rbx)
1099d9: 48 83 c4 58 add $0x58,%rsp
1099dd: b8 01 00 00 00 mov $0x1,%eax
1099e2: 5b pop %rbx
1099e3: 5d pop %rbp
1099e4: c3 retq
1099e5: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
1099ec: 00 00 00 00
Ubuntu 16.04.3 (opam switch 4.05.0, gcc 5.4.0-6ubuntu1~16.04.9):
0000000000006e10 <camlUnikernel__start_1217>:
...
let aa = 11 in
(* let bb = (times2 aa) in *)
let bb = (read_int aa; 33) in
6e66: 48 8d 05 53 02 10 00 lea 0x100253(%rip),%rax # 1070c0 <read_int>
6e6d: e8 7a 27 0e 00 callq e95ec <caml_c_call>
6e72: 4c 8d 1d af 37 1e 00 lea 0x1e37af(%rip),%r11 # 1ea628 <caml_young_ptr>
6e79: 4d 8b 3b mov (%r11),%r15
6e7c: 48 8d 05 ad 75 1a 00 lea 0x1a75ad(%rip),%rax # 1ae430 <camlPervasives>
6e83: 48 8b 80 d8 00 00 00 mov 0xd8(%rax),%rax
6e8a: 48 89 04 24 mov %rax,(%rsp)
6e8e: 48 8d 1d 43 9b 13 00 lea 0x139b43(%rip),%rbx # 1409d8 <camlUnikernel__16>
(* let bb = return_int() in *)
let oc = stderr in
...
0000000000084bc0 <camlPervasives__read_int_1333>:
84bc0: 48 83 ec 08 sub $0x8,%rsp
84bc4: 48 c7 c0 01 00 00 00 mov $0x1,%rax
84bcb: e8 b0 ff ff ff callq 84b80 <camlPervasives__read_line_1330>
84bd0: 48 89 c7 mov %rax,%rdi
84bd3: 48 8d 05 36 2e 05 00 lea 0x52e36(%rip),%rax # d7a10 <caml_int_of_string>
84bda: e8 0d 4a 06 00 callq e95ec <caml_c_call>
84bdf: 4c 8d 1d 42 5a 16 00 lea 0x165a42(%rip),%r11 # 1ea628 <caml_young_ptr>
84be6: 4d 8b 3b mov (%r11),%r15
84be9: 48 83 c4 08 add $0x8,%rsp
84bed: c3 retq
84bee: 66 90 xchg %ax,%ax
0000000000084bf0 <camlPervasives__read_int_opt_1336>:
84bf0: 48 83 ec 08 sub $0x8,%rsp
84bf4: 48 c7 c0 01 00 00 00 mov $0x1,%rax
84bfb: e8 80 ff ff ff callq 84b80 <camlPervasives__read_line_1330>
84c00: 48 83 c4 08 add $0x8,%rsp
84c04: e9 27 f0 ff ff jmpq 83c30 <camlPervasives__int_of_string_opt_1153>
84c09: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
...
00000000001070b0 <read_int_real>:
1070b0: c3 retq
1070b1: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
1070b6: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
1070bd: 00 00 00
00000000001070c0 <read_int>:
1070c0: 55 push %rbp
1070c1: 53 push %rbx
1070c2: 48 83 ec 68 sub $0x68,%rsp
1070c6: 48 8d 1d 7b 31 0e 00 lea 0xe317b(%rip),%rbx # 1ea248 <caml_local_roots>
1070cd: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
1070d4: 00 00
1070d6: 48 89 44 24 58 mov %rax,0x58(%rsp)
1070db: 31 c0 xor %eax,%eax
1070dd: 48 8d 44 24 10 lea 0x10(%rsp),%rax
1070e2: 48 89 7c 24 08 mov %rdi,0x8(%rsp)
1070e7: 48 d1 ff sar %rdi
1070ea: 48 8b 2b mov (%rbx),%rbp
1070ed: 48 c7 44 24 20 01 00 movq $0x1,0x20(%rsp)
1070f4: 00 00
1070f6: 48 89 03 mov %rax,(%rbx)
1070f9: 48 8d 44 24 08 lea 0x8(%rsp),%rax
1070fe: 48 c7 44 24 18 01 00 movq $0x1,0x18(%rsp)
107105: 00 00
107107: 48 89 6c 24 10 mov %rbp,0x10(%rsp)
10710c: 48 89 44 24 28 mov %rax,0x28(%rsp)
107111: e8 9a ff ff ff callq 1070b0 <read_int_real>
107116: 48 8b 54 24 58 mov 0x58(%rsp),%rdx
10711b: 64 48 33 14 25 28 00 xor %fs:0x28,%rdx
107122: 00 00
107124: 48 89 2b mov %rbp,(%rbx)
107127: 75 0c jne 107135 <read_int+0x75>
107129: 48 83 c4 68 add $0x68,%rsp
10712d: b8 01 00 00 00 mov $0x1,%eax
107132: 5b pop %rbx
107133: 5d pop %rbp
107134: c3 retq
107135: e8 d6 fe ff ff callq 107010 <__stack_chk_fail>
10713a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
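Comparing the two read_int listings, the 16.04 build contains two %fs:0x28 accesses and a call to __stack_chk_fail that the 14.04 build lacks, which is the usual shape of GCC's stack protector (canary load in the prologue, check before return). Below is a small sketch for checking a listing for that pattern. The helper names are hypothetical, and the matching assumes objdump's AT&T syntax as shown in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of `needle` in `text`. */
static size_t count_occurrences(const char *text, const char *needle)
{
    size_t n = 0;
    size_t len = strlen(needle);
    if (len == 0)
        return 0;
    for (const char *p = strstr(text, needle); p != NULL;
         p = strstr(p + len, needle))
        n++;
    return n;
}

/* Heuristic: a listing looks stack-protected if it reads the canary at
 * %fs:0x28 at least once and references __stack_chk_fail. */
static int looks_stack_protected(const char *disasm)
{
    return count_occurrences(disasm, "%fs:0x28") >= 1 &&
           strstr(disasm, "__stack_chk_fail") != NULL;
}
```

If that is indeed the cause, rebuilding the C stubs with GCC's -fno-stack-protector flag might be worth a try. Whether it fixes this crash is untested here.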
I am experiencing the same issue with the static_website_tls and conduit_server examples.
Thanks for your report. Now, 4 years later and with MirageOS 4.0 released (with a reworked Xen story that supports PVH), I don't think this issue is relevant anymore. If you encounter the same issue with a more current OCaml and MirageOS, please report a fresh issue here. Thanks a lot.
I have built and run the conduit_server example for Xen, but unfortunately it page-faults when I try to contact it with a simple
echo -n "test" | nc 131.159.24.190 80
Log follows:
OCaml is version 4.04.2 and I'm running the example on an Intel NUC. This is the output of opam list:
This is the output of sudo xl info:
I found this problem because my cohttp-dependent unikernels also started page-faulting as soon as I started testing them on the Intel NUC. On other x86 or ARM (Cubietruck) platforms the problem doesn't appear. It might be a problem related to the board, but I'm not completely sure about it.