I'm running ops with mounted through VirtFS directory. When closing a file after unlinking, internal error happens:
cannot get from unknown value (e 0x0000000000000000, a 0xffff8080000006c0)
Reproduce
Consider following C code:
main.c:
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
int main()
{
const char *path = "/tmp/.tmp123456";
// create and open file
int fd = open(path, O_WRONLY | O_CREAT);
// unlink file
unlink(path);
// close file
close(fd);
}
Compiling the code and running it with ops inside unikernel:
gcc -o main main.c && ops run --mounts /tmp:/tmp ./main
Expected result
Everything works fine, file is created and removed at the end of the execution. As it works on Linux platforms, if an opened file is unlinked, it's going to be removed after closing. It's a pretty common pattern create-unlink-close for temp files.
Actual result
Error happens, machine halts:
running local instance
booting /home/alea/.ops/images/main ...
en1: assigned 10.0.2.15
cannot get from unknown value (e 0x0000000000000000, a 0xffff808000000640)
I tested the same with --nightly and with local build of master branch and the same error occurs.
What I found
I investigated this error a bit and built nanos with enabled debug prints and here is what I got:
running local instance
booting /home/alea/.ops/images/main ...
FS: filesystem_read_entire: t 0x7fc601c4, bufheap 0x00017054, buffer_handler 0x7fbdd000, status_handler 0x7fc609b4
FS: read_entire_complete: status [invalid format %v], addr 0x00000000
en1: assigned 10.0.2.15
[0.129957, 0, fs] filesystem_read_complete: dest 0xffffc00001809000, status (none)
[0.130705, 0, fs] filesystem_read_complete: dest 0xffffc0000180b000, status (none)
[0.132992, 0, 9p] lookup 0xffff80c000000d40 '.'
[0.133376, 0, 9p] readdir: fid 0, md 0xffff80c000000d40
[0.134236, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-upower.service-6aeKbv', type 0x4, md 0xffff80c000001840
[0.135340, 0, 9p] dir entry '{ACE5978A-EA83-4BD9-94C1-3ECC0B976E94}', type 0x4, md 0xffff80c0000018c0
[0.136373, 0, 9p] dir entry 'gdm3-config-err-7xxwD5', type 0x8, md 0xffff80c000001940
[0.137135, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-bluetooth.service-Z64H7h', type 0x4, md 0xffff80c000001980
[0.138242, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-systemd-logind.service-2vFixi', type 0x4, md 0xffff80c000001a00
[0.139417, 0, 9p] dir entry '.ICE-unix', type 0x4, md 0xffff80c000001a80
[0.140026, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-switcheroo-control.service-3sPe4i', type 0x4, md 0xffff80c000001b40
[0.141219, 0, 9p] dir entry '.Test-unix', type 0x4, md 0xffff80c000001bc0
[0.141811, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-systemd-resolved.service-Ov0tLq', type 0x4, md 0xffff80c000001c40
[0.142957, 0, 9p] dir entry 'snap-private-tmp', type 0x4, md 0xffff80c000001cc0
[0.143594, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-systemd-oomd.service-XoA1ko', type 0x4, md 0xffff80c000001b00
[0.144722, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-bolt.service-2IPtLt', type 0x4, md 0xffff80c000001d80
[0.145796, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-fwupd.service-jYObch', type 0x4, md 0xffff80c000001e00
[0.146888, 0, 9p] dir entry '.font-unix', type 0x4, md 0xffff80c000001e80
[0.147501, 0, 9p] dir entry '.X11-unix', type 0x4, md 0xffff80c000001f00
[0.148108, 0, 9p] dir entry '.XIM-unix', type 0x4, md 0xffff80c000001f80
[0.148713, 0, 9p] dir entry 'vscode-typescript1000', type 0x4, md 0xffff80c000002000
[0.149386, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-colord.service-0brCku', type 0x4, md 0xffff80c000002080
[0.150747, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-power-profiles-daemon.service-Ug3vQW', type 0x4, md 0xffff80c000002100
[0.151960, 0, 9p] dir entry '{D296E373-03DF-41ED-97FA-70A62B7B6462}', type 0x4, md 0xffff80c000002180
[0.152798, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-systemd-timesyncd.service-IRq2xq', type 0x4, md 0xffff80c000002200
[0.153981, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-ModemManager.service-nUpc4E', type 0x4, md 0xffff80c000002280
[0.155189, 0, 9p] lookup 0xffff80c000000d40 '.tmp123456'
[0.155717, 0, 9p] create parent 0xffff80c000000d40 name '.tmp123456' md 0xffff80c000002300 f 0xffffc0000283ff00
[0.156860, 0, 9p] lookup 0xffff80c000000d40 '.'
[0.157249, 0, 9p] readdir: fid 0, md 0xffff80c000000d40
[0.157946, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-upower.service-6aeKbv', type 0x4, md 0xffff80c000001840
[0.159052, 0, 9p] dir entry '{ACE5978A-EA83-4BD9-94C1-3ECC0B976E94}', type 0x4, md 0xffff80c0000018c0
[0.159889, 0, 9p] dir entry 'gdm3-config-err-7xxwD5', type 0x8, md 0xffff80c000001940
[0.160591, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-bluetooth.service-Z64H7h', type 0x4, md 0xffff80c000001980
[0.161719, 0, 9p] dir entry '.tmp123456', type 0x8, md 0xffff80c000002300
[0.162338, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-systemd-logind.service-2vFixi', type 0x4, md 0xffff80c000001a00
[0.163710, 0, 9p] dir entry '.ICE-unix', type 0x4, md 0xffff80c000001a80
[0.164339, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-switcheroo-control.service-3sPe4i', type 0x4, md 0xffff80c000001b40
[0.165573, 0, 9p] dir entry '.Test-unix', type 0x4, md 0xffff80c000001bc0
[0.166211, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-systemd-resolved.service-Ov0tLq', type 0x4, md 0xffff80c000001c40
[0.167451, 0, 9p] dir entry 'snap-private-tmp', type 0x4, md 0xffff80c000001cc0
[0.168138, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-systemd-oomd.service-XoA1ko', type 0x4, md 0xffff80c000001b00
[0.169345, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-bolt.service-2IPtLt', type 0x4, md 0xffff80c000001d80
[0.170461, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-fwupd.service-jYObch', type 0x4, md 0xffff80c000001e00
[0.171600, 0, 9p] dir entry '.font-unix', type 0x4, md 0xffff80c000001e80
[0.172238, 0, 9p] dir entry '.X11-unix', type 0x4, md 0xffff80c000001f00
[0.172865, 0, 9p] dir entry '.XIM-unix', type 0x4, md 0xffff80c000001f80
[0.173495, 0, 9p] dir entry 'vscode-typescript1000', type 0x4, md 0xffff80c000002000
[0.174220, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-colord.service-0brCku', type 0x4, md 0xffff80c000002080
[0.175357, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-power-profiles-daemon.service-Ug3vQW', type 0x4, md 0xffff80c000002100
[0.176613, 0, 9p] dir entry '{D296E373-03DF-41ED-97FA-70A62B7B6462}', type 0x4, md 0xffff80c000002180
[0.177493, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-systemd-timesyncd.service-IRq2xq', type 0x4, md 0xffff80c000002200
[0.178752, 0, 9p] dir entry 'systemd-private-f49426b23608411081a60f811a69154c-ModemManager.service-nUpc4E', type 0x4, md 0xffff80c000002280
[0.180038, 0, 9p] lookup 0xffff80c000000d40 '.tmp123456'
[0.180724, 0, 9p] free file 0xffffc0000020cd00, dentry 0xffffc00000473e40, md 0x0000000000000000
cannot get from unknown value (e 0x0000000000000000, a 0xffff808000000640)
So it looks like after close() syscall p9_fsf_free function is called. But md (I guess it stands for metadata) for that file was already destroyed. Because of that, when calling fs_file_is_busy later, it fails to unwrap this tuple.
I'm not sure, where the actual bug is (maybe md shouldn't be destroyed that early since we need to check if file is watched). But anyway this behavior is different from Linux one and looks undesirable.
Description
I'm running
ops
with mounted through VirtFS directory. When closing a file after unlinking, internal error happens:Reproduce
Consider following C code:
main.c
:Compiling the code and running it with
ops
inside unikernel:Expected result
Everything works fine, file is created and removed at the end of the execution. As it works on Linux platforms, if an opened file is unlinked, it's going to be removed after closing. It's a pretty common pattern create-unlink-close for temp files.
Actual result
Error happens, machine halts:
Setup
I tested the same with
--nightly
and with local build of master branch and the same error occurs.What I found
I investigated this error a bit and built nanos with enabled debug prints and here is what I got:
So it looks like after
close()
syscallp9_fsf_free
function is called. Butmd
(I guess it stands for metadata) for that file was already destroyed. Because of that, when callingfs_file_is_busy
later, it fails to unwrap this tuple.I'm not sure, where the actual bug is (maybe
md
shouldn't be destroyed that early since we need to check if file is watched). But anyway this behavior is different from Linux one and looks undesirable.Any help would be appreciated.