princeton-sns / firecracker-tools


firerunner vsock connection fails but controller works #13

Open LedgeDash opened 5 years ago

LedgeDash commented 5 years ago

Calling a function through firerunner directly fails with zero output:

$ cat single_req.json | ./target/debug/firerunner --kernel vmlinux --rootfs images/nodejs.ext4 --appfs images/loremjs.ext4

^C
$

No message indicates that the vsock connection succeeded. After a while the VM exits, so Ctrl-C just kills the firerunner process.

However, controller works fine:

[luzhuo@Jasper] ./target/debug/controller --appfs_dir images --kernel vmlinux --runtimefs_dir images --requests ctest.jsonl
lorempy2: {"body": "Dolorem voluptatem aliquam quaerat ut dolor.", "request": {"function": "lorempy2", "payload": {"request": 42}}}
Warm 0, Active: 2
loremjs: {"request":{"function":"loremjs","payload":{"request":42}},"sentence":"Aliqua magna anim id ullamco officia exercitation officia ea nostrud esse."}
Warm 0, Active: 2
lorempy2: {"body": "Ut dolor numquam modi.", "request": {"function": "lorempy2", "payload": {"request": 42}}}
Warm 0, Active: 2
loremjs: {"request":{"function":"loremjs","payload":{"request":42}},"sentence":"Ad do adipisicing incididunt ullamco labore nulla Lorem occaecat ut."}
Warm 0, Active: 2
lorempy2: {"body": "Aliquam porro sed neque sed sit.", "request": {"function": "lorempy2", "payload": {"request": 42}}}
Warm 0, Active: 2
loremjs: {"request":{"function":"loremjs","payload":{"request":42}},"sentence":"Ut elit et nisi ipsum ex sint qui nulla officia reprehenderit consectetur enim."}
Warm 0, Active: 2
lorempy2: {"body": "Quiquia eius tempora modi tempora.", "request": {"function": "lorempy2", "payload": {"request": 42}}}
Warm 0, Active: 2
loremjs: {"request":{"function":"loremjs","payload":{"request":42}},"sentence":"Ut est duis adipisicing incididunt labore elit id irure commodo magna id nulla ea duis."}
Warm 0, Active: 2
lorempy2: {"body": "Velit labore etincidunt velit etincidunt consectetur eius labore.", "request": {"function": "lorempy2", "payload": {"request": 42}}}
Warm 0, Active: 2
loremjs: {"request":{"function":"loremjs","payload":{"request":42}},"sentence":"Occaecat est fugiat quis sit aliquip exercitation duis dolore excepteur mollit."}
Warm 0, Active: 2
lorempy2: {"body": "Quisquam dolorem dolor aliquam modi non dolorem.", "request": {"function": "lorempy2", "payload": {"request": 42}}}
Warm 0, Active: 2
loremjs: {"request":{"function":"loremjs","payload":{"request":42}},"sentence":"Ex cupidatat deserunt ad amet occaecat cupidatat minim nostrud deserunt amet deserunt cillum ullamco."}
Warm 0, Active: 2
lorempy2: {"body": "Numquam tempora porro magnam dolore sed quaerat.", "request": {"function": "lorempy2", "payload": {"request": 42}}}
Warm 1, Active: 1
loremjs: {"request":{"function":"loremjs","payload":{"request":42}},"sentence":"Ut consectetur nisi minim magna dolore nulla culpa minim enim incididunt cillum ullamco sit."}
Warm 2, Active: 0

This is related to #12. We need to improve the code around vsock handling.

LedgeDash commented 5 years ago

The VM does not always exit. Sometimes it seems to be stuck and can only be terminated with kill, since the VM treats SIGINT (Ctrl-C) as a no-op.

LedgeDash commented 5 years ago

It is also possible for the following to happen:

[davidliu@sns59] ./target/debug/firerunner --kernel ../images/vmlinux --rootfs images/nodejs.ext4 --appfs loremjs.ext4
Connection from VsockAddr { port: 1024, cid: 42 }

The vsock connection seems to succeed, but nothing goes through and the VM just hangs.

Not sure if this is related at all to @tan-yue's problem with vsock when booting from a snapshot? Maybe our current implementation around vsock is just broken.

LedgeDash commented 5 years ago

The problem is "resolved" after a system reboot, so there is likely something wrong with the vsock state in the kernel module. The function VM can now run and output results, but now I see:

[luzhuo@Jasper] ./target/debug/firerunner --kernel vmlinux --rootfs images/nodejs.ext4 --appfs images/loremjs.ext4 < single_req.json
Connection from VsockAddr { port: 1024, cid: 42 }
{"request":{"function":"lorempy2","payload":{}},"sentence":"Sunt non sit ullamco adipisicing officia ipsum in ad commodo."}
thread 'main' panicked at 'Failed to kill child: Sys(ESRCH)', src/libcore/result.rs:999:5
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

This panic is happening with Amit's images. (Previously, I had this panic with images that I created (#11), but it went away when testing with Amit's images.)
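
For reference, ESRCH from kill means the target process no longer exists, so the panic is consistent with the VM process having already exited by the time firerunner tries to kill it. Below is a minimal sketch of tolerating that case. It uses only `std::process`, not whatever call firerunner actually makes, and the `kill_if_running` helper is hypothetical (not from this repo). The `true` command is just a stand-in for a child that exits on its own.

```rust
use std::io;
use std::process::{Child, Command};

/// Stop a child process, treating "already exited" as success
/// instead of an error. Hypothetical helper for illustration only.
fn kill_if_running(child: &mut Child) -> io::Result<()> {
    // try_wait() does not block; it returns Some(status) if the
    // child has already exited (and reaps it if needed).
    if child.try_wait()?.is_some() {
        return Ok(()); // nothing left to kill
    }
    child.kill()?;
    // Reap the child so it does not linger as a zombie.
    child.wait()?;
    Ok(())
}

fn main() -> io::Result<()> {
    // Stand-in for a VM process that exits by itself.
    let mut child = Command::new("true").spawn()?;
    child.wait()?;

    // The child is already gone; this returns Ok instead of panicking.
    kill_if_running(&mut child)?;
    Ok(())
}
```

With something like this, a VM that shuts down after handling its request would not turn a successful run into a panic. Whether that is the right behavior for firerunner is a separate question.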

alevy commented 5 years ago

We are almost definitely getting rid of vsock, so probably not worth debugging