princeton-sns / firecracker-tools


aws lambda scheduling algorithm #20

Closed LedgeDash closed 5 years ago

LedgeDash commented 5 years ago

WARNING =) The current code as it stands at commit ce6a20dd does NOT work T_T. The problem is that vmm.start_instance().expect("start") panics with:

thread 'main' panicked at 'Start: StartMicrovm(Internal, LegacyIOBus(StdinHandle(Os { code: 5, kind: Other, message: "Input/output error" })))', src/libcore/result.rs:999:5

Now about the code/implementation. What hasn't changed: I haven't removed anything from the previous version of the controller because I wanted a reference to test against. So you'll still see old data structures such as active_functions and warm_functions; they'll be removed once we have a fully functional controller.

The overall design stays largely the same: we have a ConnectionManager thread per VM that sits between the controller thread and the actual VM. The ConnectionManager thread consumes requests from the controller thread via an mpsc::channel and forwards them to the VM over vsock. In the other direction, it receives responses from the VM and forwards them to a response receiver thread, which prints them to the console.
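
As a reading aid, a minimal sketch of that plumbing; Request, Response, and the function bodies here are stand-ins (the real code talks to the VM over vsock instead of echoing):

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

// Hypothetical request/response types; the real ones live in the controller code.
struct Request {
    function: String,
    payload: Vec<u8>,
}
struct Response {
    vm_handle: u32,
    body: Vec<u8>,
}

// One ConnectionManager thread per VM: consume requests from the controller
// over an mpsc channel, forward them to the VM (vsock in the current code,
// stubbed out here), and pass the VM's responses on to the receiver thread.
fn spawn_connection_manager(vm_handle: u32, resp_tx: Sender<Response>) -> Sender<Request> {
    let (req_tx, req_rx): (Sender<Request>, Receiver<Request>) = channel();
    thread::spawn(move || {
        for req in req_rx {
            // Real code: write `req` to the VM over vsock and read the reply.
            // Stub: echo the payload back as the "response".
            let _ = req.function;
            resp_tx
                .send(Response { vm_handle, body: req.payload })
                .unwrap();
        }
    });
    req_tx // the controller/scheduler holds only this Sender<Request>
}

fn main() {
    let (resp_tx, resp_rx) = channel();
    let req_tx = spawn_connection_manager(42, resp_tx);
    req_tx
        .send(Request { function: "lorempy".to_string(), payload: b"hi".to_vec() })
        .unwrap();
    // The response receiver thread's job in the real code: print to console.
    let resp = resp_rx.recv().unwrap();
    println!("vm {} replied with {} bytes", resp.vm_handle, resp.body.len());
}
```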

The current implementation still uses vsock, which we need to change soon, but that code is mostly hidden inside the listener module. From the controller/scheduler's perspective, it just holds a Sender<Request> per VM and is oblivious to how the ConnectionManager thread actually communicates with the VM.

What's changed: In the new code, the cluster type represents the physical cluster and is used to keep track of hardware resource limits. Currently it only supports one machine, so it just reads the local /proc/meminfo and /proc/cpuinfo.
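
A minimal sketch of what that single-machine discovery can look like (the Cluster shape here is a stand-in, not the actual type):

```rust
use std::fs;

// Stand-in for the cluster type: total memory (KiB) and logical CPU count
// discovered from the local machine.
struct Cluster {
    total_mem_kb: u64,
    num_cpus: usize,
}

fn local_cluster() -> std::io::Result<Cluster> {
    // /proc/meminfo starts with a line like "MemTotal: 16384 kB".
    let meminfo = fs::read_to_string("/proc/meminfo")?;
    let total_mem_kb = meminfo
        .lines()
        .find(|l| l.starts_with("MemTotal:"))
        .and_then(|l| l.split_whitespace().nth(1))
        .and_then(|v| v.parse().ok())
        .unwrap_or(0);

    // Each logical CPU contributes one "processor : N" line in /proc/cpuinfo.
    let cpuinfo = fs::read_to_string("/proc/cpuinfo")?;
    let num_cpus = cpuinfo.lines().filter(|l| l.starts_with("processor")).count();

    Ok(Cluster { total_mem_kb, num_cpus })
}

fn main() -> std::io::Result<()> {
    let c = local_cluster()?;
    println!("{} kB memory, {} cpus", c.total_mem_kb, c.num_cpus);
    Ok(())
}
```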

The Vm struct represents a VM from a management perspective. It holds a vm_handle (currently just the cid; this will change after the move to tty), a request sender, and the VmApp. With a Vm instance, we can send requests to a VM and kill/evict it. The vm_handle is now also sent back in the response so that we know which VM finished running.
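
Roughly this shape (Request and VmApp are placeholders here; the real VmApp comes from the existing firecracker-tools code):

```rust
use std::sync::mpsc::{channel, Sender};

struct Request; // placeholder
struct VmApp;   // placeholder for the real VmApp (process handle, config, ...)

// A VM from a management perspective: enough to send it requests and to
// kill/evict it.
struct Vm {
    vm_handle: u32, // currently the vsock cid; will change after the tty move
    req_sender: Sender<Request>,
    app: VmApp,
}

fn main() {
    let (tx, _rx) = channel::<Request>();
    let vm = Vm { vm_handle: 3, req_sender: tx, app: VmApp };
    // A response carries vm_handle back so we know which VM finished running.
    println!("created vm with handle {}", vm.vm_handle);
}
```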

For each function, there's a running list and an idle list, both Vec<Vm>. All running lists live in running_functions, a BTreeMap<String, Vec<Vm>>, and all idle lists in idle_functions.
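
So moving a VM between lists is just a pop/push between the two maps; something like this hypothetical helper (names follow the description above, but this is not the actual code):

```rust
use std::collections::BTreeMap;

struct Vm {
    vm_handle: u32,
}

// Per-function run/idle queues, keyed by function name.
type VmList = BTreeMap<String, Vec<Vm>>;

// Hypothetical helper: take a warm (idle) VM for `function`, if any,
// and move it onto the running list, returning its handle.
fn take_warm_vm(
    idle_functions: &mut VmList,
    running_functions: &mut VmList,
    function: &str,
) -> Option<u32> {
    let vm = idle_functions.get_mut(function)?.pop()?;
    let handle = vm.vm_handle;
    running_functions
        .entry(function.to_string())
        .or_insert_with(Vec::new)
        .push(vm);
    Some(handle)
}

fn main() {
    let mut idle: VmList = BTreeMap::new();
    let mut running: VmList = BTreeMap::new();
    idle.insert("lorempy".to_string(), vec![Vm { vm_handle: 3 }]);
    assert_eq!(take_warm_vm(&mut idle, &mut running, "lorempy"), Some(3));
    assert_eq!(take_warm_vm(&mut idle, &mut running, "lorempy"), None);
    println!("warm reuse works");
}
```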

Now the aws_schedule() function is the lambda scheduling algorithm. Its logic is in this basecamp post. I haven't implemented the evict step yet because I need to make sure a VM can actually run first =)
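
Since the algorithm itself lives in the basecamp post, this is only a guess at the skeleton it implies (reuse a warm VM, else cold-start if resources allow, else evict); every name here is a stand-in:

```rust
// Stand-in types only; the real decision order is in the basecamp post
// and in aws_schedule() itself.
enum Placement {
    Warm(u32),  // reuse an idle VM, identified by vm_handle
    Cold,       // boot a fresh VM for this function
    NeedsEvict, // no free resources: evict idle VMs first (not yet implemented)
}

fn schedule(idle_vm: Option<u32>, free_mem_kb: u64, vm_mem_kb: u64) -> Placement {
    match idle_vm {
        Some(handle) => Placement::Warm(handle),
        None if free_mem_kb >= vm_mem_kb => Placement::Cold,
        None => Placement::NeedsEvict,
    }
}

fn main() {
    match schedule(None, 512_000, 128_000) {
        Placement::Warm(h) => println!("reuse warm vm {}", h),
        Placement::Cold => println!("boot a new vm"),
        Placement::NeedsEvict => println!("must evict an idle vm first"),
    }
}
```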

tan-yue commented 5 years ago

It looks like the panic comes from firecracker/sys_util/src/terminal.rs:33, where tcsetattr gets called. Based on this page, EIO means:

The process group of the writing process is orphaned, the calling thread is not blocking SIGTTOU, and the process is not ignoring SIGTTOU
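
If that diagnosis is right, one common workaround (just a sketch, assuming the libc crate; not necessarily the right fix for firecracker itself) is to have the process ignore SIGTTOU before touching the terminal:

```rust
// Sketch only: ignore SIGTTOU so that tcsetattr() from an orphaned process
// group doesn't fail with EIO, per the condition quoted above.
fn ignore_sigttou() {
    unsafe {
        libc::signal(libc::SIGTTOU, libc::SIG_IGN);
    }
}

fn main() {
    ignore_sigttou();
    // ... terminal setup (tcgetattr/tcsetattr) would follow here
}
```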

LedgeDash commented 5 years ago

@alevy Help! =)

Yue and I spent around an hour trying to debug the issue but couldn't figure it out. The error seems to be around the terminal device. I checked the VmAppConfig and VmApp instances against the previous working version and they look exactly the same.

I'll keep digging into it. But could you please also take a look?

alevy commented 5 years ago

I’ll take a look today.

LedgeDash commented 5 years ago

The controller with the new scheduler is working now. Tested with concurrency limit = 1000, i.e., multiple running VMs per function. Resource allocation tracking and run/idle queue management all seem to work. The problem was just that the controller main thread sometimes exited too soon, even before the vsock connection managed to establish, so I added logic to wait until all requests finish (sketched after the list below). Next steps (should be able to finish tomorrow):

  1. eviction logic in the scheduler
  2. test with a new workload file that adds inter-arrival times (I already have a script that does the generation)
  3. automatically kill all VMs when the controller exits (lower priority)
  4. integrate with the tty implementation
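
For reference, the wait-until-done logic is conceptually just this (a sketch with a hypothetical per-request completion channel, not the actual controller code):

```rust
use std::sync::mpsc::channel;
use std::thread;

fn main() {
    // Hypothetical: each finished request reports its vm_handle back.
    let (resp_tx, resp_rx) = channel::<u32>();
    let total_requests = 3;

    for handle in 0..total_requests {
        let tx = resp_tx.clone();
        thread::spawn(move || {
            // Stand-in for a VM running a request.
            tx.send(handle).unwrap();
        });
    }
    drop(resp_tx); // keep only the workers' clones alive

    // Don't let main exit early: block until every request has a response.
    for _ in 0..total_requests {
        let done = resp_rx.recv().expect("a worker hung up early");
        println!("vm {} finished", done);
    }
}
```
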
alevy commented 5 years ago

@LedgeDash I would integrate with tty (i.e. rebase from master) first, then merge this, then do the rest.

LedgeDash commented 5 years ago

Booting from snapshot is now tested with the controller: controller branch scheduler, commit 79603aba; firecracker commit 76fed4bb.

The workload includes only two functions, lorempy and loremjs. Concurrency limits of 1, 10, and 100 were tested.

Command line options added (flag parsing sketched after the list):

  1. --debug: controls whether to close VMs' stdout
  2. --snap: controls whether to boot from snapshots. This allows us to have only one function config yaml file, with the load_dir field specified.
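
If it helps review, the flag parsing is conceptually along these lines (illustrated with clap 2.x as an assumption; only the --debug and --snap names come from the actual change):

```rust
use clap::{App, Arg};

fn main() {
    let matches = App::new("controller")
        .arg(Arg::with_name("debug")
            .long("debug")
            .help("control whether to close VMs' stdout"))
        .arg(Arg::with_name("snap")
            .long("snap")
            .help("boot from snapshots (uses the load_dir field in the function config yaml)"))
        .get_matches();

    let debug = matches.is_present("debug");
    let boot_from_snapshot = matches.is_present("snap");
    println!("debug={} snap={}", debug, boot_from_snapshot);
}
```
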
LedgeDash commented 5 years ago

I think this is ready to merge. I'm working on workloads with inter-arrival times added while waiting for a final quick review.