LedgeDash closed this pull request 5 years ago.
A lot of the changes in this PR are now unrelated to the title, so it's hard to review. I have some substantive comments about the workload generator, and I will mark up the rest as I see things, but I think we should merge this ASAP.
Yes, I think it'd be great to merge this ASAP. I've updated the title since the PR content has changed quite significantly. That said, the workload generator code is separate enough from the measurement- and plotting-related code that we could just treat this PR as two PRs combined.
I'm updating the PR description (the first comment) to include my new changes.
I've updated the workload generator, incorporating @alevy 's comments. Inspired by @alevy 's insight that we could get rid of `num_invocations` altogether, I changed the interface to specifying a spike window for each function. Details are in my updated PR description (first comment). I believe this makes the generator more flexible, and it nicely turns the workload YAML file into a chronological outline of the workload, which is more intuitive and easier for our experiments' purposes.

I also got rid of arrival rates, since they were a source of confusion, and switched directly to `mu`, the average inter-arrival time. All time values are now in ms.
workload generator (updated)
Generate a workload by specifying `mu` (the average inter-arrival time, in ms), `start time`, and `end time` for each function in a workload characteristics YAML file.
Call it with: `python3 generator.py <workload.yaml> <request.json>`
`<workload.yaml>` is the input; `<request.json>` is the output request file.

Here's an example `example_workload.yaml`:
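The example file itself didn't survive in this copy; a hypothetical sketch of what it might contain, based on the description that follows (the key names and the `mu` values are assumptions, not the actual format):

```yaml
# Hypothetical example_workload.yaml -- key names and mu values are assumptions.
functions:
  - name: loremjs
    mu: 100        # average inter-arrival time during the spike, in ms
    start: 0       # spike window start, in ms
    end: 5000      # spike window end, in ms
  - name: lorempy2
    mu: 100
    start: 5000
    end: 10000
```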
This YAML generates a workload of 2 functions, `loremjs` and `lorempy2`. `loremjs`'s spike begins at timestamp=0ms and ends at timestamp=5000ms; `lorempy2`'s spike begins at timestamp=5000ms and ends at timestamp=10000ms. During non-spike periods (that is, [5000, 10000] for `loremjs` and [0, 5000] for `lorempy2`), functions have a default average inter-arrival time of 1000ms (i.e., an average of 1 req/sec), currently hard-coded.

If you want a function to have multiple spikes, you would for example do this:
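The multi-spike example is also missing from this copy. One plausible shape, assuming the generator simply accepts repeated entries for the same function (which would fit the chronological-listing convention below, but is an assumption):

```yaml
# Hypothetical: two spike windows for the same function, as two entries.
functions:
  - name: loremjs
    mu: 100
    start: 0
    end: 5000
  - name: loremjs
    mu: 100
    start: 10000
    end: 15000
```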
It is desirable to list functions in chronological order, so that the workload YAML file essentially outlines the timeline of the workload.
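The spike-window behavior described above could be sketched as follows. This is not the actual `generator.py`: the helper name, the exponential inter-arrival distribution, and the parameter names are all assumptions; only the spike/non-spike `mu` switch and the hard-coded 1000ms default come from the description.

```python
import random

# Hard-coded default average inter-arrival time outside a spike (ms),
# per the PR description.
DEFAULT_MU_MS = 1000.0

def gen_timestamps(spike_start, spike_end, mu, total_ms, seed=None):
    """Generate request timestamps (ms) for one function.

    Inside [spike_start, spike_end) the average inter-arrival time is
    `mu`; outside it, DEFAULT_MU_MS. Exponential inter-arrival times
    are an assumption about the generator's distribution.
    """
    rng = random.Random(seed)
    t = 0.0
    timestamps = []
    while t < total_ms:
        cur_mu = mu if spike_start <= t < spike_end else DEFAULT_MU_MS
        t += rng.expovariate(1.0 / cur_mu)  # mean inter-arrival = cur_mu ms
        if t < total_ms:
            timestamps.append(t)
    return timestamps

# Example: a spike with mu=100ms over [0, 5000) in a 10-second workload;
# requests should be roughly 10x denser inside the spike window.
ts = gen_timestamps(0, 5000, 100.0, 10_000, seed=42)
```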
Time measurement
In order to measure utilization over time (think of a plot of utilization where the x-axis is time), I added code in the controller that outputs timestamps when certain events happen. These events are: VM boot start (calling the `.run()` function on `VmAppConfig`), the controller receiving the tty-ready message from a VM, request sent, response received, VM eviction start, and VM eviction finished.

All results are output as a JSON string to a `./measurement/measurement-<start_time>-<end_time>.json` file, where `<start_time>` is the timestamp of the experiment start (i.e., right before the first request is scheduled) and `<end_time>` is the timestamp of the experiment finish.

In the JSON string, you will see something like this:
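The sample output itself is missing from this copy; a hypothetical sketch of its shape, based on the description that follows (all numbers are made up, and any key other than `"request/response timestamps"` is an assumption):

```json
{
  "request/response timestamps": {
    "10": [1000, 1012, 2000, 2015, 3000, 3011],
    ...
  },
  ...
}
```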
(The `...` just means there are more elements, omitted to save space.) For example, the `"request/response timestamps"` object has the timestamps of all requests and responses from and to all VMs. The `"10"` is a VM id; the 6 numbers are 3 pairs of (request_send_timestamp, response_received_timestamp), in order. With all this information, we know exactly what happened throughout the entire duration of a workload. The `plot.py` script takes this information to calculate and plot utilization.

Next steps (need the Cgroup CPU share feature completed first):
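The per-VM flat-list layout described above (alternating send/receive timestamps) can be turned into explicit pairs like so; this is an illustrative sketch, not the actual `plot.py` code, and the function name is hypothetical:

```python
import json

def request_response_pairs(measurement):
    """Turn each VM's flat [send, recv, send, recv, ...] list into a
    list of (request_send_timestamp, response_received_timestamp) pairs."""
    pairs = {}
    for vm_id, stamps in measurement["request/response timestamps"].items():
        assert len(stamps) % 2 == 0, "timestamps must come in send/recv pairs"
        pairs[vm_id] = list(zip(stamps[0::2], stamps[1::2]))
    return pairs

# Made-up measurement data in the described shape (VM id "10", 6 numbers).
measurement = {"request/response timestamps": {"10": [0, 12, 100, 115, 200, 208]}}
print(request_response_pairs(measurement))
# {'10': [(0, 12), (100, 115), (200, 208)]}
```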