spcl / serverless-benchmarks

SeBS: serverless benchmarking suite for automatic performance analysis of FaaS platforms.
https://mcopik.github.io/projects/sebs/
BSD 3-Clause "New" or "Revised" License

Benchmark synthesis #24

Open mcopik opened 4 years ago

mcopik commented 4 years ago

We need the following:

mcopik commented 1 year ago

We made progress on this issue on branch meta-benchmarks and in PR #59. However, there is still work to be done - any input and help towards synthesizing benchmarks are welcome!

veenaamb commented 1 year ago

I will research this and post an update here shortly.

AtulRajput01 commented 1 year ago

I am working on it.

octonawish-akcodes commented 6 months ago

@mcopik Can I get some guidance on this issue?

mcopik commented 6 months ago

@octonawish-akcodes Hi! The overall idea is to synthetically create Python/JS functions that perform CPU computations, memory accesses, and I/O accesses. Given a simple configuration, it should generate a function that performs selected actions with a specified frequency and intensity, e.g., calling some well-established CPU benchmark (like matrix-matrix multiplication), using our interface to make storage calls, etc.

The next step will be to make these functions more varied, e.g., with different loop complexity.
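A minimal sketch of the idea described above, assuming a simple dict-based config; all function names here are hypothetical illustrations, not SeBS APIs:

```python
import random

# Illustrative component workloads (hypothetical names, not SeBS code).
def cpu_work(size: int, reps: int) -> float:
    """Matrix-matrix multiplication as a well-established CPU benchmark."""
    a = [[random.random() for _ in range(size)] for _ in range(size)]
    b = [[random.random() for _ in range(size)] for _ in range(size)]
    acc = 0.0
    for _ in range(reps):
        for i in range(size):
            for j in range(size):
                acc += sum(a[i][k] * b[k][j] for k in range(size))
    return acc

def memory_work(mb: int) -> int:
    """Allocate `mb` megabytes and touch every page to force real accesses."""
    buf = bytearray(mb * 1024 * 1024)
    for offset in range(0, len(buf), 4096):
        buf[offset] = 1
    return len(buf)

def synthesize(config: dict):
    """Glue selected components into one function, with the frequency and
    intensity of each action taken from the config."""
    steps = []
    if "cpu" in config:
        c = config["cpu"]
        steps.append(lambda: cpu_work(c["size"], c["reps"]))
    if "memory" in config:
        m = config["memory"]
        steps.append(lambda: memory_work(m["mb"]))

    def handler(event=None):
        return [step() for step in steps]

    return handler
```

Storage and I/O components would plug in the same way, calling the SeBS storage interface instead of local primitives.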

octonawish-akcodes commented 6 months ago

Can you also provide me with some resources and target files for the start?

mcopik commented 6 months ago

@octonawish-akcodes I'd look into what can be reused from the prior PR: https://github.com/spcl/serverless-benchmarks/pull/59/files

I wouldn't try to merge new updates into it as it's quite difficult. Instead, I'd cherry-pick some of the files you find useful.

MinhThieu145 commented 5 months ago

Hello @mcopik,

Thank you for outlining the specific benchmarks you're interested in: computation in FLOPS/instructions, memory allocation, storage read/write, and disk read/write. I've reviewed our current benchmark suite, and here is what I found:

Could you please provide more details on the specific improvements or additional metrics you're looking to incorporate? I currently have a few ideas, but I would appreciate any additional sources I can look into.

mcopik commented 5 months ago

@MinhThieu145 I think the best way forward would be to add a generator that accepts a simple config - CPU ops, memory ops, storage ops - and synthesizes a single Python function out of the components you just described. Do you think it's feasible?

I'd like to hear about other ideas you might have here :)

octonawish-akcodes commented 5 months ago

@mcopik So did you mean creating simple functions that perform the operations you listed? Can't we reuse the functions proposed in PR #59?

mcopik commented 5 months ago

@octonawish-akcodes Yes, please feel free to reuse the code snippets.

@octonawish-akcodes @MinhThieu145 Since you are both interested in the issue, it might be beneficial to coordinate.

MinhThieu145 commented 5 months ago

Hi @mcopik ,

I'm leaning towards writing functions similar to the current, pre-written ones, rather than creating a dynamic generator. From my POV,

With the pre-written functions, here are some things that could be made dynamic and that I think would be helpful:

This way, the pre-written functions become more dynamic. Looking forward to your feedback and any further ideas.

MinhThieu145 commented 5 months ago

Hi @octonawish-akcodes,

I totally agree with the idea of using the functions we already have. Right now, I'm trying out some different kinds of functions for the new serverless-benchmark issue mentioned here: SEBS New Serverless Benchmarks. But really, the main idea is the same as before.

I'm all for making the most of what we've got and seeing how we can adapt those functions to fit our new needs. Let's keep in touch about how the testing goes!

mcopik commented 5 months ago

@MinhThieu145 Yes, we should reuse those functions. What I meant by the generator is that we should glue together the functions that already exist in the PR, and synthesize functions that combine different behaviors, e.g., a function that does compute, then some I/O accesses, etc.

It should be reproducible - if a user specifies the same config, they should receive exactly the same function and observe the same behavior :)
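One way to get that reproducibility is to derive every random decision from a seed computed from the config itself, so identical configs always produce identical functions and inputs. A small sketch under that assumption (hypothetical helper names, not SeBS code):

```python
import hashlib
import json
import random

def seed_from_config(config: dict) -> int:
    """Derive a deterministic seed from a config: serialize it canonically
    (sorted keys) and hash it, so equal configs give equal seeds."""
    canonical = json.dumps(config, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(canonical).digest()[:8], "big")

def deterministic_payload(config: dict, n: int = 8) -> list:
    """Generate input data from a config-derived RNG; the same config
    always yields the same payload."""
    rng = random.Random(seed_from_config(config))
    return [rng.random() for _ in range(n)]
```

Any loop counts, data sizes, or access patterns chosen by the generator would draw from the same config-seeded RNG.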

MinhThieu145 commented 5 months ago

Thank you for your input, @mcopik. I've been exploring how functions work together and found a really helpful paper, ServerlessBench: ServerlessBench Paper

This paper dives deep into how serverless functions interact, which is just what we need for our project. Based on this and our existing setup, here's what I'm thinking:

Improving Our Benchmarks

We have four experiments in our toolkit right now (Current Experiments), but they don't fully cover how functions flow and work with each other. The ServerlessBench paper suggests focusing on areas like:

These areas could really enhance how we measure and understand our benchmarks.

Bringing in Function Flows

ServerlessBench outlines two ways to orchestrate functions:

Ideas for Our Benchmarks

Thinking About AWS Tools

I'm actively developing these concepts and would greatly value your insights, particularly regarding the use of Step Functions and ECR. If you have any additional resources or suggestions, please feel free to share. I’m eager to hear your perspective and incorporate your feedback into our ongoing work

octonawish-akcodes commented 5 months ago

@mcopik I raised PR #194 here; have a look.

mcopik commented 5 months ago

@MinhThieu145 Thanks - yes, I know the paper, and it complements our Middleware paper in some aspects.

We already have communication performance benchmarks (unmerged branch using FMI benchmarks), and the invocation-overhead benchmark covers startup latency. Regarding the stateless execution and resource efficiency, I'm happy to hear proposals in this aspect.

Workflows - we have a branch with results from a paper in submission, and I hope we will be able to merge it soon :) It supports Step Functions, Durable Functions, and Google Cloud Workflows. I don't think we have a workflow covering typical website use cases, but adding something like this could be a good idea; there are also similar ideas for website-based workflows in #140.

ECR and containers - this is a feature we definitely need, but we should also support it on other platforms where possible (Azure also supports this).

entiolliko commented 5 months ago

@mcopik So to have a small recap we need the following type of computation:

  1. CPU Computation - functions that make heavy use of the CPU, e.g., MMM with specified sizes
  2. GPU Computation - we could use ML training or Torch tensor multiplication on the GPU. The config file could specify the model to be used or the size of the tensors to be multiplied
  3. Memory Allocation - the memory/python/function.py benchmark does this by allocating numpy arrays
  4. Disk Read/Write - we could dynamically generate some random text, write it to disk, then read it back and measure the speed

Since most of these are already implemented, we could add support for a config that lets you select, for example, how many loops you want for the MMM, or that offers more fine-grained control. Any suggestions?
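Item 4 of the recap could look roughly like this; a sketch only, with a hypothetical function name, not existing SeBS code:

```python
import os
import tempfile
import time

def disk_work(nbytes: int) -> float:
    """Write `nbytes` of random data to a temporary file, read it back,
    and return the elapsed time in seconds."""
    data = os.urandom(nbytes)
    fd, path = tempfile.mkstemp()
    start = time.perf_counter()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force the write to actually hit disk
        with open(path, "rb") as f:
            readback = f.read()
    finally:
        os.remove(path)
    elapsed = time.perf_counter() - start
    if readback != data:
        raise RuntimeError("readback mismatch")
    return elapsed
```

The `fsync` matters: without it, the write may be timed against the page cache rather than the disk.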

As for your suggestion to have a single config file: when you say CPU ops, do you mean the number of FLOPs we perform; by memory ops, the amount of data we store and use in RAM; and by storage ops, the number of bytes we read from and write to disk? For example, given a specific config file, we would generate a Python script with three function calls inside, one for CPU ops, one for memory ops, and one for storage ops:

fun1(input1)  # CPU intensive
fun2(input2)  # memory intensive
fun3(input3)  # disk intensive
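The script-generation step described above could be sketched as a simple template fill, producing source text with exactly those three calls; all names (`cpu_intensive`, `memory_intensive`, `disk_intensive`) are placeholders, not agreed-upon APIs:

```python
# Template for a generated benchmark function; {{}} escapes a literal dict.
TEMPLATE = """\
def handler(event):
    results = {{}}
    results["cpu"] = cpu_intensive({cpu_ops})
    results["memory"] = memory_intensive({memory_mb})
    results["storage"] = disk_intensive({storage_bytes})
    return results
"""

def generate_source(config: dict) -> str:
    """Emit Python source for a single function with one call per
    resource class, parameterized by the config."""
    return TEMPLATE.format(
        cpu_ops=config["cpu_ops"],
        memory_mb=config["memory_mb"],
        storage_bytes=config["storage_bytes"],
    )
```

The generated text can be written into a benchmark directory and deployed like any other SeBS function.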

mcopik commented 5 months ago

@entiolliko @octonawish-akcodes @MinhThieu145 Linear algebra as a replacement for the CPU benchmark is a good idea; we can use LAPACK for that, and it can be quite flexible. I'd treat GPU support as the next feature, since it is a different category.

Yes, this is what I meant. I think the ideal result would be a single serverless function that executes the specified configuration of computations (which can, of course, be composed of many local functions).
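For the LAPACK-based CPU kernel, NumPy's `linalg` routines (which are backed by LAPACK) are one convenient entry point; a sketch, assuming NumPy is available in the benchmark image and with an illustrative function name:

```python
import numpy as np

def lapack_cpu_work(size: int, reps: int) -> float:
    """CPU-bound kernel built on a LAPACK routine via NumPy: repeatedly
    solve a well-conditioned dense linear system of the given size."""
    rng = np.random.default_rng(42)  # fixed seed, so runs are reproducible
    # Adding size * I makes the matrix diagonally dominant, hence well-conditioned.
    a = rng.random((size, size)) + size * np.eye(size)
    total = 0.0
    for _ in range(reps):
        x = np.linalg.solve(a, rng.random(size))  # LAPACK dgesv under the hood
        total += float(x.sum())
    return total
```

Problem size and repetition count map naturally onto the "intensity" and "frequency" knobs of the planned config.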