some questions about the ChainNode

wwt02 commented 2 years ago

Dear Austin Pahl, a few days ago I read your paper, "Sequoia: Enabling Quality-of-Service in Serverless Computing", and I ran into a question while trying to reproduce its experiments. The following code is from producer_workload_burst.py:

    node3_publish_fan7 = ChainNode(
        function=publish_compile,
        nodeID=3,
        children=[],
        lastNodeIDs=[2, 3, 4, 5, 6, 7, 8],
        chainFunctionIDs=[200, 201],
        args={
            "s3_input": "repos/fzf.tar.gz",
            "s3_output": "releases/fzf_test/fzf_arm64",
            "s3_bucket": "code-publish-bucket",
            "arch": "arm64",
        })

I think the value of chainFunctionIDs should be [200, 201, 201, 201, 201, 201, 201, 201], but I'm not sure whether that is right. I would appreciate your help.

AustinRP commented 2 years ago

Hi Wentao, thanks for reaching out. In this code, chainFunctionIDs contains a set of all of the unique function IDs that appear anywhere in the chain. Each unique ID should only appear once in chainFunctionIDs, so that is why we use [200, 201] instead of [200, 201, 201, ...]. We used this set later inside of serial_consumer.py to help implement some of the fair share rate limiting algorithms.

Let me know if that clarifies it or if you'd like to discuss more.

wwt02 commented 2 years ago

Dear Austin Pahl, if I accept your explanation, I have another question. The following code is from poisson_producer.py:

    node6_c3 = ChainNode(
        function=lambda4,
        nodeID=6,
        children=[],
        lastNodeIDs=[6, 2],
        chainFunctionIDs=[1, 2, 3, 4, 4, 4],
        args={})

By that explanation, I think the value of chainFunctionIDs should be [1, 2, 3, 4]. I would appreciate your help.

AustinRP commented 2 years ago

Good catch. It's been over a year since I've worked on this project, so I am refreshing my memory too 🙂. From taking another look at our fair share logic, the number of occurrences of each function ID is counted and used to proportionally weight each function limit. For example, if we have chainFunctionIDs=[1, 2, 2], Function 1 will get 1/3 of the overall limit and Function 2 will get 2/3 of the overall limit.

I believe there were two slightly different use cases we were testing in these examples.

The case where we have chainFunctionIDs=[200, 201] is splitting the overall limit equally between the two function types even though there are significantly more 201s that appear in the chain. I'm not confident, but I think we were trying to stress test Sequoia's rate limiting.
The case where we have chainFunctionIDs=[1,2,3,4,4,4] was trying to split the overall limit according to the number of appearances in the chain. So Function 4 gets 1/2 of the overall limit and Functions 1 to 3 each get 1/6.
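
To make the arithmetic concrete, here is a minimal sketch of the proportional weighting described above. It is not the actual code from serial_consumer.py: the function name `fair_share_limits` and the `overall_limit` parameter are hypothetical, and it only assumes that each function's share equals its number of occurrences in chainFunctionIDs divided by the list length.

```python
from collections import Counter
from fractions import Fraction


def fair_share_limits(chain_function_ids, overall_limit=1):
    """Split overall_limit across function IDs in proportion to how often
    each ID occurs in chain_function_ids.

    Hypothetical helper for illustration only; the real Sequoia logic
    may differ in its details.
    """
    counts = Counter(chain_function_ids)   # occurrences per function ID
    total = sum(counts.values())           # length of the ID list
    # Fractions keep the shares exact (1/3 rather than 0.333...).
    return {fid: overall_limit * Fraction(n, total) for fid, n in counts.items()}


# The three lists discussed in this thread:
weighted_small = fair_share_limits([1, 2, 2])
equal_split = fair_share_limits([200, 201])
weighted_chain = fair_share_limits([1, 2, 3, 4, 4, 4])
```

Under that assumption, `equal_split` gives functions 200 and 201 half of the limit each, while `weighted_chain` gives Function 4 half of the limit and Functions 1 to 3 one sixth each, which matches the two cases above.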

Hope that helps!

CU-BISON-LAB / sequoia
