
🖥🖥🖥🖥CNCF Community Cluster
https://cncf.io/cluster

Cluster for testing RethinkDB with added NUMA-awareness #38

Closed thedrow closed 7 years ago

thedrow commented 7 years ago

If you are interested in filing a request for access to the CNCF Community Cluster, please fill out the details below.

If you are just filing an issue, ignore/delete those fields and file your issue.

First Name

Omer

Last Name

Katz

Email

omer.drow@gmail.com

Company/Organization

N/A

Job Title

N/A

Project Title

Adding NUMA-aware allocator to RethinkDB's buffer cache

What existing problem or community challenge does this work address? ( Please include any past experience or lessons learned )

Databases have high memory throughput even in clusters of modest size. Data is usually cached in memory, and the faster it can be read, the faster the database is. On NUMA-enabled hardware, databases can swap in a way that greatly degrades performance. See https://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/

Briefly describe the project

The project will use the memkind library to introduce NUMA-aware allocations to RethinkDB's buffer cache.

Do you intend to measure specific metrics during the work? Please describe briefly

Yes, I'm going to use the YCSB benchmark provided here. It will verify that everything still works normally and show whether performance improves.
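As a rough illustration of how such a comparison could be scripted, the sketch below parses the `[SECTION], Metric, value` summary lines that YCSB prints at the end of a run and reports the relative change in overall throughput between two runs. The function names are hypothetical, and any numbers fed to it are up to the caller; nothing here is measured RethinkDB data.

```python
def parse_ycsb_summary(text):
    """Parse YCSB summary lines ('[SECTION], Metric, value') into a dict.

    Keys are (section, metric) tuples; lines that do not match are skipped.
    """
    metrics = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3 or not parts[0].startswith("["):
            continue
        section, metric, value = parts
        try:
            metrics[(section.strip("[]"), metric)] = float(value)
        except ValueError:
            # Non-numeric value: not a summary metric we can compare.
            continue
    return metrics


def throughput_change_percent(baseline_text, candidate_text):
    """Relative change in overall throughput, candidate vs. baseline, in percent."""
    key = ("OVERALL", "Throughput(ops/sec)")
    base = parse_ycsb_summary(baseline_text)[key]
    cand = parse_ycsb_summary(candidate_text)[key]
    return (cand - base) / base * 100.0
```

A positive result means the candidate build (for example, one with the NUMA-aware allocator) achieved higher throughput than the baseline on that workload. A single pair of runs is noisy, so repeated runs would be needed before drawing conclusions.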

Which members of the CNCF community and/or end-users would benefit from your work?

The RethinkDB project and its users

Is the code that you’re going to run 100% open source? If so, what is the URL or URLs where it is located?

Yes. https://github.com/rethinkdb/rethinkdb

Do you commit to publishing your results and upstreaming the open source code resulting from your work? Do you agree to do this within 2 months of cluster use?

Yes, absolutely. A PR will be opened against the RethinkDB repository.

Will your testing involve containers? If not, could it? What would be entailed in changing your processes to containerize your workload?

I do not need containers. I may also test RethinkDB in containers to see how my changes affect the performance of containerized RethinkDB.

Are there identified risks which would prevent you from achieving significant results in the project?

Not that I know of.

Have you requested CNCF cluster resources or access in the past? If ‘no’, please skip the next three questions.

No

Please list project titles associated with prior CNCF cluster usage.

Please list contributions to open source initiatives for projects listed in the last question. If you did not upstream the results of the open source initiative in any of the projects, please explain why.

Have you ever been denied usage of the cluster in the past? If so, please explain why.

Please state your contributions to the open source community and any other relevant initiatives

I maintain Celery, a task manager written in Python; MongoEngine, which maps Mongo documents to Python objects; and oauthlib, which is the standard OAuth1/2 implementation for Python.

Number of nodes requested (minimum 20 nodes, maximum ~400 nodes).

20

Preferred node flavor, ratio if mixed (compute, storage, any).

compute

Duration of request (minimum 24 hours, maximum 2 weeks).

2 weeks if possible. I'd like to repeatedly refine the performance and ensure we're getting the most out of the hardware.

With or Without an Operating System (restricted to CNCF predefined OS and versions as in README)?

Ubuntu.

How will this testing advance cloud native computing (specifically containerization, orchestration, microservices or some combination).

Any other relevant details we should know about while preparing the infrastructure?

I need both machines that are NUMA-enabled and machines that are not.
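To tell the two kinds of machines apart once they are provisioned, one option is to count the `nodeN` directories that Linux exposes under `/sys/devices/system/node`. The following is a minimal sketch under that assumption; the function names are hypothetical, and a machine with a single node is treated here as "not NUMA" for benchmarking purposes.

```python
import os
import re


def count_numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Count the NUMA nodes listed as nodeN directories under sysfs_root.

    Returns 0 if the directory is missing (for example on a non-Linux host).
    """
    try:
        entries = os.listdir(sysfs_root)
    except (FileNotFoundError, NotADirectoryError):
        return 0
    return sum(1 for name in entries if re.fullmatch(r"node\d+", name))


def is_numa_machine(sysfs_root="/sys/devices/system/node"):
    """True if more than one NUMA node is visible, False otherwise."""
    return count_numa_nodes(sysfs_root) > 1
```

Calling `is_numa_machine()` with no arguments checks the host it runs on; the `sysfs_root` parameter exists only so the logic can be exercised against a fake directory tree.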

caniszczyk commented 7 years ago

we have a whole new process for this with CIL: https://www.cncf.io/community/infrastructure-lab/

if you're interested, please open up a new issue, there's a new issue template