cncf / cluster

🖥🖥🖥🖥CNCF Community Cluster
https://cncf.io/cluster

OVN data plane performance testing #26

Closed: russellb closed this issue 7 years ago

russellb commented 7 years ago

If you are interested in filing a request for access to the CNCF Community Cluster, please fill out the details below.

If you are just filing an issue, ignore/delete those fields and file your issue.

First Name

Russell

Last Name

Bryant

Email

rbryant@redhat.com

Company/Organization

Red Hat

Job Title

Senior Principal Software Engineer

Project Title

OVN Data Plane Testing

What existing problem or community challenge does this work address? ( Please include any past experience or lessons learned )

Networking data plane performance is very important, and it's often difficult to know what performance characteristics to expect with different combinations of hardware, software, and protocols in use.

Briefly describe the project

We aim to do some benchmarking of OVN (from the Open vSwitch community) using the Geneve protocol as compared to other solutions. This lab has Intel X710 NICs, which support both Geneve and VXLAN offload, making it an ideal place to do this work.

Do you intend to measure specific metrics during the work? Please describe briefly

Yes, we will gather networking data plane performance metrics.

Which members of the CNCF community and/or end-users would benefit from your work?

Anyone interested in overlay network based solutions, particularly those based on OVS, would benefit from the additional insights gained by this work.

Is the code that you’re going to run 100% open source? If so, what is the URL or URLs where it is located?

Yes. OVN is a part of Open vSwitch and the code is hosted at http://github.com/openvswitch/ovs. We aim to test OVN as configured by OpenStack (http://github.com/openstack/networking-ovn). We are also interested in testing with Kubernetes (http://github.com/openvswitch/ovn-kubernetes).

Do you commit to publishing your results and upstreaming the open source code resulting from your work? Do you agree to do this within 2 months of cluster use?

Yes.

Will your testing involve containers? If not, could it? What would be entailed in changing your processes to containerize your workload?

Testing of ovn-kubernetes would involve containers.

Are there identified risks which would prevent you from achieving significant results in the project ?

The main risks are any lab network or other hardware issues we encounter.

Have you requested CNCF cluster resources or access in the past? If ‘no’, please skip the next three questions.

No

Please list project titles associated with prior CNCF cluster usage.

Please list contributions to open source initiatives for projects listed in the last question. If you did not upstream the results of the open source initiative in any of the projects, please explain why.

Have you ever been denied usage of the cluster in the past? If so, please explain why.

Please state your contributions to the open source community and any other relevant initiatives

I have been contributing to various open source projects since 2004. In the last couple of years I have been working on OVN (Open Virtual Network), a new network virtualization system developed by the Open vSwitch community that has been integrated with multiple systems, including OpenStack and Kubernetes.

Number of nodes requested (minimum 20 nodes, maximum ~400 nodes).

20 (fewer is fine; probably 5 minimum)

Preferred node flavor, ratio if mixed (compute, storage, any).

any (primary requirement is the Intel X710 NIC, which appears to be in both compute and storage flavors)

Duration of request (minimum 24 hours, maximum 2 weeks).

2 weeks

With or Without an Operating System (restricted to CNCF predefined OS and versions as in README)?

CentOS 7, though ideally also with the option to re-provision the machines using an OpenStack deployment tool.

How will this testing advance cloud native computing (specifically containerization, orchestration, microservices or some combination).

OVN is a new network virtualization option that works with Kubernetes. Having a better understanding of the performance of OVN and its use of Geneve will help better inform choices and further development in this area.

Any other relevant details we should know about while preparing the infrastructure?

No

bprestonlf commented 7 years ago

+1

caniszczyk commented 7 years ago

LGTM cc: @cncf/intel-cluster-team

cncfclusterteam commented 7 years ago

Hi @russellb, we will provide you with 20 CentOS 7 nodes this week. Feel free to re-provision them later on. We'll let you know once the nodes are ready for use.

cncfclusterteam commented 7 years ago

We have one question - would you like the nodes to be distributed among any number of racks?

russellb commented 7 years ago

I'm not familiar with the lab layout, but I'd say my preference is to have them all on the same rack. Across multiple racks is fine too, if needed.

Thank you!

kimbarbel commented 7 years ago

On behalf of @cncf/intelclusterteam: The requested nodes are prepared. Details are in the shared Google Docs that were sent to you.

russellb commented 7 years ago

Thank you!

russellb commented 7 years ago

If someone else will be helping me, I assume we should get them their own VPN credentials?

Numan Siddique @numansiddique nusiddiq@redhat.com

kimbarbel commented 7 years ago

Hi, I am working on VPN credentials for @numansiddique. Meanwhile, Joe Talerico from Red Hat sent me a request to access the RH_20nodes_Access_for_CNCF spreadsheet in Google Docs. Do you authorize Joe Talerico to have access to that spreadsheet?

kimbarbel commented 7 years ago

On behalf of @cncf/intelclusterteam: VPN access for @numansiddique is set up. Details are in the shared Google Docs that were sent to @numansiddique.

numansiddique commented 7 years ago

Thanks for granting the access. It looks like the nodes are provisioned with Ubuntu. Is it possible to reprovision them with CentOS?

Thanks, Numan

cncfclusterteam commented 7 years ago

Hi Numan, we are sorry for the mistake. We will reprovision them with CentOS as soon as possible, but it might take until tomorrow. We'll let you know once they are ready.

numansiddique commented 7 years ago

Thank you. I will wait until then.

cncfclusterteam commented 7 years ago

@numansiddique we have already reprovisioned them. Please check if everything's fine.

russellb commented 7 years ago

Yes, I authorize Joe Talerico to have full access to this environment and related documentation.

numansiddique commented 7 years ago

Thanks for reprovisioning them. It looks fine.

Numan

On Thu, Dec 22, 2016 at 10:35 PM, cncfclusterteam wrote:

@numansiddique we already managed to reprovision them, please check if everything's fine.


russellb commented 7 years ago

We're making good progress in this environment. Thanks again!

Would it be possible to extend our allocation through to the end of January? I see in #20 that we're set for mid-January, but the rest of the month would give us more ability to do analysis on the results we're getting.

Thanks!

cncfclusterteam commented 7 years ago

@russellb we're glad to hear that you're making progress!

As mentioned in #28, we are able to push the return date back by a week, but until we are sure of the precise date for the maintenance window, we cannot guarantee that the environment will be stable until the end of January.

russellb commented 7 years ago

That works for me. Thank you!

russellb commented 7 years ago

Hello, @cncfclusterteam! We're still making very good use of this allocation. Thanks again!

I was curious if you had any update on the planned maintenance and when we should expect to end our work? I'm just wondering if we can safely plan out and execute another week's worth of work, or if we should be going day-by-day, or what's best.

Thanks!

cncfclusterteam commented 7 years ago

Hi @russellb, we are still investigating the timeline for the maintenance window, but we are certain that it will not happen before the end of next week, that is, before Feb 3rd. I will update the corresponding issue to make that clear for everyone.

russellb commented 7 years ago

@cncfclusterteam Thanks for the schedule update!

Is there a known network outage in the lab right now? We've had trouble connecting to our nodes in the last hour or so. It seemed to come back for a few minutes, but we're not able to connect again.

numansiddique commented 7 years ago

@cncfclusterteam We are unable to connect to the nodes. Any update on this would be very helpful.

Thanks

cncfclusterteam commented 7 years ago

@russellb @numansiddique thank you for reporting, we are investigating the issue. We are also sending you an email with the support e-mail address where we would like you to report future problems with the infrastructure.

Thank you!

cncfclusterteam commented 7 years ago

Hi @russellb,

We hope the time spent with the cluster has been productive. I am writing to inform you that we would like to clean up the nodes for the next tenants. Please let us know when we can return them to the free pool.

Please also notify us here when there are any follow-up articles available.

Thank you, CNCF Cluster Team

numansiddique commented 7 years ago

Hello CNCF team,

Thank you very much again. I am in the middle of one small test and will be done by tomorrow, i.e. the 21st. Please let me know if that is possible. If not, that's fine too :)

Thanks again. Numan

cncfclusterteam commented 7 years ago

Tomorrow will be fine, thanks for the quick response!

numansiddique commented 7 years ago

I am done with the testing. Thanks again. You can go ahead with the cleanup.

Numan