sdnfv / openNetVM

A high performance container-based NFV platform from GW and UCR.
http://sdnfv.github.io/onvm/
Other
263 stars 135 forks source link

DPDK version update #231

Closed Vivek-anand-jain closed 4 years ago

Vivek-anand-jain commented 4 years ago

Update ONVM to use the latest version of DPDK

Summary:

Usage:

This PR includes
Resolves issues [x] Fixes https://github.com/sdnfv/openNetVM/issues/226
Breaking API changes
Internal API changes
Usability improvements
Bug fixes
New functionality
New NF/onvm_mgr args
Changes to starting NFs
Dependency updates [x] Updates the DPDK version to v20.05
Web stats updates

Merging notes:

TODO before merging :

Test Plan:

Review:

Note:

onvm commented 4 years ago

In response to PR creation

CI Message

Aborting, need an authorized user to run CI

kevindweb commented 4 years ago

@onvm test again?

onvm commented 4 years ago

@onvm test again?

CI Message

Your results will arrive shortly

kevindweb commented 4 years ago

Ok, @Vivek-anand-jain thanks for working on this! Yes, the build took almost 10 minutes longer. I watched it, it took DPDK 10 minutes to compile. Not sure how, that's crazy. Also, I'm moving your upstream branch to develop (we usually push to develop first, not master). There didn't seem to be performance hiccups, but we need to extensively test this I think, because we only test speed_tester and bsc_monitor in CI.

kkrama commented 4 years ago

Glad to see that it at least passed the first step in terms of the simple test. It would be worthwhile to see how the rest of the ONVM environment with the different NFs works with this version of DPDK. If it does, it will be good.

On Sat, Jun 13, 2020 at 6:18 PM Kevin Deems notifications@github.com wrote:

Ok, @Vivek-anand-jain https://github.com/Vivek-anand-jain thanks for working on this! Yes, the build took almost 10 minutes longer. I watched it, it took DPDK 10 minutes to compile. Not sure how, that's crazy. Also, I'm moving your upstream branch to develop (we usually push to develop first, not master). There didn't seem to be performance hiccups, but we need to extensively test this I think, because we only test speed_tester and bsc_monitor in CI.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/sdnfv/openNetVM/pull/231#issuecomment-643703832, or unsubscribe https://github.com/notifications/unsubscribe-auth/AC6TUJOXF73KBH36VXIAZNTRWQQNVANCNFSM4N5GUPZA .

-- K. K. Ramakrishnan

kevindweb commented 4 years ago

@onvm did the updates work?

onvm commented 4 years ago

@onvm did the updates work?

CI Message

Your results will arrive shortly

Vivek-anand-jain commented 4 years ago

@kevindweb Thanks for initiating the build and running the tests. I also observe that DPDK is taking too much time to build. This might be because of some additional features are now enabled by default. We have to check the Release Notes to get more clarity.

kevindweb commented 4 years ago

@onvm this should be much faster now

onvm commented 4 years ago

@onvm this should be much faster now

CI Message

Your results will arrive shortly

twood02 commented 4 years ago

@Vivek-anand-jain can you summarize the changes you needed to make for this? It looks like almost all of it is just renaming structs. Were there any function API changes you had to deal with? Any setup issues other than changing the driver config?

Currently the linter is failing possibly because our original code had bad formatting. @bdevierno1 / @kevindweb -- does the linter only flag lines that were modified in this PR? Or does it check all files modified in the PR? If there are modified lines here that have linting errors we should fix them, but we don't need to go out of our way to fix all old formatting errors in these files. (I'd like this PR to just be about fixing the DPDK update, not format issues)

twood02 commented 4 years ago

@kevindweb - I'd like you to think about what tests we can run to verify correctness and performance stability.

@sreya519 - I'd like you to test this in your updated ubuntu 20.04 cloudlab profile (once you have the basics working)

bdevierno1 commented 4 years ago

Currently the linter is failing possibly because our original code had bad formatting. @bdevierno1 / @kevindweb -- does the linter only flag lines that were modified in this PR? Or does it check all files modified in the PR? If there are modified lines here that have linting errors we should fix them, but we don't need to go out of our way to fix all old formatting errors in these files. (I'd like this PR to just be about fixing the DPDK update, not format issues)

Yes the linter only picks up lines that have been modified.

kkrama commented 4 years ago

I'd like for us to have - 1) a set of tests that can verify that when someone brings a new NF in, it is safe and can co-exist with other NFs in a system that is already set up. This NF should be a good citizen (we don't care what it does internally) by only touching memory it is supposed to, not crash the system, interface with the NF manager correctly, can verify what parameters it needs in terms of batching with the Tx/Rx threads etc. 2) when we have a change to the ONVM infrastructure (DPDK interfacing, adding functionality in the NF Manager, scheduler etc. etc.), we have a set of NFs and NF chains that we can run a test suite of packet traces/pre-specified packet generation sequences with a pre-specified 'passing criteria', so that we can have a more automated way of embracing changes to the ONVM infrastructure. K. K.

On Tue, Jun 16, 2020 at 7:43 PM Benjamin De Vierno notifications@github.com wrote:

Currently the linter is failing possibly because our original code had bad formatting. @bdevierno1 https://github.com/bdevierno1 / @kevindweb https://github.com/kevindweb -- does the linter only flag lines that were modified in this PR? Or does it check all files modified in the PR? If there are modified lines here that have linting errors we should fix them, but we don't need to go out of our way to fix all old formatting errors in these files. (I'd like this PR to just be about fixing the DPDK update, not format issues)

Yes the linter only picks up lines that have been modified.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/sdnfv/openNetVM/pull/231#issuecomment-645113629, or unsubscribe https://github.com/notifications/unsubscribe-auth/AC6TUJI6WPKMHQ2KTQMIHJ3RXAUVTANCNFSM4N5GUPZA .

-- K. K. Ramakrishnan

Vivek-anand-jain commented 4 years ago

@Vivek-anand-jain can you summarize the changes you needed to make for this? It looks like almost all of it is just renaming structs. Were there any function API changes you had to deal with? Any setup issues other than changing the driver config?

@twood02 I updated the PR description to summarize all the changes. I had to change the rte_pdump_init(NULL); to rte_pdump_init();. I didn't observe any setup issue other than driver config that I addressed by updating install.sh.

Do you want me to fix linter issues? Do we use some specific tools (e.g. clang-format, clang-tidy, uncrustify) for fixing the linter issues? Can we have something like make lint or make lint --fix to run linter operation locally?

sreyanalla commented 4 years ago

Hi @Vivek-anand-jain! Which version of Ubuntu have you been using to run/test this PR?

kevindweb commented 4 years ago

@twood02 in the past we have ignored long lines. The shell issue is a normal shellcheck thing, just need to put quotes around $RTE_SDK. The C files depend on your call, it might not look as clean to split the lines with \ to get the line length down, but that would be the only issues involved. If you don't want format issues involved, @bdevierno1 do you know if we can still force merge and avoid the linters for this PR?

kevindweb commented 4 years ago

@twood02 I just saw your second comment about performance. I'm going to reference you here from our last dpdk update over a year ago. CI checked the basics, we have the linter checks already done. Here's a few suggestions for things researchers can run for sanity and performance: 1) Run probably 6-7 speed_testers at once, constantly turning them on and off to make sure memory wasn't bleeding dry 2) Run Pktgen/bsc_monitor on preferably 2 ports at once (something CI doesn't do, we only do one there) 3) Test scaling, making sure it can scale up and down -- also, run the firewall NF with config files 4) Test a large NF chain that uses multiple different NF types

These are all things that should be done in a completely random order, validating we don't have major performance issues or hidden bugs. The manager should be able to handle all of this, as well as some unusual arguments (not just the normal ./go.sh 0,1,2,3 3 0xF0 -s stdout). When we run like this, we could potentially see drops, segfaults, and just in general reach all the parts of the manager that might not be hit during an average run.

Vivek-anand-jain commented 4 years ago

Hi @Vivek-anand-jain! Which version of Ubuntu have you been using to run/test this PR?

Hi @sreya519 , Sorry I missed this message, I used the Ubuntu 14.04 (one in Cloudlab profile).

bdevierno1 commented 4 years ago

I was noticing some errors after updating DPDK when running the makefile in openNetVM/tools/Pktgen/pktgen-dpdk to install pktgen. @sreyanalla did you have any issues?

sreyanalla commented 4 years ago

I was noticing some errors after updating DPDK when running the makefile in openNetVM/tools/Pktgen/pktgen-dpdk to install pktgen. @sreyanalla did you have any issues?

@bdevierno1 I haven’t seen any issues specifically. Just so I can try to replicate this issue, is it a result of building pktgen?

bdevierno1 commented 4 years ago

I was noticing some errors after updating DPDK when running the makefile in openNetVM/tools/Pktgen/pktgen-dpdk to install pktgen. @sreyanalla did you have any issues?

@bdevierno1 I haven’t seen any issues specifically. Just so I can try to replicate this issue, is it a result of building pktgen?

Yeap. I believe I just ran make in the pktgen-dpdk directory.

twood02 commented 4 years ago

@sreyanalla is making a cloudlab profile to test this + Ubuntu 20 update and will give it to @EthanBaron14 and @catherinemeadows for testing with their 3 node tutorial

twood02 commented 4 years ago

@Vivek-anand-jain -- thanks for making this Pull Requst! We've included your commits in #249 and will merge that since it also includes pktgen and ubuntu updates.