Running a NF using go.sh is not pinning the right core

sdnfv / openNetVM

A high performance container-based NFV platform from GW and UCR.

http://sdnfv.github.io/onvm/

Running a NF using go.sh is not pinning the right core #157

Closed madhura-a closed 5 years ago

madhura-a commented 5 years ago

Bug Report

Current Behavior

Step 1: Run the manager (./go.sh 0,1,2 0 0xf8 -s stdout -a 0x7f0000000)

Step 2: Run any network function using go.sh (./go.sh 1 -d 1)

Step 3: Check the core to which the NF is pinned

The output of the taskset command shows that the NF is pinned to core 0.

Expected behavior/code

Ideally, it should bind to core 4. Running the NF using start_nf.sh works as expected.

Environment

onvm version: latest

twood02 commented 5 years ago

@dennisafa can you look into this?

@madhura-a can you tell us what version of ONVM you are using?

dennisafa commented 5 years ago

Hi @madhura-a, I'm having a bit of trouble replicating your issue. I'm using the master branch version of ONVM. The 0xf8 bitmask registers cores 3-7 for NF usage.
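To make the mask arithmetic concrete, here is a minimal sketch that decodes a hexadecimal core mask into core indices. The helper name `cores_from_mask` is made up for illustration and is not part of openNetVM.

```python
def cores_from_mask(mask):
    """Return the indices of the bits set in a core bitmask, lowest first."""
    cores = []
    bit = 0
    while mask:
        if mask & 1:          # lowest bit set means core `bit` is enabled
            cores.append(bit)
        mask >>= 1
        bit += 1
    return cores


# 0xf8 is binary 11111000, so bits 3 through 7 are set.
print(cores_from_mask(0xF8))  # [3, 4, 5, 6, 7]
```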

Running the simple_forward NF with the command you specified, I am seeing that it binds to core 3 as expected.

Please tell me a bit more about your environment, and if these are the steps you took to get to the bug.

madhura-a commented 5 years ago

Now just check whether the NF is actually running on core 3 or not. I have used the command taskset -c -p

madhura-a commented 5 years ago

I think there is a problem with onvm_manager stats also. It always displays the first registered core.

dennisafa commented 5 years ago

> I think there is a problem with onvm_manager stats also. It always displays the first registered core.

By design, NFs are assigned to the first available core. Using htop, I verified that the simple_forward NF is assigned to the proper core.

Perhaps try htop and see if you get similar results.

dennisafa commented 5 years ago

I also tried with taskset. Please use taskset -c -p -a.

Edit: A pthread_create call occurs when an NF is initialized (see onvm_nflib_run: https://github.com/sdnfv/openNetVM/blob/master/onvm/onvm_nflib/onvm_nflib.c#L520 ), and that thread is then bound to its proper core. Without -a, taskset reports only the main thread, which is why you see core 0 by default.
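The per-thread behaviour described here can be reproduced outside ONVM. The sketch below is not openNetVM code. It is a Linux-only illustration in which a worker thread pins itself to one core while the main thread keeps its original affinity, which is the situation where taskset without -a reports only the main thread. The function name `demo_thread_affinity` and the choice of the last allowed core are assumptions made for the example.

```python
import os
import threading


def demo_thread_affinity():
    """Pin a worker thread to one core and report both threads' affinity.

    Linux only: os.sched_setaffinity(0, ...) applies to the calling thread.
    """
    allowed = sorted(os.sched_getaffinity(0))
    target = allowed[-1]  # arbitrary choice: the last core we are allowed on
    result = {}

    def worker():
        # Inside the thread, pid 0 means "the calling thread".
        os.sched_setaffinity(0, {target})
        result["worker"] = sorted(os.sched_getaffinity(0))

    t = threading.Thread(target=worker)
    t.start()
    t.join()

    # The main thread's affinity is untouched by the worker's call.
    result["main"] = sorted(os.sched_getaffinity(0))
    return target, allowed, result


target, allowed, result = demo_thread_affinity()
print("main thread cores:  ", result["main"])
print("worker thread cores:", result["worker"])
```

While such a process is alive, `taskset -c -p <pid>` would show only the main thread's affinity, and adding `-a` would list every thread, including the pinned one.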

madhura-a commented 5 years ago

Taskset with -a option displayed the core correctly. Thank you.

But in the stats, even though I changed the core to another value using the -l option, it still displays the core as 3. The command I used was ./start_nf.sh speed_tester -l 5 -- -s -r 1 -- -d 1

dennisafa commented 5 years ago

You must specify the manually assigned core flag -m: -l 5 -- -s -m -r 1 -- -d 1

madhura-a commented 5 years ago

It's working. Thank you.

twood02 commented 5 years ago

@dennisafa @koolzz should we make it so that if you use -l, then -m is required?

dennisafa commented 5 years ago

@twood02 yes, I agree. -l defaults to 0 when the ./go.sh script is called (see htop screenshot)

We should change that - if -l is used then we assume the user is manually assigning a core, thus -m should be used as well.

koolzz commented 5 years ago

@twood02, @dennisafa Maybe we should just print a warning. Technically, a user may want to use -l only to dedicate the core on which the NF starts (e.g. when core 0 should be left completely unused because it's running intensive computation), since the NF runs some basic DPDK startup before moving to a different core. Although maybe that's overthinking it.
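A warning-only check along these lines could look roughly like the sketch below. It is a hypothetical illustration, not the actual openNetVM argument parser. The helper name `check_core_flags` and the warning text are made up, and the three "--"-separated argument groups are assumed from the command lines quoted in this thread.

```python
def check_core_flags(args):
    """Return a warning string if -l is used without -m, else None.

    `args` is everything after the NF name, e.g. the list form of
    "-l 5 -- -s -r 1 -- -d 1". Groups are split on "--": the first group
    is assumed to carry -l, the second is assumed to carry -m.
    """
    groups = [[]]
    for arg in args:
        if arg == "--":
            groups.append([])
        else:
            groups[-1].append(arg)

    first_group = groups[0]
    second_group = groups[1] if len(groups) > 1 else []

    if "-l" in first_group and "-m" not in second_group:
        return ("warning: -l was given without -m; "
                "the NF may not stay on the core passed to -l")
    return None


print(check_core_flags("-l 5 -- -s -r 1 -- -d 1".split()))
print(check_core_flags("-l 5 -- -s -m -r 1 -- -d 1".split()))
```

Returning a message instead of exiting keeps the case koolzz describes usable: -l alone is still accepted, and the user is only told that the core may change.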