Closed koolzz closed 4 years ago
I'm having some trouble replicating this bug. Using latest version of dev branch, running manager with and speed_tester with the commands used above, I am getting no prints on either the manager or NF side. Where's the port parsing happening? Having trouble finding where that print statement is.
@dennisafa If I recall correctly this print happens because when we run speed tester it tries to create packets using some mac addr from the ports array. The print would happen in the setup portion of speed tester launch.
Let me know if you were able to replicate @dennisafa. Probably don't solve this though, it's reserved for first time contributors
Let me know if you were able to replicate @dennisafa. Probably don't solve this though, it's reserved for first time contributors
wasn't able to : ( not getting any invalid prints on either the mgr or NF. @kevindweb can you sanity check me?
I am instantiating a new cloudlab instance to get this again, but I have received this a lot actually, both in cloudlab and nimbus. I never looked into why, but I'll try to replicate.
I mean I know why this is happening. Not too difficult to replicate, basically you just run normal manager + speed_tester with no dpdk devices bound (easily produced with a new cloudlab node). I got it on my first try, then bound a device, it went away. I unbound that device, the error came back.
This is just a dpdk issue. Inside /dpdk/lib
I tried grep -nr -I . -e "Invalid port id %d"
(using -I to ignore binary files, because they're not helpful)
./librte_kni/rte_kni.c:447: RTE_LOG(ERR, KNI, "Invalid port id %d\n", port_id);
./librte_kni/rte_kni.c:467: RTE_LOG(ERR, KNI, "Invalid port id %d\n", port_id);
Here's what we need to do, if the user doesn't have a DPDK bound interface, and they try to set a port, we should stop them, or at least show a warning. I tried digging into the dpdk code, but this solution would be easier than changing any actual logic.
Hi,
Could I have a try at this?
Would adding extra checks in openNetVM/onvm/go.sh before starting the manager be a suitable place for this? I was thinking of adding a check to verify if the environment variable ONVM_NIC_PCI
has been set first, and if yes, check if the kernel driver in use is one of the supported DPDK drivers as mentioned in the dpdk-devbind.py script.
Is there any simpler/better way of doing this?
Hi,
Could I have a try at this?
Would adding extra checks in openNetVM/onvm/go.sh before starting the manager be a suitable place for this? I was thinking of adding a check to verify if the environment variable
ONVM_NIC_PCI
has been set first, and if yes, check if the kernel driver in use is one of the supported DPDK drivers as mentioned in the dpdk-devbind.py script.Is there any simpler/better way of doing this?
Sorry I didn't see this. That sounds like a great idea. We don't want to stop the user from running without a NIC everytime. We only want it to stop when the user tries to use a specific number of ports without them being available. If the user runs go.sh with 2 ports, both NICs should be bound for example.
This is just a dpdk issue. Inside
/dpdk/lib
I triedgrep -nr -I . -e "Invalid port id %d"
(using -I to ignore binary files, because they're not helpful)./librte_kni/rte_kni.c:447: RTE_LOG(ERR, KNI, "Invalid port id %d\n", port_id); ./librte_kni/rte_kni.c:467: RTE_LOG(ERR, KNI, "Invalid port id %d\n", port_id);
You seem to have missed an '=' in the grep statement. The actual output we get is:
Invalid port_id=0
The issue seems to be the call to rte_eth_macaddr_get
while creating pkts (nf_setup
) in scaling and speed_tester NFs.
I think removing that call in the NFs and just faking a MAC address would be the best thing to do here. Would that be fine?
The tough thing is that we only want to do your suggestion if there is no nic bound. So we really want to do something like "try" to rte_eth_macaddr_get
and if it fails, set a fake MAC address. This is because if it would have failed, it means there is no NIC bound. If there is a NIC bound, we might want to use -m <dest_mac_address>
to send out, and therefore cannot use a fake MAC as the source.
We could do something in the example go.sh
to stop speed_tester from using a destination MAC if there is no NIC bound? Not sure the best way to go about that though.
Ah sorry didn't notice the -m
option. It is actually possible to 'try' to get the mac address.
Based on how the dpdk lib throws the error, we could check if a NIC is bound or not by:
if (rte_eth_dev_is_valid_port(0)) {
rte_eth_macaddr_get(0, &ehdr->s_addr);
}
I tested this and I no longer get the Invalid port error output. We'd also need an else condition here to fake the MAC address if a NIC is not bound. I hope I'm not missing anything.
Yup, sounds like a great idea. The others may correct me if I'm wrong, but if we look at the rte_eth_macaddr_get uses in the repo, multiple examples use it, not just speed_tester/scaling. We might want to make a macro called onvm_get_macaddr
for example that checks beforehand and if it fails, assigns the fake address? Instead of just changing it in speed_tester, we change all the occurrences in the repo.
The else statement should assign a dummy address, which can come from this article. I think any MAC with the prefix 02
is unassigned in the vendor prefix list.
So, onvm_macaddr_get
would check if a NIC is bound and if yes, return the MAC address, else return a fake safe MAC address?
What would be a good place to add this?
Maybe onvm_nflib/onvm_common
? We want it to be really quick because it's called in pkt_handler
in load_balancer.c
, which is why I suggested the macro.
This issue describes a simple bug/improvement for the OpenNetVM code, and is a great way for YOU to get involved with improving this open source project! If you want to fix it, just create a pull request with the fix as detailed in the Contributing document. If you have questions, post here!
Bug Report
Current Behavior Currently if onvm_mgr runs with 0 ports and the Scaling or Speed Tester NF is used we will get an annoying print:
Expected behavior/code There is no reason for us to even parse the actual port, we should just have a fake mac and stuff.
Steps to reproduce onvm_mgr ->
./go.sh 0,1,2,3 0 0xf0 -s stdout -a 0x7f000000000
scaling_example_nf ->./go.sh 1 -d 1 -n 3
Possible Solution Fake the mac if no ports enabled, or even fake the mac always