faucetsdn / daq

DEPRECATED -- The DAQ (Device Automated Qualification) framework is no longer in use, supported, or maintained. It is here for archival purposes only.
Apache License 2.0

Connection problems with ethernet adapter and v1.5/1.5.1 #493

Open noursaidi opened 4 years ago

noursaidi commented 4 years ago

I've been having network issues using my built-in ethernet adapter to connect to a test device using v1.5/1.5.1. The connected device does not obtain an IP address and does not have internet connectivity. This occurs after a DAQ run is initiated, and persists even after DAQ is aborted and the system is rebooted.

Whilst DAQ is running, it hangs at the start of the main event loop, at the log message "System port 1 on dpid is active True".

The only way I have found to restore connectivity is to run bin/net_clean, immediately after which the device regains network connectivity. I've also tried rebooting multiple times, connecting/disconnecting, and removing and modifying the wired connection settings in nm-connection-editor.

I've reverted back to 1.4 to test (using git reset), and did not encounter this issue.

My laptop is running Ubuntu 18.04. I've tested connecting devices running Ubuntu Server, Raspbian and Windows 10, none of which get an internet connection.

The output of bin/techsupport is attached: techsupport.zip (https://github.com/faucetsdn/daq/files/4809510/techsupport.zip)

grafnu commented 4 years ago

Two things going on here, not sure exactly what the problem is.

First, I'm confused by the parts of your question when you talk about the device having an internet connection. What, exactly, do you mean by this and what are you expecting? It sounds like you're trying to get some expected behavior outside of DAQ (you talk about the device obtaining an IP address, and the problem persisting after DAQ is aborted). And then there's also running nm-connection-editor. What are you trying to do (like, why are you doing this)? My guess is you are trying to do too much and maybe causing some problems by telling Debian to do something with the adapter outside of what DAQ is doing. At the very least, what happens when DAQ isn't running doesn't matter, and you shouldn't have to fiddle with the connection manager or anything else to fix DAQ.

Second, it's likely that something got messed up with the startup sequence for what DAQ is doing. Looking at the logs, it's starting up a faux device on the ethernet adapter you supplied, as well as trying to connect it externally to the switch. Can you try changing your config to look something more like:

    interfaces:
      enp3s0f1:
        port: 1

There's basically one line in there that says "opt:"; try changing it to "port: 1". The new setup treats "opt" as a trigger to start up the faux container, while "port" is the trigger for connecting to an external device (the port number is just the internal port it's connected on, so the value 1 should be fine).
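To make the distinction concrete, here is a minimal Python sketch of how a per-interface entry might be classified. The function name `classify_interface` and the labels it returns are hypothetical, made up for illustration; this only restates the rule described above and is not DAQ's actual code. Checking "port" before "opt" when both are present is also an assumption.

```python
def classify_interface(settings):
    """Return what a single interface entry would trigger.

    Hypothetical helper: restates the rule from the thread, not DAQ's real logic.
    """
    if "port" in settings:
        # "port": connect the adapter to an external (physical) device.
        return "external"
    if "opt" in settings:
        # "opt": start up the faux container on this interface.
        return "faux"
    return "unknown"


# Same shape as the suggested config: interfaces -> enp3s0f1 -> port: 1
config = {"interfaces": {"enp3s0f1": {"port": 1}}}
roles = {name: classify_interface(cfg)
         for name, cfg in config["interfaces"].items()}
print(roles)  # {'enp3s0f1': 'external'}
```

An entry that still uses "opt" would come back as "faux", which matches the logs showing a faux device being started on the adapter.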

Cheers, Trevor


noursaidi commented 4 years ago

Thanks Trevor for the prompt response.

I tried with ’port’ : 1 and that fixed it for me. I had to run bin/net_clean first, though. Thank you - all working now. I’ll update the Wiki so there’s an example of using an ethernet adapter, as the current example uses the .conf format, which is now outdated. I had that configuration with the ‘opt’ because that’s what worked for me on 1.4 and previous versions.

To also answer some of the questions you had:

First, I'm confused by the parts of your question when you talk about the device having an internet connection. What, exactly, do you mean by this and what are you expecting?

  • Just to clarify, my laptop running DAQ did not lose internet connection (through WIFI), this still worked
  • I’m sharing the WIFI internet connection on my ethernet port. When I plug a device into my laptop, it should connect to the internet. The device should receive an IP address (which would be reported by running ifconfig on the device).
  • After upgrading to DAQ v1.5 and running DAQ once, this stopped happening at all (unless I run bin/net_clean)
  • After upgrading to DAQ v1.5, I was no longer able to run any tests

running nm-connection-editor. What are you trying to do (like, why are you doing this)? My guess is you are trying to do too much and maybe causing some problems by telling Debian to do something with the adapter outside of what DAQ is doing.

These were some troubleshooting steps I took before I realised it was something DAQ had done. I thought something had gone wrong with my network adapter or network settings. I remember that when I first set up DAQ I had to set the ethernet connection to be “shared with other computers” to get it to work, so I was repeating this.

I’ve also taken a look at net_clean and put pauses between each action it performs to see which step fixed it for me. After cleaning docker images (which are reported to be):

daqf/faucet
daqf/gauge
daqf/aardvark

The “running” flag on ifconfig on the device disappears for a second and comes back. Sometimes the device will get an IP and internet connection back at this stage, but not always. If not, it seems to happen after the “cleaning ovs” step.
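The "pause between each action" approach can be sketched generically in Python: run the cleanup steps one at a time and probe connectivity after each. Everything here is hypothetical and simulated (the step functions, the `state` dict, and the probe are stand-ins, not the contents of bin/net_clean); in real use each action would run one cleanup command and the probe would check whether the device has an address.

```python
def first_fixing_step(steps, probe):
    """Run steps in order; return the name of the first step after which
    probe() reports success, or None if no step fixes it."""
    for name, action in steps:
        action()
        if probe():
            return name
    return None


# Simulated environment: connectivity only returns after the second step.
state = {"link_ok": False}


def clean_images():
    pass  # stand-in for the docker cleanup; changes nothing in this simulation


def clean_ovs():
    state["link_ok"] = True  # stand-in for the OVS cleanup restoring the link


steps = [("cleaning docker images", clean_images), ("cleaning ovs", clean_ovs)]
print(first_fixing_step(steps, lambda: state["link_ok"]))  # cleaning ovs
```

Since the observed behaviour was intermittent (sometimes connectivity returned after the first step), a real run would need to be repeated several times before drawing conclusions.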

grafnu commented 4 years ago

Ok -- great. Yah, something changed with how the automatic startup things were configured and your particular case wasn't caught in the testing. Fortunately sounds like it's an easy fix!

The .conf syntax should still work -- it's just that it would need to be interfaces.XXXXX.port=2 rather than interfaces.XXXXX.opt= -- same meaning, different syntax!
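As an illustration of "same meaning, different syntax", the following Python sketch turns dotted `key=value` lines into the nested structure the YAML form expresses. The parser `parse_flat_conf` is a hypothetical helper written for this example, not DAQ's config loader, and it assumes keys contain no dots other than the level separators.

```python
def parse_flat_conf(lines):
    """Turn 'a.b.c=value' lines into a nested dict (illustrative only)."""
    result = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, _, value = line.partition("=")
        parts = key.strip().split(".")
        node = result
        for part in parts[:-1]:
            # Walk down, creating intermediate levels as needed.
            node = node.setdefault(part, {})
        node[parts[-1]] = value.strip()  # values are kept as strings
    return result


nested = parse_flat_conf(["interfaces.XXXXX.port=2"])
print(nested)  # {'interfaces': {'XXXXX': {'port': '2'}}}
```

The resulting dict has the same interfaces / interface-name / port nesting as the YAML snippet earlier in the thread.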

So, what's likely going on is that when DAQ is running it will remove the network interface from the "normal" network namespace of the Debian host. So, nothing about the normal setup matters, and while the interface is in the DAQ namespace it doesn't use the normal stuff. It's basically in two different worlds. Running net_clean "fixes" it because it makes sure that the interface isn't in the DAQ world anymore. It is a somewhat complicated world: what DAQ is doing is "technically legal" according to linux and the basics of operations, but it's "unconventional", and so things like Debian will sometimes try to do things to "fix" what isn't broken! Hopefully once it's stable these problems don't come up too much.
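One way to observe the "two different worlds" effect from the host side: on Linux, an interface that has been moved into another network namespace no longer appears in the host's interface list. The sketch below checks for that using Python's standard `socket.if_nameindex()`. The helper names are hypothetical, the interface name `enp3s0f1` is taken from the config earlier in this thread, and whether DAQ moves the interface exactly this way is the guess stated above, not something this code verifies.

```python
import socket


def visible_interfaces():
    """Names of the interfaces visible in the current network namespace."""
    return [name for _index, name in socket.if_nameindex()]


def is_visible(ifname, names=None):
    """True if ifname is in the given list (default: the current namespace)."""
    if names is None:
        names = visible_interfaces()
    return ifname in names


# False here would be consistent with the adapter having been moved out of
# the host namespace (or with the adapter simply not existing on this machine).
print(is_visible("enp3s0f1"))
```

If the adapter is missing from the host's list while DAQ is not running, that would fit the explanation that a cleanup step is needed to bring it back.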

Cheers, Trevor
