niksu / pax

Framework for prototyping network elements
Apache License 2.0
4 stars 1 forks source link

Refactor and extend NAT with UDP support #12

Closed TMVector closed 8 years ago

TMVector commented 8 years ago

This PR is a neatened and improved version of #9.

Refactor the NAT to reflect the approach of an applications software engineer. Also add support for UDP, timeouts, TCP state, and TIME_WAIT.

niksu commented 8 years ago

Thanks for tidying up the PR, I'll make another pass through the code to give any remaining feedback. I'm interested in the workflow you followed -- how did you go about tidying it up? Did you fuse commits together and/or make new ones, or rearranged existing ones? Was cherrypick sufficiently good for your purposes or did you use a different tool?

niksu commented 8 years ago

Another workflow-related question: how did you go about testing the features added in this PR, to check that UDP, timeouts, and TCP state are handled well? Did you use a small set of system tests, or did you use fuzzing to test a larger space of parameters? Thanks!

TMVector commented 8 years ago

I cherrypicked the commits that were okay from the first PR, but since the commits weren't very granular, I thought it best to focus on fixing the problems than isolating changes into separate commits. I checked out the working directory from #9 and made my changes, and pretty much committed changes as I normally would.

I'm afraid testing was pretty limited. I used the tests that are in the python scripts, and while I was creating the tests I did a fair bit of manually playing around.

niksu commented 8 years ago

I did a quick pass so far and thought I'd send some high-level feedback. I'll do a closer pass soon. The code looks in great shape. On a pedantic note, I noticed that you removed trailing whitespace :) Please also ensure that there's a single space between def blocks in the Python code, and add the standard header to files that lack it, such as examples/Nat/TcpNAT.cs and examples/Nat/UdpNAT.cs.

Seeing that the classes in Nat\ have proliferated it would be very helpful to have a README.md file there that briefly describes the role of each class and how it relates to others (e.g., "class X uses class Y", "class X extends class Y", "class X contains class Y because...", etc), so one can more easily obtain an idea of the overall class structure.

The testing infrastructure in the Python script looks very sophisticated, it's great practice that you're developing that together with the code it tests! I'd say that feature-wise the NAT is at a great place. Thanks for restructuring it, extending it, and taking care of subtle cases to do with timers etc.

niksu commented 8 years ago

Quick question, could you confirm if all the points that were raised during the discussion of https://github.com/niksu/pax/pull/9 were addressed in the current PR? Thanks!

TMVector commented 8 years ago

I think I have addressed all your feedback from this and the last PR. Could you please check the description of this method - I have updated it, and I want to make sure it's clear what it is for from the comments in the code. Thanks :smiley:

niksu commented 8 years ago

Since the NAT is in its own directory under examples, it seems suitable to move nat_* files under that directory too. Currently this set consists of nat_wiring.son, nat_topo.py, and nat_scapy_tests.py.

niksu commented 8 years ago

On your setup does the mininet test show WARNING scapy test #1 failed. client 126, server 126 and WARNING scapy test #2 failed. client 126, server 126 too, in output from the following command?

$ sudo ./examples/nat_topo.py --no-X test
action: test, X_windows: False, hold_open: False
*** Creating network
*** Adding controller
*** Adding hosts:
in1 in2 nat0 out0 
*** Adding switches:

*** Adding links:
(nat0, in1) (nat0, in2) (nat0, out0) 
*** Configuring hosts
in1 in2 nat0 out0 
*** Starting controller
c0 
*** Starting 0 switches

Network started
Setting the gateway on in1 to nat0
Setting the gateway on in2 to nat0
Starting Pax NAT process on nat0:
  nat0> $ Bin/Pax.exe examples/nat_wiring.json examples/Bin/Examples.dll
Connecting from in1 to out0:
  out0> $ echo Hello, are you there? | netcat -l 12001 &
  in1> $ netcat -n 10.0.0.4 12001
  in1> RCV: 'Hello, are you there?'
Correct data received

Sending data via UDP from in1 to out0:
  out0> $ echo Yes, UDP | nc -u -l 12001 -q1
  in1> $ echo Hello, are you there UDP? | nc -u 10.0.0.4 12001 -q1
  out0> RCV: 'Hello, are you there UDP?'
  in1> RCV: 'Yes, UDP'
Correct data received on the server
Correct data received on the client

Scapy test #1
  out0> $ examples/nat_scapy_tests.py server 35001
  in1> $ sleep 1
  in1> $ examples/nat_scapy_tests.py client
  in1> $ echo $?
  out0> $ echo $?
WARNING scapy test #1 failed. client 126, server 126

Scapy test #2
  out0> $ examples/nat_scapy_tests.py server2 35002
  in1> $ sleep 1
  in1> $ examples/nat_scapy_tests.py client2
  in1> $ echo $?
  out0> $ echo $?
WARNING scapy test #2 failed. client 126, server 126
Show Pax output? (y/N)
niksu commented 8 years ago

I'm not sure if this is a more general issue, but mininet seems to break when I leave the CLI. I load the run() function as follows:

$ sudo ./examples/nat_topo.py --no-X run
action: run, X_windows: False, hold_open: False
*** Creating network
*** Adding controller
*** Adding hosts:
in1 in2 nat0 out0 
*** Adding switches:

*** Adding links:
(nat0, in1) (nat0, in2) (nat0, out0) 
*** Configuring hosts
in1 in2 nat0 out0 
*** Starting controller
c0 
*** Starting 0 switches

Network started
Setting the gateway on in1 to nat0
Setting the gateway on in2 to nat0
*** Starting CLI:
mininet> 

If I then press ^D I get:

*** Stopping 1 controllers
c0 
*** Stopping 3 links
...
*** Stopping 0 switches

*** Stopping 4 hosts
in1 in2 nat0 out0 
*** Done
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/usr/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "build/bdist.linux-x86_64/egg/mininet/cli.py", line 87, in <lambda>
    atexit.register( lambda: write_history_file( history_path ) )
IOError: [Errno 13] Permission denied
Error in sys.exitfunc:
Traceback (most recent call last):
  File "/usr/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "build/bdist.linux-x86_64/egg/mininet/cli.py", line 87, in <lambda>
IOError: [Errno 13] Permission denied

Do you get similar behaviour at your end?

niksu commented 8 years ago

Great, thanks! I think it's much clearer.

On 2016-07-18 18:25, TMVector wrote:

I think I have addressed all your feedback from this and the last PR. Could you please check the description of this [1] method - I have updated it, and I want to make sure it's clear what it is for from the comments in the code. Thanks 😃

You are receiving this because you commented. Reply to this email directly, view it on GitHub [2], or mute the thread [3].

*

Links:

[1] https://github.com/niksu/pax/pull/12/files#diff-3465279ee2070c9b5f37aefb4a0d1cfcR224 [2] https://github.com/niksu/pax/pull/12#issuecomment-233397561 [3] https://github.com/notifications/unsubscribe-auth/ABDTTlLJyRzU8p9u5IrF40yxt_W4ERCSks5qW7b-gaJpZM4JNj7N

http://www.cl.cam.ac.uk/~ns441/

TMVector commented 8 years ago

Regarding the Mininet/testing errors, I have not seen either before. I have fixed the shebang in the python scripts in case it helps with the 126 exit codes, but I don't know what would cause that. According to Google, 126 normally means that the command isn't executable, like /dev/null, so it could be a permissions issue?

I just tried testing over SSH from my laptop, and it worked okay, but I had to run chmod a+x examples/Nat/*.py first. I'm not sure how to fix that permanently though.

Aside from that, I think I have addressed all of your feedback - thanks very much for it :)

There are still a couple of things I would like to fix at some point:

niksu commented 8 years ago

Thanks, I'll retry running the commands involved in comments https://github.com/niksu/pax/pull/12#issuecomment-234261042 and https://github.com/niksu/pax/pull/12#issuecomment-234263211 and get back to you.

Regarding the fragility of the tests, perhaps take a look at the P4Paxos setup at https://github.com/usi-systems/p4paxos-demo/tree/master/p4paxos/bmv2 to enable us to test the Pax implementation of Paxos on that setup.

I think chmod a+x examples/Nat/*.py can be "committed" to the repo -- IIRC git also tracks rudimentary file permissions.

Finally, could you merge the current master branch this this PR's branch so the PR can be automatically mergeable via GitHub? I think the merge should be minor, probably involving a tiny conflict on Examples.csproj. Thanks again for your sustained effort on this, it's a great addition to the functionality and usability of Pax!

On 2016-07-26 15:01, TMVector wrote:

Regarding the Mininet/testing errors, I have not seen either before. I have fixed the shebang in the python scripts in case it helps with the 126 exit codes, but I don't know what would cause that. According to Google [1], 126 normally means that the command isn't executable, like /dev/null, so it could be a permissions issue?

I just tried testing over SSH from my laptop, and it worked okay, but I had to run chmod a+x examples/Nat/*.py first. I'm not sure how to fix that permanently though.

Aside from that, I think I have addressed all of your feedback - thanks very much for it :)

There are still a couple of things I would like to fix at some point:

  • The scapy tests are too fragile
  • The output of the scapy tests is only visible in the opened terminal windows, so you can't see it when using --no-X, or (vitally) if the error causes the terminal to close.

You are receiving this because you commented. Reply to this email directly, view it on GitHub [2], or mute the thread [3].

*

Links:

[1] http://tldp.org/LDP/abs/html/exitcodes.html [2] https://github.com/niksu/pax/pull/12#issuecomment-235276667 [3] https://github.com/notifications/unsubscribe-auth/ABDTTrqjDTI85fUHV-sTYoRnXP6YqpO4ks5qZhM2gaJpZM4JNj7N

http://www.cl.cam.ac.uk/~ns441/

niksu commented 8 years ago

Thanks for looking into https://github.com/niksu/pax/pull/12#issuecomment-234261042, I changed the permissions as you suggested:

$ git status
...
    modified:   examples/Nat/nat_scapy_tests.py
    modified:   examples/Nat/nat_topo.py
    modified:   examples/pax_mininet_node.py

where the change was visible as

$ git diff
diff --git a/examples/Nat/nat_scapy_tests.py b/examples/Nat/nat_scapy_tests.py
old mode 100644
new mode 100755
diff --git a/examples/Nat/nat_topo.py b/examples/Nat/nat_topo.py
old mode 100644
new mode 100755
diff --git a/examples/pax_mininet_node.py b/examples/pax_mininet_node.py
old mode 100644
new mode 100755

and this seems to have fixed the issue, so it's worth committing this permission change to the repo. This is the output I got:

$ sudo ./examples/nat_topo.py --no-X test
sudo: ./examples/nat_topo.py: command not found
pal-pursuit1:pax$ sudo ./examples/Nat/nat_topo.py --no-X test
action: test, X_windows: False, hold_open: False
*** Creating network
*** Adding controller
*** Adding hosts:
in1 in2 nat0 out0 
*** Adding switches:

*** Adding links:
(nat0, in1) (nat0, in2) (nat0, out0) 
*** Configuring hosts
in1 in2 nat0 out0 
*** Starting controller
c0 
*** Starting 0 switches

Network started
Setting the gateway on in1 to nat0
Setting the gateway on in2 to nat0
Starting Pax NAT process on nat0:
  nat0> $ Bin/Pax.exe examples/Nat/nat_wiring_test.json examples/Bin/Examples.dll
Connecting from in1 to out0:
  out0> $ echo Hello, are you there? | netcat -l 12001 &
  in1> $ netcat -n 10.0.0.4 12001
  in1> RCV: 'Hello, are you there?'
Correct data received

Sending data via UDP from in1 to out0:
  out0> $ echo Yes, UDP | nc -u -l 12001 -q1
  in1> $ echo Hello, are you there UDP? | nc -u 10.0.0.4 12001 -q1
  out0> RCV: 'Hello, are you there UDP?'
  in1> RCV: 'Yes, UDP'
Correct data received on the server
Correct data received on the client

Scapy test #1
  out0> $ examples/Nat/nat_scapy_tests.py server 35001
  in1> $ sleep 1
  in1> $ examples/Nat/nat_scapy_tests.py client
  in1> $ echo $?
  out0> $ echo $?
Scapy test #1 passed

Scapy test #2
  out0> $ examples/Nat/nat_scapy_tests.py server2 35002
  in1> $ sleep 1
  in1> $ examples/Nat/nat_scapy_tests.py client2
  in1> $ echo $?
  out0> $ echo $?
Scapy test #2 passed
Show Pax output? (y/N)

*** Stopping 1 controllers
c0 
*** Stopping 3 links
...
*** Stopping 0 switches

*** Stopping 4 hosts
in1 in2 nat0 out0 
*** Done
niksu commented 8 years ago

The problem described in https://github.com/niksu/pax/pull/12#issuecomment-234263211 still seems to be live, even with the permissions changed as mentioned in https://github.com/niksu/pax/pull/12#issuecomment-235276667 and https://github.com/niksu/pax/pull/12#issuecomment-235435849. Do you get similar behaviour, or is the shut-down tidy when you press ^D?

$ sudo ./examples/Nat/nat_topo.py --no-X run
action: run, X_windows: False, hold_open: False
*** Creating network
*** Adding controller
*** Adding hosts:
in1 in2 nat0 out0 
*** Adding switches:

*** Adding links:
(nat0, in1) (nat0, in2) (nat0, out0) 
*** Configuring hosts
in1 in2 nat0 out0 
*** Starting controller
c0 
*** Starting 0 switches

Network started
Setting the gateway on in1 to nat0
Setting the gateway on in2 to nat0
*** Starting CLI:
mininet>

(then I press ^D)

*** Stopping 1 controllers
c0 
*** Stopping 3 links
...
*** Stopping 0 switches

*** Stopping 4 hosts
in1 in2 nat0 out0 
*** Done
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/usr/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "build/bdist.linux-x86_64/egg/mininet/cli.py", line 87, in <lambda>
    atexit.register( lambda: write_history_file( history_path ) )
IOError: [Errno 13] Permission denied
Error in sys.exitfunc:
Traceback (most recent call last):
  File "/usr/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "build/bdist.linux-x86_64/egg/mininet/cli.py", line 87, in <lambda>
IOError: [Errno 13] Permission denied
TMVector commented 8 years ago

The shutdown is tidy when I run it. Perhaps try

sudo mn

and

sudo mn --custom examples/Nat/nat_topo.py --topo nat

and see if those error when you do ^D.

niksu commented 8 years ago

You're right, even sudo mn results in this issue on my setup. It seem to be a permission issue relating to ~/.mininet_history. Thank you for looking into it. Are there any outstanding issue? If not then we can merge this PR in.

TMVector commented 8 years ago

I believe that all the issues are resolved. I have done a small refactor of the TcpState class in an attempt to make the state transition clearer.

niksu commented 8 years ago

That's superb! Many thanks for this great contribution and the sustained effort it required.