Closed GoogleCodeExporter closed 9 years ago
So it seems PktsIn has been deprecated in Web100 and Web10G.
The server-side calculation for this is DupAcksIn/AckPktsIn.
When running Web100 2.5.30 for Linux 2.6.35 (possibly earlier versions too),
the problem is that AckPktsIn is no longer being returned, so the calculation
must be using a bogus value.
Looking at the kernel patch:
AckPktsIn == PktsIn - DataPktsIn;
I will work on a patch for this.
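For illustration only, here is a minimal C sketch of the calculation described above, assuming the raw counter values are already in hand. The struct and function names are hypothetical and are not taken from web100-util.c or from any patch in this thread; only the DupAcksIn/AckPktsIn ratio and the relationship AckPktsIn == PktsIn - DataPktsIn come from the comment.

```c
/* Hypothetical holder for the counters named in this thread. */
struct tcp_counters {
    unsigned long pkts_in;      /* PktsIn */
    unsigned long data_pkts_in; /* DataPktsIn */
    unsigned long dup_acks_in;  /* DupAcksIn */
};

/* Derive the ACK count from the other two counters, using the
 * relationship quoted from the kernel patch:
 * AckPktsIn == PktsIn - DataPktsIn.
 * Returns 0 if the counters are inconsistent, to avoid wrapping. */
unsigned long derived_ack_pkts_in(const struct tcp_counters *c)
{
    if (c->pkts_in < c->data_pkts_in)
        return 0;
    return c->pkts_in - c->data_pkts_in;
}

/* DupAcksIn / AckPktsIn, returning 0.0 when no ACKs were seen
 * so that the division is never by zero. */
double dup_ack_ratio(const struct tcp_counters *c)
{
    unsigned long acks = derived_ack_pkts_in(c);
    if (acks == 0)
        return 0.0;
    return (double)c->dup_acks_in / (double)acks;
}
```

With made-up values PktsIn = 1000, DataPktsIn = 200 and DupAcksIn = 40, derived_ack_pkts_in gives 800 and dup_ack_ratio gives 0.05.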
Original comment by rsanger...@gmail.com
on 28 Aug 2013 at 12:06
For context, M-Lab runs a patched 2.6.32 kernel with web100 2.5.27 (for kernel
2.6.32).
The patch, exactly as applied, is at the link below:
http://git.planet-lab.org/?p=linux-2.6.git;a=blob;f=linux-2.6-690-web100.patch;hb=refs/heads/rhel6-mlab
Original comment by stephen....@gmail.com
on 28 Aug 2013 at 12:14
Thanks for the patch you're using; it appears to be the same problem.
Correction to my previous comment: AckPktsIn has been deprecated, not PktsIn.
Original comment by rsanger...@gmail.com
on 28 Aug 2013 at 12:42
Does this patch to web100-util.c work?
Original comment by AaronMat...@gmail.com
on 28 Aug 2013 at 12:52
Attachments:
I've tested the patch locally and it appears to be working; commit at will.
Testing on my local machine, I noticed that my earlier statement (that Web100
won't return AckPktsIn) wasn't correct. It seems it will, although a warning
about accessing a deprecated variable is printed every time.
That said, this might not hold true for all versions of Web100.
So I'm guessing that the M-Lab setup might not have AckPktsIn listed in their
web100_variables file (-f option). Can you confirm this, Stephen?
Either way, we shouldn't be relying on deprecated variables.
Original comment by rsanger...@gmail.com
on 28 Aug 2013 at 10:02
I'm not sure I understand the question.
The file attached is the content of /proc/web100/header. It includes AckPktsIn
prefixed with '_'. (Is that significant?)
And, ndtd is started with only these arguments:
--log_dir $SLICERSYNCDIR/ --snaplog --tcpdump --cputime --multiple --max_clients=40
"-f" ( -f, --file variable_FN - specify alternate 'web100_variables' file) is
not used.
Is there a problem? Or, a preferable set of options?
Original comment by stephen....@gmail.com
on 31 Aug 2013 at 4:23
Attachments:
Looks like the '_' prefix is what Web100 adds to deprecated variables.
When '-f' isn't specified, a default path is used, which is
/usr/local/ndt/web100_variables. Failing that, run the server with -d added
and it should print out the location:
Variables file = <location>
This file is a list of Web100 variables which the server collects and sends
back to the client. Check whether AckPktsIn is in that list.
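If it helps, the check could be sketched in C roughly as follows. This is not NDT code: the function name is hypothetical, and it assumes the variables file holds one variable name per line with lines shorter than 255 characters.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if `name` appears as a whole line in the stream, 0 otherwise.
 * Assumption: one variable name per line, each line under 255 characters.
 * Trailing whitespace and line endings are ignored. */
int variable_listed(FILE *fp, const char *name)
{
    char line[256];

    while (fgets(line, sizeof line, fp) != NULL) {
        /* Strip trailing newline, carriage return, spaces, and tabs. */
        size_t len = strlen(line);
        while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r' ||
                           line[len - 1] == ' '  || line[len - 1] == '\t'))
            line[--len] = '\0';

        if (strcmp(line, name) == 0)
            return 1;
    }
    return 0;
}
```

Open the file with fopen() on whatever path the server reports, then call variable_listed(fp, "AckPktsIn"). Plain grep on the file does the same job.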
Original comment by rsanger...@gmail.com
on 1 Sep 2013 at 2:08
Aha!
Yes, AckPktsIn & AckPktsOut are in that file. The file used by m-lab is the
default included in the NDT source package (also attached). These two also
appear in /proc/web100/header with the '_' prefix. Of the other variables
prefixed with '_' in /proc/web100/header, none are in the web100_variables file.
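The comparison above can be sketched in C as below, purely as an illustration. It takes the names as in-memory string arrays so that it assumes nothing about the on-disk layout of /proc/web100/header; the function name is hypothetical and the only rule taken from this thread is that a leading '_' marks a deprecated variable.

```c
#include <stddef.h>
#include <string.h>

/* Counts how many names in `wanted` (e.g. from web100_variables) appear in
 * `header_names` (e.g. from /proc/web100/header) with a leading '_', which
 * this thread identifies as the marker for deprecated variables. */
size_t count_deprecated_in_use(const char **header_names, size_t n_header,
                               const char **wanted, size_t n_wanted)
{
    size_t hits = 0;

    for (size_t i = 0; i < n_wanted; i++) {
        for (size_t j = 0; j < n_header; j++) {
            const char *h = header_names[j];
            /* Compare the header name without its '_' prefix. */
            if (h[0] == '_' && strcmp(h + 1, wanted[i]) == 0) {
                hits++;
                break; /* count each wanted name at most once */
            }
        }
    }
    return hits;
}
```

For the situation described here, where only AckPktsIn and AckPktsOut are both listed in web100_variables and '_'-prefixed in the header, the count would be 2.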
Should those variable names be removed? What are the consequences of altering
the file? I.e., would any clients break?
Original comment by stephen....@gmail.com
on 1 Sep 2013 at 10:42
Attachments:
You should keep AckPktsIn & AckPktsOut on that list. Although they are
deprecated in Web100, ndt still expects them to be there.
The reason I asked was that when I ran web100clt against your server the list
of Web100 variables returned was missing AckPktsIn, so I wondered if this was
missing from your web100_variables file. The interesting thing here is that
AckPktsOut (which is also deprecated and seems identical to AckPktsIn) is being
returned although AckPktsIn is not.
When I try the same thing locally I see AckPktsIn returned.
I don't know the reason for this difference in behavior.
Do you mind applying Aaron's fix_dup_ack_calculation.patch so we can see if
this fixes the calculation? I wouldn't expect this to make AckPktsIn be
returned, but it should fix the calculation.
Thanks for your help,
Original comment by rsanger...@gmail.com
on 1 Sep 2013 at 11:51
I was curious whether you'd been able to test out the patch.
Original comment by AaronMat...@gmail.com
on 11 Sep 2013 at 7:58
Is the patch for the server? Is it helpful to apply this patch to the server
on mlab for testing?
Original comment by solt...@opentechinstitute.org
on 11 Sep 2013 at 8:12
Yep, it's a patch for the web100srv. It should hopefully fix the out-of-order
packet percentage issue we were seeing on mlab.
Original comment by AaronMat...@gmail.com
on 11 Sep 2013 at 8:32
Applied:
http://ndt.iupui.mlab1.nuq0t.measurement-lab.org:7123/
Original comment by solt...@opentechinstitute.org
on 11 Sep 2013 at 9:14
I'm no longer seeing absurdly high out-of-order numbers, mainly because I'm not
seeing any :) Ryan, could you test and see if you see any? If it works for
you, we're probably good for an -rc release.
Original comment by AaronMat...@gmail.com
on 12 Sep 2013 at 12:30
I have retested and it appears to be working as expected. I'm seeing
out-of-order packets anywhere from 0% to 10% of the time over a couple of tests
which looks correct. This matches with the Web100 variables being returned.
Looks like the patch worked.
Original comment by rsanger...@gmail.com
on 12 Sep 2013 at 10:10
Excellent, closing.
Original comment by AaronMat...@gmail.com
on 13 Sep 2013 at 6:40
Original issue reported on code.google.com by
rsanger...@gmail.com
on 27 Aug 2013 at 11:18
Attachments: