alericardi / arduino

Automatically exported from code.google.com/p/arduino
0 stars 0 forks source link

Ethernet/UDP libraries loses packets if there are multiple packets in w5100 buffer #669

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

1. Send multiple UDP packets to Arduino board.

2. Call Udp->readPacket, e.g.

const int max_msg = 20;
byte msg[max_msg];

Udp.readPacket(&msg, max_msg, remoteIp, &remotePort);

If the sum of the UDP packets in the W5100 receive buffer are greater than 
max_msg then the UDP library interprets this as being a single packet which is 
too long for your buffer. The UDP library then uses a work-around which reads 
in the packet up to max_msg discarding any more data in the w5100 receive 
buffer.

The work-around in the UDP library exists to work-around a bug in the 
socket.cpp library which doesn't check the length of your buffer before writing 
data to it. 

What version of the Arduino software are you using? On what operating
system?  Which Arduino board are you using?

I'm using 0022 on Ubuntu Linux 10.04. I'm using a Freetronics EtherTen (Uno + 
Ethernet Shield compatible).

Please provide any additional information below.

I've fixed the buffer-over flow bug in socket.cpp which means the work-around 
in the UDP library can be removed. The upshot is considerably simpler and 
doesn't lose packets :)

Fixed versions of Udp.cpp and socket.cpp attached.

While trying to figure out how to submit bugs I see IDE version 1.0rc1 is out. 
The Ethernet/UDP code has the same bug in socket.cpp but it looks like the new 
UDP library doesn't use the faulty routine. I think the same problem exists in 
the new library but I'm not 100% sure at this stage. I'll update this bug when 
I've got my head around the new UDP library and try and supply a fix if 
necessary.

Thanks,

Dylan Hall

Original issue reported on code.google.com by dy...@deedums.com on 5 Oct 2011 at 9:14

Attachments:

GoogleCodeExporter commented 9 years ago
I've confirmed the problem also exists in the 1.0 release candidate, in fact, 
it's worse. If multiple packets are waiting in the receive buffer the code can 
merge multiple packets into a single read and lose track of the UDP headers 
separating them resulting in corrupted data.

Attached are fixed versions of EthernetUDP.h and EthernetUDP.cpp.

I've tested these with my application and they work fine :)

The bug in socket.cpp/recvfrom still exists (over flowing it's buffers) so 
should probably be fixed but doesn't directly affect the new UDP code.

Original comment by dy...@deedums.com on 5 Oct 2011 at 11:14

Attachments:

GoogleCodeExporter commented 9 years ago
Attached to this comment is an updated socket.cpp for Arduino 1.0rc-1 that 
addresses the buffer overflow issues in recvfrom.

The fix from 0022 for UDP is included here although the new UDP library no 
longer uses recvfrom. I've tested this fix in 0022.

I've applied the same fix to IPRAW and MACRAW modes although I have no means of 
testing them at this stage.

Original comment by dy...@deedums.com on 6 Oct 2011 at 8:21

Attachments:

GoogleCodeExporter commented 9 years ago
Adrian, can you take a look at this one?

Original comment by dmel...@gmail.com on 10 Oct 2011 at 3:53

GoogleCodeExporter commented 9 years ago
I can confirm command 1, where packets become horribly corrupted if multiple 
UDP packets are send. I had to write my software to throttle them to be able to 
properly receive

Original comment by robindegen on 20 Oct 2011 at 8:13

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
Just updated my whole project to 1.0rc and applied the above patch on 
ethernetudp and it works great now. Thanks all!

Original comment by robindegen on 21 Oct 2011 at 7:47

GoogleCodeExporter commented 9 years ago
Issue #734 is somewhat related to the issue of this as well even though it 
concerns DHCP. Just linking it here in case of the information perhaps beeing 
useful.

Original comment by pe...@birchroad.net on 2 Dec 2011 at 3:32

GoogleCodeExporter commented 9 years ago
A slightly updated version of EthernetUdp.cpp which I hope will address the 
DHCP issue above (issue 734).

I've changed EthernetUdp.available() to return the number of bytes available in 
the current packet rather than the number available in the W5100 buffer. This 
means that parsePacket() has to be called before relying on available(). 

Original comment by dy...@deedums.com on 3 Dec 2011 at 7:40

Attachments:

GoogleCodeExporter commented 9 years ago
It works now with the latest changes to .available().
Question though, will there ever be any need to have access to buffer 
operations as well.
EthernetUDP::flush is the first one that comes to mind.

Original comment by pe...@birchroad.net on 5 Dec 2011 at 7:42

GoogleCodeExporter commented 9 years ago
I can't think of any reason you'd need access to the underlying buffer. In 
fact, if we assume that at some point in the future the library will support 
multiple bits of Ethernet hardware then a good abstraction will hide all of the 
hardware specifics from the user.

If you wanted the old behavior of flush, e.g. ditch everything in the buffer 
rather than just the current packet then the following should do the job:

while (udp.parsePacket()) {
  udp.flush();
}

I suspect this is a discussion that should include some Arduino people with a 
bit more experience than me. At the time I submitted this bug I'd only been 
playing with Arduino's for a couple of weeks :)

Original comment by dy...@deedums.com on 5 Dec 2011 at 10:47

GoogleCodeExporter commented 9 years ago
Here is the fixes from the previous comments as a patch against the latest code 
in git.
Also available as a pull request no 53 in github.

Original comment by pe...@birchroad.net on 7 Dec 2011 at 9:08

Attachments:

GoogleCodeExporter commented 9 years ago
I haven't had chance to test the submitted patch, but from a read through the 
pull request in github it looks good to me.  I'll add another comment when I've 
had chance to test it (assuming it doesn't get added in the meantime :-) but 
it'll probably be a week or so before I'll have time.

Original comment by adrian.m...@gmail.com on 9 Dec 2011 at 1:07

GoogleCodeExporter commented 9 years ago

Original comment by dmel...@gmail.com on 16 Dec 2011 at 10:08

GoogleCodeExporter commented 9 years ago
Thank you to Peter!! The patch "fix_issue_669.patch" works perfectly. No more 
reading between/in-the-middle-of packets! Thanks to everyone.

Original comment by mavaddat...@gmail.com on 25 Jan 2012 at 9:42

GoogleCodeExporter commented 9 years ago
Thanks so much, I knew something fishy was going on when I kept getting corrupt 
data packets. Your fix makes everything work 100% as expected, it should 
definitely be in the retail release. Thanks again!

Original comment by mjoll...@gmail.com on 29 Jan 2012 at 1:30

GoogleCodeExporter commented 9 years ago
I can't really take credit for the actual code in the fix. I just made it in a 
manageable format.

But please, someone with access, commit it or tell what to do to get it 
committed.

Original comment by pe...@birchroad.net on 29 Jan 2012 at 6:33

GoogleCodeExporter commented 9 years ago

Original comment by dmel...@gmail.com on 15 Feb 2012 at 12:18

GoogleCodeExporter commented 9 years ago
Does someone have a good / simple way to test the fix?  

Also, it looks like the recvfrom() method in socket.cpp isn't used anywhere.  
Can anyone explain why we need to make changes to an unused function?  (Or am I 
missing a place where it's called?)

Original comment by dmel...@gmail.com on 15 Feb 2012 at 12:36

GoogleCodeExporter commented 9 years ago
Okay, I applied the changes to EthernetUDP.cpp and .h.  Can anyone verify that 
this fixes the problem?

https://github.com/arduino/Arduino/commit/561cd7054d6211cad71d8adbd9b1af3f087cb8
c6

Original comment by dmel...@gmail.com on 15 Feb 2012 at 12:43

GoogleCodeExporter commented 9 years ago
My original fix was for 0.22 which used the recvfrom function. The newer UDP 
code in 1.0 doesn't use recvfrom but I included the fix anyway. I figured if 
the function still existed it was best to fix it. Also the recvfrom function 
handles IPRAW and MACRAW modes which at the time I wrote the patch I was 
intending to have a play with.

I'll try and verify the fix tomorrow night. Do you want me to just verify it or 
post my test code here so others can double check it?

Thanks :) 

Original comment by dy...@deedums.com on 15 Feb 2012 at 12:59

GoogleCodeExporter commented 9 years ago
I've tested with the commit above and it's working fine :)

Attached is a perl script that will send 10 UDP packets each with a sequence 
number and some padding.

The attached sketch reads all the incoming packets and prints the sequence 
number.

A good run should look like:

Starting Up
PacketSize: 13 Sequence Number is: 0
PacketSize: 13 Sequence Number is: 1
PacketSize: 13 Sequence Number is: 2
PacketSize: 13 Sequence Number is: 3
PacketSize: 13 Sequence Number is: 4
PacketSize: 13 Sequence Number is: 5
PacketSize: 13 Sequence Number is: 6
PacketSize: 13 Sequence Number is: 7
PacketSize: 13 Sequence Number is: 8
PacketSize: 13 Sequence Number is: 9

A run with the unpatched UDP library looks like:

Starting Up
PacketSize: 202 Sequence Number is: 0
PacketSize: 193 Sequence Number is: 100
PacketSize: 184 Sequence Number is: 36
PacketSize: 175 Sequence Number is: 112
PacketSize: 166 Sequence Number is: 1
PacketSize: 157 Sequence Number is: 109
PacketSize: 148 Sequence Number is: 103
PacketSize: 139 Sequence Number is: 3
PacketSize: 130 Sequence Number is: 100
PacketSize: 121 Sequence Number is: 36
PacketSize: 112 Sequence Number is: 112
PacketSize: 103 Sequence Number is: 1
PacketSize: 94 Sequence Number is: 109
PacketSize: 85 Sequence Number is: 103
PacketSize: 76 Sequence Number is: 6
PacketSize: 67 Sequence Number is: 100
PacketSize: 58 Sequence Number is: 36
PacketSize: 49 Sequence Number is: 112
PacketSize: 40 Sequence Number is: 1
PacketSize: 31 Sequence Number is: 109
PacketSize: 22 Sequence Number is: 103
PacketSize: 13 Sequence Number is: 9
PacketSize: 4 Sequence Number is: 100

As you can see it's gotten confused about the number of packets, the length of 
the packets, and the sequence numbers are corrupted :(

Original comment by dy...@deedums.com on 16 Feb 2012 at 8:23

Attachments:

GoogleCodeExporter commented 9 years ago
Any idea why this would cause the NTP update example from the Arduino 
playground to fail?  
http://www.arduino.cc/playground/Main/DS1307OfTheLogshieldByMeansOfNTP

Original comment by dan.eb...@gmail.com on 22 Feb 2012 at 10:10

GoogleCodeExporter commented 9 years ago
I can see a possible problem, when using IDE 1.0 there isn't a call to 
Udp.parsePacket() before you call Udp.available() or Udp.read().

Without the Udp patch applied to IDE 1.0 those calls will have worked but 
Udp.read will have returned 8 bytes of Udp header at the beginning of your Ntp 
response so I would have thought it would have broken in interesting ways.

With the Udp patch applied to IDE 1.0 the call to Udp.available() will return 0 
until a successful call to Udp.parsePacket().

In practice you can replace your call to Udp.available() as follows:

if ( Udp.parsePacket() ) {              // New from IDE 1.0
    // read the packet into the buffer
    Udp.read(pb, NTP_PACKET_SIZE);      // New from IDE 1.0,

Hope that helps :)

Original comment by dy...@deedums.com on 23 Feb 2012 at 1:19

GoogleCodeExporter commented 9 years ago
OK, thank you. So the sketch would look like this? 
N.B. This is an excerpt:-

    sendNTPpacket(timeServer);
    delay(1000);
    if ( Udp.parsePacket() ) 
     {              
      Udp.read(pb, NTP_PACKET_SIZE);      
      unsigned long t1, t2, t3, t4;
      t1 = t2 = t3 = t4 = 0;
      for (int i=0; i< 4; i++)
      {
        t1 = t1 << 8 | pb[16+i];      
        t2 = t2 << 8 | pb[24+i];      
        t3 = t3 << 8 | pb[32+i];      
        t4 = t4 << 8 | pb[40+i];
      }

Original comment by dan.eb...@gmail.com on 23 Feb 2012 at 12:31

GoogleCodeExporter commented 9 years ago
That looks good to me, does it work?

Original comment by dy...@deedums.com on 24 Feb 2012 at 12:08

GoogleCodeExporter commented 9 years ago
It sounds like the code is working now, so I'm marking this as fixed.  If 
needed, we can apply the recvfrom() changes too.

Original comment by dmel...@gmail.com on 2 Mar 2012 at 11:08

GoogleCodeExporter commented 9 years ago
Hi all,

I unwrapped and fired up my first Arduino with Ethernet Shield yesterday. I was 
interested in reading some environmental parameters using SNMP.

I very quickly came across issues with UDP requests getting dropped or 
corrupted and started a little digging... and came across this issue reported 
by Dylan. Skipping to the end, I had understood that the problem was solved and 
pulled the latest code (I had initially downloaded 1.0.0). 

However, the problem for me was not resolved - especially when running the pear 
test code repeatedly. I checked the W5100 data sheet and came to the conclusion 
that the "HACK" for retrieving data from the read register was not in line with 
the algorithm proposed by the chip manufacturer.

In particular, maintaining a 'remaining' counter that rapidly looses track of 
the actually contents of the receive buffer is bad practice. I will explain:

1. The UDP packet arrives
2. The ParsePacket() method calls recv() to get the UDP header. However recv() 
updates the W5100 receive register pointer through recv_data_processing().
3. All data then read iteratively through the two read methods which use the 
W5100 receive register pointer to locate the start of the data.

However, if a new packet is received between steps 2 and 3, everything gets 
corrupted: The current packet being read, the 'remaining' counter and the new 
packet that has arrived.

Why isn't the socket method recvfrom() being used as this is specifically for 
UDP data and is inline with the W5100 data sheet? I wrote a little more code 
that overrides the ParsePacket() method to take a pointer to a buffer and the 
max length of the application buffer as parameters. This method called the 
recvfrom() method and seemed more reliable but there were still random errors 
which I traced down to the fact that the socket methods do not do any bounds 
checking when copying data to the application buffer.

After finding this problem and correcting it, I realise that this was actually 
reported and solved by Dylan in the initial post - but for some reason it seems 
to have been dropped in the final fix?

In any case, I have attached my fix, plus diff for completeness. I have not 
changed the other methods to ensure compatibility with other applications, but 
I personally think that the UDPEthernet class needs to have the ParsePacket() 
method plus the read() methods deprecated - but perhaps there are some 
scenarios that I have not considered as I am new to this environment.

Original comment by graeme.r...@gmail.com on 31 Dec 2012 at 9:40

Attachments:

GoogleCodeExporter commented 9 years ago
Hello all,

Just to follow up on my earlier post - to correct the wrongs of my ways and not 
leave something incomplete for someone else to trip over.

I take back what I said about the read, available and other methods after a few 
weeks of Arduino programming under the belt, I understand that the reasoning 
was to make it more a "stream like" library.

Having said that, there is still a possibility that the recvfrom method in the 
socket library can cause problems because it does not take into account the 
size of the memory buffer that it was passed but rather it uses the size of the 
UDP buffer. This needs to be fixed.

In addition this method is perfect for handling UDP frames, so it should be 
used instead of a "hack" in the UDP library. I have updated my parsePacket 
method to call the recvfrom method - however seeing as this method does not 
seem to be used in any other code, I guess the potentially problem has gone 
unnoticed.

Additionally as good practice, I would recommend sending the RECV command every 
time the SnRX_RD register is updated. Any subsequent reads to the register 
before the RECV command return 0 otherwise.

Otherwise - just a word of thanks to all that have developed this - I have had 
some great fun indoors with the snow falling outside playing with my new 
Arduino! 

Original comment by graeme.r...@gmail.com on 22 Jan 2013 at 7:19

Attachments:

GoogleCodeExporter commented 9 years ago
Do we know if and when Graeme.r's latest offerings are likely to be built into 
the libraries? or does the fixed status mean that they won't get included!!

I'm still getting corrupted/missed NTP packets especially with big sketches. At 
first I thought it was a SRAM variable vs stack issue but now I'm not so sure.

The corruptions were usually consistant but different for each thing I tried to 
reduce the requirement for SRAM.

Original comment by brucedur...@btinternet.com on 10 Mar 2013 at 5:50

GoogleCodeExporter commented 9 years ago
Bruce, the fix proposed by Graeme does fixes your issues? Knowing this may 
accelerate the review & integration of the fix.

C

Original comment by c.mag...@arduino.cc on 10 Mar 2013 at 6:12

GoogleCodeExporter commented 9 years ago
Hi, c.mag..
The short answer is I dont know if they fix the problem just yet. In the best 
traditions I'm changing just one thing at a time. And I'm only just getting to 
Graemes patches. I'm using 1.0.1 at the moment on a uno

The sketch usually runs for a day or so before it falls over so it takes a 
while. I'm counting one second interupts (systime) and then using 
if(systime%86400) to detect midnight and get a UDP time update Some times it 
falls over at that point ans some times it doesn't. If I change the divisor to 
say 1800 or 30minute intervals it might call UPD 5 or 5 times before it crashes.

PS may be I'm being a bit thick but how do I splice the patches into the 
library? I take it that his *.cpp and *.h files are a straight replacements but 
are the Jan 22nd offerings yet more changes? 

Original comment by brucedur...@btinternet.com on 13 Mar 2013 at 3:44

GoogleCodeExporter commented 9 years ago
Hi Graeme.r I've got to the point in my project where I need to try your 
patches to EthernetUdp W5100 and Socket.ccp. So fundamental question are your 
22nd Jan patches for the IDE code or are they a modification of the code you 
gave on 31st Dec? 

Also I'm making a hash of splicing the patches in by hand is there an automated 
method of merging the files?

Original comment by brucedur...@btinternet.com on 13 Apr 2013 at 5:58

GoogleCodeExporter commented 9 years ago
This definitely is still a problem in the latest release of the SDK. If I send 
several small UDP packets in quick succession (16 - 32 bytes) parsepacket will 
suddenly report a gigantic packet of 25952 bytes (not sure why it's always this 
number) and the arduino will immediately crash. I'm using faders and buttons in 
TouchOSC to generate the UDP packets that crash the arduino. Graeme.r's fix 
does seem to solve the problem. This should really be addressed in the actual 
SDK... I spent many many hours trying to figure out why it was crashing.

Original comment by mike.ila...@gmail.com on 24 Apr 2013 at 4:43

GoogleCodeExporter commented 9 years ago
brucedur, I think the patches are for the original code, but I'm not sure... I 
ended up manually applying them to the version he posted on the 31st... but 
there were definitely discrepancies. The patches are important because without 
them, i was running into cases where the buffer size was reporting strange 
numbers and not properly emptying.

Original comment by mike.ila...@gmail.com on 24 Apr 2013 at 6:22

GoogleCodeExporter commented 9 years ago
Please can someone help me, i copied the code same as on the arduino site
http://playground.arduino.cc/Main/DS1307OfTheLogshieldByMeansOfNTP
but keep getting message udp not available. Why is that how to fix it, i have 
working internet connection

Original comment by nihadkav...@gmail.com on 14 Sep 2013 at 11:18

GoogleCodeExporter commented 9 years ago
It seems there has been no activity here for over a year.

I wish to thank Greame for his contribution. I've been try to figure out a UDP 
crashing issue for over a week and came across this info. I'm using Arduino IDE 
1.0.6 and it appears to still have problems in the libraries with multiple UDP 
devices causing ethernet lock-ups. I applied the patches from Graeme's March 
10, 2013 post to the 1.0.6 libraries and it's solved my problems.

I'm using a Freetronics Ethermega as a server for a home automation system. 
There are 6 Etherten clients set up as touchscreen controllers all using UDP to 
report back to the server (using 4D Systems LCD's). The server responds to 
requests for data from the clients and sometimes initiates a broadcast if all 
clients need to change state as part of the control state machines. Initially 
everything was fine and stable but as the project grew stability got worse. 
There were other problems with some of the Arduino libraries, mainly to do with 
blocking code and peripherals not responding (TWI & analogueRead). Putting a 
while loop with no timeout in a library is a recipe for disaster... 

Once the project progressed to where 6 Etherten boards talking to the Ethermega 
the system would lock-up anywhere between a few seconds after cold-boot to 24 
hours later but usually within minutes. After applying the patches I haven't 
had a lock-up in over 2 days. 

There is a watchdog also implemented as a workaround to the lock-ups prior to 
applying Graeme's patches but so far it hasn't triggered.

Graeme's analysis of the W5100 datasheets has led to relatively simple mods to 
the library that seem to make it work very reliably when there is a lot of UDP 
traffic. It also appears there has been no change to the 1.0.6 libraries since 
2013 as they still seem to match the files listed in Graeme's earlier post...

Again, thanks Graeme, you've saved me many hours trying to debug the Ethernet 
UDP!!!

Original comment by war...@kandsaudio.com on 13 Nov 2014 at 7:00

GoogleCodeExporter commented 9 years ago
This should go into the github issues as well somehow.
The issue is closed there when it perhaps should not.

https://github.com/arduino/Arduino/issues/669

Original comment by peter.ma...@mustad.se on 16 Dec 2014 at 9:18

GoogleCodeExporter commented 9 years ago
i also stumbled over this bug
had lost UDP packets 
"solved" it with bigger buffer 

has this bug realy not been fixed yet?

Original comment by lechnerr...@gmail.com on 5 Mar 2015 at 8:34