ezieragabriel / arduino

Automatically exported from code.google.com/p/arduino

Ethernet library can hang during data transmission (in socket.cpp: send() ) #1049

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
The problem is difficult to catch because it depends on the exact timing of 
the shield sending data while data is also arriving; it appears to be a 
coincidence of the two.  I can trigger it by continually sending data to the 
Arduino at the fastest possible rate on a TCP socket while the Arduino program 
sends back replies using a sliding window.  Even then, the problem takes 
several hours to appear, but when it does it is catastrophic.

What version of the Arduino software are you using? On what operating
system?  Which Arduino board are you using?
Arduino 1.01 with a Mega 2560 and ethernet shield (W5100).

Please provide any additional information below.
I've traced the problem to socket.cpp:send() by adding debug output to the 
serial port that fires if send() gets stuck in one of its loops.  Here's the 
instrumented code:

uint16_t send(SOCKET s, const uint8_t * buf, uint16_t len)
{
  uint8_t status=0;
  uint16_t ret=0;
  uint16_t freesize=0;

  if (len > W5100.SSIZE) 
    ret = W5100.SSIZE; // check size not to exceed MAX size.
  else 
    ret = len;

  Serial.print('$');

  // if freebuf is available, start.
  int loopcnt=0;
  do 
  {
    freesize = W5100.getTXFreeSize(s);
    status = W5100.readSnSR(s);
    if ((status != SnSR::ESTABLISHED) && (status != SnSR::CLOSE_WAIT))
    {
      ret = 0; 
      break;
    }
    loopcnt++;
    if (loopcnt==10)
      Serial.println("send.1");
  } 
  while (freesize < ret);

  // copy data
  W5100.send_data_processing(s, (uint8_t *)buf, ret);
  W5100.execCmdSn(s, Sock_SEND);

  /* +2008.01 bj */
  loopcnt=0;
  while ( (W5100.readSnIR(s) & SnIR::SEND_OK) != SnIR::SEND_OK ) 
  {
    /* m2008.01 [bj] : reduce code */
    if ( W5100.readSnSR(s) == SnSR::CLOSED )
    {
      close(s);
      return 0;
    }
    loopcnt++;
    if (loopcnt==1000) {
      Serial.print("send.2, SnSR=0x");
      Serial.print(W5100.readSnSR(s),HEX);
      Serial.print(", SnIR=0x");
      Serial.print(W5100.readSnIR(s),HEX);
      Serial.print(", TX_WR=0x");
      Serial.print(W5100.readSnTX_WR(s),HEX);
      Serial.print(", TX_RD=0x");
      Serial.print(W5100.readSnTX_RD(s),HEX);
      Serial.println(".");
      close(s);
      return 0;
    }
  }
  /* +2008.01 bj */
  W5100.writeSnIR(s, SnIR::SEND_OK);
  return ret;
}

The key addition is the loop counter in the second loop, which detects the 
stuck condition.  It then prints the values of several status registers, 
which results in:

send.2, SnSR=0x17, SnIR=0x5, TX_WR=0x92A5, TX_RD=0x92A5.

Thus, everything looks OK except that the SEND_OK bit (0x10) is never 
set in the SnIR register:
SnSR 0x17 = ESTABLISHED
SnIR 0x05 = RECV|CON
TX_WR == TX_RD suggests that all data has been sent (I think?)
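
To make the register decoding above easier to check, here is a minimal standalone sketch that names the bits set in an SnIR value. The mask values are the ones defined for SnIR in the Ethernet library's w5100.h; the function name decodeSnIR is hypothetical and is not part of the library.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not part of the Ethernet library): returns the names
// of the interrupt bits set in a W5100 Sn_IR value, joined with '|'.
std::string decodeSnIR(uint8_t ir) {
  struct Flag { uint8_t mask; const char* name; };
  // Mask values as defined for SnIR in w5100.h.
  static const Flag flags[] = {
    {0x10, "SEND_OK"},
    {0x08, "TIMEOUT"},
    {0x04, "RECV"},
    {0x02, "DISCON"},
    {0x01, "CON"},
  };
  std::string out;
  for (const Flag& f : flags) {
    if (ir & f.mask) {
      if (!out.empty()) out += "|";
      out += f.name;
    }
  }
  return out.empty() ? std::string("none") : out;
}
```

Calling decodeSnIR(0x05) names RECV and CON only, which matches the reading above: SEND_OK is absent from the captured value.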

This may be an instance of the issue reported in the W5100 errata (see 
http://www.techw.co.jp/Download/3150A_5100_errata1_Eng.pdf )

Original issue reported on code.google.com by townsh...@gmail.com on 25 Sep 2012 at 2:13

GoogleCodeExporter commented 9 years ago
Yes, you're right.
When the W5100 sends data while it is also receiving data, there is a timing issue.
Given the debug output you posted, I suspect the following sequence:
 1. The W5100 sent all the data successfully and set SEND_OK in its SnIR register.
 2. Data was received shortly afterward, and SnIR changed to RECV.
 3. Your driver code reached the while clause and checked SnIR only after it had changed to RECV.

You should add code that checks the values of TX_WR and TX_RD after 
the block that checks SnSR.

That may solve the problem.
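
As a rough illustration of that suggestion, the sketch below isolates the decision made on each pass of the second wait loop in send(). It is not the library's code and is untested on hardware: the SocketRegs struct, the SendState enum, and checkSendComplete are hypothetical names, and treating TX_WR == TX_RD as "transmission finished" is the assumption made in this thread rather than something confirmed here. Only the SnIR and SnSR constants are taken from w5100.h.

```cpp
#include <cstdint>

// Constants as defined in the Ethernet library's w5100.h.
namespace SnIR { const uint8_t SEND_OK = 0x10; }
namespace SnSR { const uint8_t CLOSED = 0x00; const uint8_t ESTABLISHED = 0x17; }

// Hypothetical snapshot of one socket's registers, as the loop would obtain
// them from readSnIR(), readSnSR(), readSnTX_WR() and readSnTX_RD().
struct SocketRegs {
  uint8_t  snIR;
  uint8_t  snSR;
  uint16_t txWR;
  uint16_t txRD;
};

enum class SendState { Pending, Done, Closed };

// Decide what the wait loop should do for one register snapshot.
SendState checkSendComplete(const SocketRegs& r) {
  // Normal path: the chip reports SEND_OK.
  if (r.snIR & SnIR::SEND_OK) return SendState::Done;

  // Existing check: the socket was closed while waiting.
  if (r.snSR == SnSR::CLOSED) return SendState::Closed;

  // Suggested fallback (assumption): if the TX read pointer has caught up
  // with the write pointer, treat the data as sent even without SEND_OK.
  if (r.txWR == r.txRD) return SendState::Done;

  return SendState::Pending;
}
```

With the values from the report (SnIR=0x05, SnSR=0x17, TX_WR=TX_RD=0x92A5), this check returns Done instead of looping forever. In the real loop, Done would fall through to the existing writeSnIR(s, SnIR::SEND_OK) call, and Closed would keep the existing close(s); return 0; path.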

Original comment by java...@gmail.com on 19 Feb 2015 at 9:36