ciaa / firmware_v1

Firmware de la CIAA
http://www.proyecto-ciaa.com.ar
126 stars 120 forks source link

Posix ciaaSerialDevices NXP UART write output not evaluated #453

Closed mabeett closed 7 years ago

mabeett commented 7 years ago

In the current master (8a8e48f55cc4ffb8f6cb19e1688fe05496208ff5)

in extern void ciaaSerialDevices_txConfirmation()

      write = serialDevice->device->write(device->loLayer, ciaaLibs_circBufReadPos(cbuf), rawCount);

      /* update buffer */
      ciaaLibs_circBufUpdateHead(cbuf, write);

      /* if all bytes were written and more data is available */
if ( (write == rawCount) && (count > rawCount ) )

The function does not evaluate the condition 0 < write < rawCount and extern ssize_t ciaaDriverUart_write() may return a value smaller than rawCount when Chip_UART_ReadLineStatus((LPC_USART_T *)device->loLayer) & UART_LSR_THRE:

while((Chip_UART_ReadLineStatus((LPC_USART_T *)device->loLayer) & UART_LSR_THRE) && (ret < size))
      {
         /* send first byte */
         Chip_UART_SendByte((LPC_USART_T *)device->loLayer, buffer[ret]);
         /* bytes written */
         ret++;
}
mabeett commented 7 years ago

This problem would solved in one of this options a) changing the behaibor of extern void ciaaSerialDevices_txConfirmation() b) changing ciaaSerialDevices.c

diff --git a/modules/posix/src/ciaaSerialDevices.c b/modules/posix/src/ciaaSerialDevices.c
index 0a09659..ee4fcb1 100644
--- a/modules/posix/src/ciaaSerialDevices.c
+++ b/modules/posix/src/ciaaSerialDevices.c
@@ -377,6 +377,25 @@ extern ssize_t ciaaSerialDevices_write(ciaaDevices_deviceType const * const devi
    return total;
 }

+
+/* la l'ogica de ciaaSerialDevices_txConfirmation espera que se envie totalmente los datos */
+inline ssize_t full_write(ciaaDevices_deviceType const * const device, uint8_t const * const buffer, size_t const size)
+{
+
+   ciaaSerialDevices_deviceType * serialDevice =
+      (ciaaSerialDevices_deviceType*) device->layer;
+   ssize_t ret = 0;
+   ssize_t write = 0;
+
+   while(ret < size)
+   {
+      write = serialDevice->device->write(device->loLayer, &buffer[ret], size-ret);
+      ret += write;
+   }
+   return ret;
+}
+
+
 extern void ciaaSerialDevices_txConfirmation(ciaaDevices_deviceType const * const device, uint32_t const nbyte)
 {
    /* get serial device */
@@ -393,7 +412,7 @@ extern void ciaaSerialDevices_txConfirmation(ciaaDevices_deviceType const * cons
    if (count > 0)
    {
       /* write data to the driver */
-      write = serialDevice->device->write(device->loLayer, ciaaLibs_circBufReadPos(cbuf), rawCount);
+      write = full_write(device, ciaaLibs_circBufReadPos(cbuf), rawCount);

       /* update buffer */
       ciaaLibs_circBufUpdateHead(cbuf, write);
@@ -405,7 +424,7 @@ extern void ciaaSerialDevices_txConfirmation(ciaaDevices_deviceType const * cons
          rawCount = ciaaLibs_circBufRawCount(cbuf, tail);

          /* write more bytes */
-         write = serialDevice->device->write(device->loLayer,
+         write = full_write(device,
                ciaaLibs_circBufReadPos(cbuf), rawCount);

          if (write > 0)

c) Changing ciaaDriverUart.c:

diff --git a/modules/drivers/cortexM4/lpc43xx/lpc4337/src/ciaaDriverUart.c b/modules/drivers/cortexM4/lpc43xx/lpc4337/src/ciaaDriverUart.c
index e12fcb6..685eac6 100644
--- a/modules/drivers/cortexM4/lpc43xx/lpc4337/src/ciaaDriverUart.c
+++ b/modules/drivers/cortexM4/lpc43xx/lpc4337/src/ciaaDriverUart.c
@@ -312,6 +312,8 @@ extern ssize_t ciaaDriverUart_write(ciaaDevices_deviceType const * const device,
       (device == ciaaDriverUartConst.devices[1]) ||
       (device == ciaaDriverUartConst.devices[2]) )
    {
+// #define ORIG
+#ifdef ORIG
       while((Chip_UART_ReadLineStatus((LPC_USART_T *)device->loLayer) & UART_LSR_THRE) && (ret < size))
       {
          /* send first byte */
@@ -319,6 +321,19 @@ extern ssize_t ciaaDriverUart_write(ciaaDevices_deviceType const * const device,
          /* bytes written */
          ret++;
       }
+#else
+      // blocking loop since ciaaSerialDevices_txConfirmation expects ret == size
+      while(ret < size)
+      {
+         if( Chip_UART_ReadLineStatus((LPC_USART_T *)device->loLayer) & UART_LSR_THRE )
+         {
+         /* send one byte */
+         Chip_UART_SendByte((LPC_USART_T *)device->loLayer, buffer[ret]);
+         /* bytes written */
+         ret++;
+         }
+      }
+#endif
    }
    return ret;
 }

I like b) option more than c), but maybe you consider a better solution. I am ready to your feedback. Thanks.

mabeett commented 7 years ago

I have been working in a bug related with ciaaSerialDevices_read with no evolution. I though appling a patch related with this solution could be valid. During the next days I will send a PR for this and I will report the posix read for UART bug.

glpuga commented 7 years ago

Have you confirmed that this is an issue? The way I understand the function ciaaSerialDevices_txConfirmation, the 0 < write < rawCount case is taken enough care of.

This is what I understand of the function:

rawCount = number of bytes on the cbuf circular buffer that are located from head position and onwards. That is, between head and tail, if the head buffer index < tail, or between head and the end of the cbuf memory buffer if the tail index tail < head.

count = total number of bytes on the circular buffer, regardless of the disposition of the head and tail indexes.

|------DDDDD-----|     |DDDDDD------DDDD|
       ^    ^                 ^     ^
       |    Tail              |     Head 
       Head                   Tail

If head < tail, then rawCount = count, but if tail < head (because of rollover of the circular buffer) then rawCount < count.

   /* If there are bytes to be sent on the cbuf buffer... */
   if (count > 0)                               
   {
      /* Try writing first those rawBytes that are consecutively on the memory buffer from the 
          head position and onwards. This may or may not be the same amount as count (total 
          number of bytes on the buffer). */
      write = serialDevice->device->write(device->loLayer, ciaaLibs_circBufReadPos(cbuf), rawCount);

      /* Update the buffer head pointer with the actual amount of byte written. This may be 
         anything such that write =< rawCount <= count. */
      ciaaLibs_circBufUpdateHead(cbuf, write);

      /* If write < rawCount, it means that the UART output buffer is full,
         so there is no point in doing anything more. The next write 
         interrupt will call this function again when more space is available. */

      /* if write == rawCount it means that the UART buffer
         had enough space for the first rawCount bytes, and might possibly have
         more space available. If there are more bytes on the buffer (that is, if 
         rawCount < count) then now try a second write attempt. */

      if ( (write == rawCount) && (count > rawCount ) )
      {
         /* Recalculate rawCount. */
         /* Write more bytes. */
         /* if write did not report and error, update the buffer head with the actual
            amount of bytes written. */
      }

      /* In any case, try reactivating the sleeping task that was waiting on this
         operation, if there is any, so that ciaaSerialDevices_write() gets to try again
         to transfer the user data buffer into the driver buffer into the space that
         we just managed to free by writing on the UART. */
      if ( (255 != taskID) &&
            (serialDevice->blocked.fct ==
             (void*) ciaaSerialDevices_write) )
      {
         /* invalidate task id */
         /* reset function */
         /* set task event */
#ifdef POSIXE
         SetEvent(taskID, POSIXE);
#endif
      }
   }
mabeett commented 7 years ago

@glpuga yes, I deployed the bug and this proposal solutions.

You may test with the program attached to this message in the ciaa-Firmware mailing list.

I don't remember the logic for the patch but I think that it could be related with write. write may be greather tan 0 but smaller than rawcount so ( (write == rawCount) && (count > rawCount ) ) is not true Anyway testing with the code anyone may deploy the bug and propuse alternative(s) patch(es).

glpuga commented 7 years ago

I don't remember the logic for the patch but I think that it could be related with write. write may be greather tan 0 but smaller than rawcount so ( (write == rawCount) && (count > rawCount ) ) is not true

@mabeett What I mean is that It is ok if ( (write == rawCount) && (count > rawCount ) ) is false. It just means that the UART HARDWARE FIFO got full before even finishing sending the bytes between the head of the driver circular queue and the end of the circular queue buffer.

That is what I meant when I commented the following rationale on the code snippet included in my last message:

...

/*` If write < rawCount, it means that the UART output buffer is full,
     so there is no point in doing anything more. The next write 
     interrupt will call this function again when more space is available. */

/* if write == rawCount it means that the UART buffer
   had enough space for the first rawCount bytes, and might possibly have
   more space available. If there are more bytes on the buffer (that is, if 
   rawCount < count) then now try a second write attempt. */

if ( (write == rawCount) && (count > rawCount ) )
{
   ...

Both candidate patches above seem to force the write == rawCount by putting the processor in a polling loop.

This is wrong, IMHO, because this is interrupt service routine code, and therefore this patch will block the execution of the rest of the system (both user and RTOS) while the driver waits in a loop trying to empty the content of the driver circular queue into the hardware UART FIFO. This may take up to several dozen milliseconds, depending on the output baudrate and on the size of the data chunk that the user is sending.

I don't doubt that the patch will seem to work when tested, but I'm afraid that it will hamper system performance and interrupt response latency times. The problem may be even worse on other firmware ports for processors with smaller UART FIFOs than the 8 or 16 bytes usually found on Cortex-M micros.

The original driver code wrote as much data on the hardware UART FIFO as it could, and then relied on the "output fifo empty" interrupt to call ciaaSerialDevices_txConfirmation again as soon as more space became available on the output FIFO.

I hope you don't mind that I insist on this issue. It's just that I from that point of view I don't really see a bug on the original code, and the patch seems a bit dangerous, real-time-performance-wise.

mabeett commented 7 years ago

I hope you don't mind that I insist on this issue. It's just that I from that point of view I don't really see a bug on the original code, and the patch seems a bit dangerous, real-time-performance-wise.

No, absolutly. There is no problem. I just want to solve a bug shared 6 months ago in the official Group.

At the moment two expensive-in-performace patches are the unique solution for the bug. I offered this 4 moths ago. During this period of time there have been no better patches.

IMO better than a bug is a system expensive-in-performance, and better than this two would be a non-expensive-in-performance system. At the moment the 3rd option does not exist.

Please, deploy the bug and help to solve it. During the month between the email and this issue I did try another options unsuccessfully. I cannot offer a better option.

A simple program writing infinitely to UART and blinking a led should not stop the system as the test program does.

Note: the IPv6 and lwip support merge is blocked by this issue. There is no sane way for debugging lwip output without UART debug messages.

mabeett commented 7 years ago

I created two issues #455 and #456 . Each one describes bugs ciaaPOSIX_write() and ciaaPOSIX_read() and has an attachment for deploying them.

This issue should be closed, but at the moment there is no solution for the bugs, an the paches wrote here at least allow using ciaaPOSIX_write. As @glpuga indicates are not apropiate implementations.

The root problem could be related with ciaaLibs_CircBuf functions and macros or the invocation. Unfortunatly at te moment I cannot now spend more time with this problems. IMO The discuss about the bugs and the candiate patches should be continued in #455 adn #456

mabeett commented 7 years ago

The cause of this problem is #457, as the tests indicated in the mailing list says .