Parse DX receipt in "sweep_device_stop_scanning", instead of sending DX command twice.

dcyoung commented 7 years ago

Sending a DX command (host->sensor) to a streaming sensor stops data acquisition. However, this doesn't happen instantaneously. Once the DX command is sent out, the host will still receive a few Data Block receipts before seeing the DX receipt.

Currently the sweep_device_stop_scanning uses the following technique:

Send DX command (host -> sensor)
Wait a while (5000 us), for any remaining Data Block receipts and the DX receipt to arrive (sensor -> host).
Flush the serial port.
Send another DX command (host -> sensor)
Read the DX receipt (sensor -> host) that returns as a response to step 4

This was really a temporary workaround and should be replaced by an actual check for the DX receipt among the incoming Data Block receipts. The initial reaction might be to assume we cannot reliably check for a DX receipt among incoming Data Block receipts because the bytes of a Data Block are variable. The fear is that they could theoretically take the appearance of a DX receipt. However, we can still reliably discern the two because:

While it is true that Data Block bytes 1-6 can take any form, byte 0 (sync/error byte) is limited to very specific error codes and is therefore controllable. We can guarantee that byte 0 will never look like the first byte of a DX receipt.
The sensor guarantees the transmission of complete Data Block receipts before sending a DX receipt. That is to say that a DX receipt will never come mid-Data Block.
The sensor guarantees that no Data Block receipts are transmitted after the DX receipt is transmitted.

Obviously we can't protect against data corruption with 100% reliability, but we can certainly protect against valid Data Block receipts being misinterpreted as a DX receipt.

In sweep_device_stop_scanning, we should do something along the lines of:

Send a DX command (host -> sensor)
Look at the incoming bytes, and discern if they belong to a Data Block receipt or a DX receipt.
In the case of a Data Block receipt, handle it... (as we would during scanning, or simply trash it). In the case of a DX receipt, we proceed as if we picked up from step 5 above.

dcyoung commented 7 years ago

Looks like the current implementation (sleep) is causing an issue. Therefore, adding "bug" label for now.

The example-c++ and example-viewer are throwing an occasional (read ~10% of the time) error on program exit.

unable to receive stop scanning command response

From some early debugging, it looks like this is because the sleep duration in the sweep_device_stop_scanning method is occasionally too short. Increasing the duration gets rid of the errors, but is NOT the proper fix. More reason to implement a proper parse in sweep_device_stop_scanning.

Note: This parsing requirement may have implications for how we design the buffer/queue for the worker thread. See #31

daniel-j-h commented 7 years ago

Care to specify how to check for values in byte 0? From what I can tell looking at

it's not yet clear to me what "specific specific error codes" look like. Especially the "reserved for future use" part is a bit hand-wavy for me.

I'm with you that what we're doing is a bit ugly waiting for some time and then trying again. But if we can simply wait 0.005s (or 2x or 10x as long - can we calculate this?) and it will just work, I'm also fine with increasing the time if it makes it easier for us.

dcyoung commented 7 years ago

Sure thing. We use the 1st transmitted BYTE in a data block to encode both the sync value and a potential error code. Since the sync value is binary, we only use the least significant BIT in the byte to encode the sync. We call this the sync bit. The remaining 7 BITS (e0:e6) can then be used to transmit an error code.

Therefore, we can parse the sync value by looking at just the LSB, and parse the error code by considering the other 7 bits as a 7 bit integer. 7 bits leaves us with 2^7 = 128 possible error codes we could send.

Currently, the device firmware only supports one kind of error (a communication error with the LiDAR module). Therefore the only error message that we currently need to parse out of the 7 higher order bits (e0:e6) is the case where e0 has a binary value of 1, indicating a communication error.

example:

    // Consider the DATA_BLOCK contents...
    // check that the error bits (e0-6) are all 0 (ie: DATA_BLOCK contains NO ERRORS)
    if ( !(responses[received].sync_error >> 1) ) {

      // Check if DATA_BLOCK is a sync reading (ie: first reading of a new 360deg scan)
      // & if we've already been gathering DATA_BLOCKs
      if ( responses[received].sync_error == 1 && received > 0) {
        // Note the end of the scan
        last = received;
        //FIXME: add check that "last" does not exceed SWEEP_MAX_SAMPLES

        // BREAK to return the scan (NOT including this DATA_BLOCK)
        break;
      }
    }
    else {
      // some error
      //FIXME: Handle each type of error code and shutdown gracefully if need be. 
      switch(responses[received].sync_error >> 1 ) {
          case ...
       }
    }

What this issue proposes, is that we never send an error code that could produce the ASCII value 'D'. We would lose 1 of our 2^7=127 possible error codes, but I think that 127-1=126 is more than enough error codes for our needs... seeing as we are only using 1 currently.

This guarantee that we'll never receive a Data Block that starts with an ASCII "D", combined with some other guarantees I outlined in the original post, is enough criteria to reliably parse the incoming data blocks for a 'DX' receipt.

daniel-j-h commented 7 years ago

Update: we implemented the error code check in https://github.com/scanse/sweep-sdk/pull/45.

We're now sleeping for 20ms and at least with my devices it seems like this is good enough.

What is your take here, should we go the route of parsing the packets and handling it ourselves? I'm more than happy with the simply workaround (as long as it works reliably).

scanse / sweep-sdk

Parse DX receipt in "sweep_device_stop_scanning", instead of sending DX command twice. #27