Open joeman155 opened 10 years ago
hi Joe, I haven't been able to reproduce this, and I've done a lot of long full-rate transfers with zero packet loss. Do you have hardware flow control enabled? I found that to do large transfers reliably I really need flow control. Cheers, Tridge
Rewriting my comment from scratch.
I have a Beagle Bone Black with a 3DR running 1.9 connected to a serial port, talking to a MacBook Pro with a 3DR radio on a USB port.
Config:
ATI5
S0:FORMAT=25
S1:SERIAL_SPEED=115
S2:AIR_SPEED=64
S3:NETID=25
S4:TXPOWER=20
S5:ECC=1
S6:MAVLINK=1
S7:OPPRESEND=1
S8:MIN_FREQ=915000
S9:MAX_FREQ=928000
S10:NUM_CHANNELS=50
S11:DUTY_CYCLE=100
S12:LBT_RSSI=0
S13:MANCHESTER=0
S14:RTSCTS=1
S15:MAX_WINDOW=131
RTI5
S0:FORMAT=25
S1:SERIAL_SPEED=115
S2:AIR_SPEED=64
S3:NETID=25
S4:TXPOWER=20
S5:ECC=1
S6:MAVLINK=1
S7:OPPRESEND=1
S8:MIN_FREQ=915000
S9:MAX_FREQ=928000
S10:NUM_CHANNELS=50
S11:DUTY_CYCLE=100
S12:LBT_RSSI=0
S13:MANCHESTER=0
S14:RTSCTS=1
S15:MAX_WINDOW=131
If I pump characters from the Mac to the BBB as fast as I can, I get no lost data. If I do the same from the BBB to the Mac I lose a lot of data.
When I put Joe's patch on the BBB 3DR, I lost much less data.
Looking at the signals on the BBB with a logic analyzer, they are bizarre -- CTS gets set high but only for 100 ns, always just after a rising edge on RX -- clearly something is wrong with my setup. I'll investigate further.
I didn't have a pullup on CTS -- now I do, it looks sensible, although I think perhaps the BBB was seeing it correctly before, as I don't see a change in behaviour: sending from the BBB to the MBP still loses about 1/1000 bytes, with Joe's patch.
The BBB seems to send as many as 16 bytes after CTS goes high, which should be OK, looking at the threshold and the buffer size.
I can't remember if I had hardware flow control enabled or not.
I did have issues even when just doing RTI5, i.e. it would return 'bits' and 'pieces' at random, e.g. miss a whole bunch of lines, or just not get the last XX lines. I assume that means the 'control' settings at the 'far' end would have no bearing in this scenario.
Would anyone mind trying the following test (or its equivalent)? At a5740ee058a69643ab4a647921098fa639eadf47 I see intermittent data loss (runs of contiguous missing bytes) with these settings. I should think that 32 kbps would handle a full-rate 9600 bps stream with ECC turned on, even with no RTS/CTS.
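#define RADIO_BAUD 9600
#define RadioSerial Serial

// 16-byte test pattern, sent repeatedly so missing bytes are easy to spot
char buf[] = {
  0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, 0x10,
};

void setup()
{
  RadioSerial.begin(RADIO_BAUD);
}

void loop()
{
  // Write the pattern back-to-back to produce a full-rate stream
  RadioSerial.write((uint8_t *)buf, sizeof(buf));
}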
S0:FORMAT=25
S1:SERIAL_SPEED=9
S2:AIR_SPEED=32
S3:NETID=22
S4:TXPOWER=20
S5:ECC=1
S6:MAVLINK=0
S7:OPPRESEND=0
S8:MIN_FREQ=915000
S9:MAX_FREQ=928000
S10:NUM_CHANNELS=50
S11:DUTY_CYCLE=100
S12:LBT_RSSI=0
S13:MANCHESTER=0
S14:RTSCTS=0
S15:MAX_WINDOW=131
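For anyone wanting to sanity-check the bandwidth expectation in the test above, here is a rough back-of-the-envelope sketch. It is not taken from the firmware. It assumes 8N1 framing on the serial side, that ECC doubles the over-the-air size (passed in as the hypothetical `ecc_expansion` parameter), and that the sending radio gets some fraction `tx_share` of the air time. All function names are made up for illustration, and packet headers and hop overhead are ignored.

```c
/* Payload bits per second for an 8N1 serial stream: each 10-bit frame
 * (start + 8 data + stop) carries 8 payload bits. */
double payload_bps(double serial_baud)
{
    return serial_baud * 8.0 / 10.0;
}

/* Air bits per second needed to carry that payload.
 * ecc_expansion is an ASSUMPTION: 2.0 if ECC doubles the data size,
 * 1.0 with ECC off. Packet headers are ignored. */
double required_air_bps(double serial_baud, double ecc_expansion)
{
    return payload_bps(serial_baud) * ecc_expansion;
}

/* Fraction of the available air rate the stream would consume.
 * tx_share is an ASSUMPTION: the fraction of air time this radio
 * gets to transmit (1.0 = all of it, 0.5 = half). */
double air_utilisation(double serial_baud, double ecc_expansion,
                       double air_bps, double tx_share)
{
    return required_air_bps(serial_baud, ecc_expansion) / (air_bps * tx_share);
}
```

With these assumptions, `air_utilisation(9600, 2.0, 32000, 1.0)` comes out at roughly half the link, while `air_utilisation(9600, 2.0, 32000, 0.5)` is close to 1, which would leave very little margin. How the air time is actually shared is the part worth checking against the firmware.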
@joeman155, the application note at http://www.silabs.com/Support%20Documents/TechnicalDocs/AN440.pdf (page 7) confirms that with the threshold at 60, itxffafull will not go high until after crossing 60. I also wonder how much latency there is in the status flag updates.
@tgdavies, when you ran your BBB test, it looks like you had both ends set to the same serial baud rate? If so, running a full-rate stream could cause buffer overflows without RTS/CTS, since one serial clock will be running slightly faster than the other. Even so, I agree that in theory there should be ways to set up the radios where a full-rate stream ought to work without RTS/CTS.
@ldslaron 1 character in 1000 (which is what I was losing) does seem consistent with a small clock mismatch -- and would explain why I only saw the loss in one direction. I believe that I had hardware flow control working on both ends (once I had configured my pullup correctly on the BBB) so I would expect not to lose characters even with the mismatch.
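To put numbers on the clock-mismatch explanation discussed above, here is a minimal sketch of the arithmetic. It is a simplified model, not firmware code: bytes arrive at one rate, drain at a slightly lower rate, and there is no flow control. The function names and the example rates are hypothetical.

```c
/* Fraction of bytes that must eventually be dropped when data arrives at
 * in_rate and can only be forwarded at out_rate, with no flow control.
 * Rates can be in any unit as long as both use the same one. */
double steady_state_loss(double in_rate, double out_rate)
{
    if (in_rate <= 0.0 || out_rate >= in_rate)
        return 0.0;                     /* the drain keeps up: no loss */
    return (in_rate - out_rate) / in_rate;
}

/* Seconds until a buffer with free_bytes of space overflows.
 * Returns -1.0 if the buffer never overflows. */
double seconds_to_overflow(double in_bytes_per_s, double out_bytes_per_s,
                           double free_bytes)
{
    if (in_bytes_per_s <= out_bytes_per_s)
        return -1.0;
    return free_bytes / (in_bytes_per_s - out_bytes_per_s);
}
```

In this model a 0.1% rate difference, e.g. `steady_state_loss(1000.0, 999.0)`, gives a loss of about 1 byte in 1000. A buffer only delays the first loss and does not prevent it, which is why working flow control should make the mismatch harmless.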
@tgdavies, yes, I feel like there may still be a problem in the codebase that causes intermittent data loss. I've been experimenting with the use of a ping pong buffer in the RX interrupt handler so the TDM code doesn't have to service the buffer before the next packet starts coming in, but I'm not sure if that is the problem.
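For readers unfamiliar with the term, this is roughly what a ping-pong (double) buffer looks like. It is a generic sketch, not the code from my experiment: the struct, the function names and `PP_BUF_SIZE` are all hypothetical, and a real interrupt handler would also need `volatile` access and a check that the main loop has finished with the other buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define PP_BUF_SIZE 64          /* hypothetical per-packet buffer size */

struct pingpong {
    uint8_t buf[2][PP_BUF_SIZE];
    size_t  len[2];
    int     fill;               /* index of the buffer currently being filled */
};

void pp_init(struct pingpong *pp)
{
    pp->len[0] = 0;
    pp->len[1] = 0;
    pp->fill = 0;
}

/* Interrupt side: append one received byte to the buffer being filled.
 * Returns 1 on success, 0 if that buffer is already full. */
int pp_put(struct pingpong *pp, uint8_t b)
{
    if (pp->len[pp->fill] >= PP_BUF_SIZE)
        return 0;
    pp->buf[pp->fill][pp->len[pp->fill]++] = b;
    return 1;
}

/* Interrupt side, at end of packet: hand the filled buffer to the main
 * loop and start filling the other one. Returns the index of the
 * completed buffer, whose contents stay intact until the next swap. */
int pp_swap(struct pingpong *pp)
{
    int done = pp->fill;
    pp->fill ^= 1;
    pp->len[pp->fill] = 0;
    return done;
}
```

The point of the design is that the main loop can take its time processing `buf[done]` while the next packet lands in the other buffer, instead of having to drain a single buffer before the next packet arrives.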
@tgdavies - I created a possible fix and entered a pull request; you might see if it solves your problem. I sent ~120 MB of data without any apparent loss using acc739d721c10513659686374dc1b92932442724. The code could use more testing, but hopefully it is an improvement.
@ldslaron Thanks -- I will try that when I give this project another timeslice and let you know how it goes.
Hello, Ignore the commit to serial.c. The pull request is being sent because of changes to radio.c
I found that I got TX errors without the suggested changes. (I was sending large packets of data: 132-byte XMODEM packets.)
I noticed comments about how the FIFO was 'sensitive' to how fast data was put in, which is why I tried making changes here. I thought we might still be pushing it a bit too hard.
I thought I'd try to guess what was going on. Perhaps the issue is related to the fact that it jumps from 60 to 64 and somehow can't deal with numbers that require the 7th bit. So rather than trying to reduce the almost-full threshold, I thought I'd increment by one, so it gets to 61 (and never reaches 64) and THEN the radio starts transmitting.
NOTE: The docs at http://www.silabs.com/Support%20Documents/TechnicalDocs/Si1000.pdf (page 264) say:
"When the data being filled into the TX FIFO crosses this threshold limit, an interrupt to the microcontroller is generated so the chip can enter TX mode to transmit the contents of the TX FIFO."
Note the word 'crosses', i.e. being at 60 does not trigger it. So if we pile in enough data quickly, we get it up to 64.
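Here is a small model of that reading of the datasheet, in case it helps anyone reason about it. It is only a sketch of my guess, not the radio's actual behaviour: it assumes a 64-byte FIFO, an almost-full flag that is set only once the fill level exceeds the threshold, and a writer that adds bytes in fixed-size bursts and checks the flag after each burst. All names are hypothetical.

```c
#define TX_FIFO_SIZE 64     /* ASSUMPTION: FIFO depth, matching the "64" above */

/* Flag modelled on the word 'crosses': set only once the fill level
 * exceeds the threshold, not when it merely equals it. */
int almost_full(int count, int threshold)
{
    return count > threshold;
}

/* Fill the FIFO in bursts of `burst` bytes, checking the flag after each
 * burst. Returns the fill level at which the flag is first seen, or -1
 * if the next burst would no longer fit before the flag is seen. */
int count_when_flag_seen(int threshold, int burst)
{
    int count = 0;

    if (burst <= 0)
        return -1;
    while (count + burst <= TX_FIFO_SIZE) {
        count += burst;
        if (almost_full(count, threshold))
            return count;
    }
    return -1;
}
```

In this model, with a threshold of 60, byte-at-a-time filling sees the flag at 61, but filling 4 bytes at a time goes from 60 straight to 64 before the flag is seen, which matches the jump I described.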
Anyhow, I'll leave it at that. It 'seems' to resolve issues...but I can't explain it.