ge0rg / aprsdroid

APRSdroid - Geo-Location for Radio Amateurs
https://aprsdroid.org/
GNU General Public License v2.0
502 stars 96 forks source link

NEW issue with bluetooth TNC #62

Closed samwathegreat closed 11 years ago

samwathegreat commented 11 years ago

HI ge0rg,

I've discovered a new issue that didn't seem to be present until the most recent version that you patched to fix the OBJECT beacon bug.

The new issue is very strange, and involves APRSdroid sending data other than what shows in the log. When driving around town, I would notice that when APRSdroid sent a beacon, it would move my symbol to my current position on the map. THEN, almost immediately (after the local digipeater digipeats the packet), it would move my car symbol back to the LAST location that a beacon was sent from. By the time I got back, the map looked like a giant spiderweb showing that I would bounce back and forth new and old locations.

Here is what is happening: APRSdroid LOG shows that a beacon was sent showing current location. But it's NOT sending the current location. Have a look:

2013-07-12 12:59:05 received KS9N>APDR12,KS9N-10_WIDE1_WIDE2-1:=3952.16NS08454.86W>259/057/A=000974 Listening on 147.465 minus.9 T131

2013-07-12 12:59:02 Bluetooth OK KS9N>APDR12,WIDE1-1,WIDE2-1:=3951.61NS08455.04W>147/030/A=000881 Listening on 147.465 minus.9 T131

Notice that at 12:59:02, APRSdroid says that it sent a beacon from that position, but what was actually heard back from the digipeater was entirely different! Different lat/lon, and different numbers in the data part. Completely different location!

Here is another example:

2013-07-12 12:57:44 received KS9N>APDR12,KS9N-10,WIDE1,WIDE2-1:=3952.31NS08453.68W>262/053/A=000914 Listening on 147.465 minus.9 T131

2013-07-12 12:57:41 Bluetooth OK KS9N>APDR12,WIDE1-1,WIDE2-1:=3952.01NS08455.53W>130/037/A=000928 Listening on 174.465 minus.9 T131

Once again, the packet heard from the digipeater is entirely different from what APRSdroid "thought" it sent.

A quick check of APRS-IS confirms that Aprsdroid is NOT sending the packets that it says it is.

It seems to me like it's caching a packet, as the packet that it sends to the TNC seems to always be the LAST location prior that it says it sent. I'm also finding erroneous data when I check APRS-IS:

2013-07-12 11:26:57 EDT: KS9N>APDR12,K9APR-5*,WIDE2-1,qAR,W8VFR-9:=3949.31NS08453.43W>000/000/A=000702 Listening on 147.465 minus.9 T131

2013-07-12 11:27:25 EDT: KS9N>APDR12,WIDE1-1,WIDE2-1,qAR,KS9N-10:=3949.31NS08453.43W>000/000/A=000703 Listening on 147.465 minus.9 T131

2013-07-12 12:39:33 EDT: KS9N>APDR12,W8VFR-10,WIDE1,WIDE2-1,qAR,KS9N-10:=3949.51NS08453.47W>271/000/A=000817 Listening on 147.465 minus.9 T131

2013-07-12 12:42:57 EDT: KS9N>APDR12,W8VFR-10,WIDE1,WIDE2-1,qAR,W8VFR-3:=3949.51NS08453.47<0x00><0x01><0x00><0x02><0x00><0x0c><0x04><0x00>E<0x00><0x0b>?<0x01>Y<0x08><0x00>E<0x00><0x03><0xef><0x09><0xe3><0x05><0x0b><0x8d>p<0x08><0x00>E<0x00><0x03> [Invalid uncompressed location]

2013-07-12 12:42:57 EDT: KS9N>APDR12,W8VFR-10,WIDE1,WIDE2-1,qAR,WA8ZWJ-1:=3949.51NS08453.47 [Unsupported packet format]

2013-07-12 12:43:00 EDT: KS9N>APDR12,W8VFR-10,WIDE1,W8BLV,WIDE2*,qAR,W8EH:=3949.51NS08453.47<0x01><0x02><0x0c><0x04>E<0x0b>?<0x01>Y<0x08>E<0x03><0xef><0x09><0xe3><0x05><0x0b><0x8d>p<0x08>E<0x03> [Invalid uncompressed location]

2013-07-12 12:43:09 EDT: KS9N>APDR12,W8VFR-10,WIDE1,WIDE2-1,qAR,W8VFR-9:=3949.51NS08453.47KD8PZV-9>S9TQ4W,KI6SZ-2,WIDE1,WIDE2-1,qAR,W8VFR-9:oPcqTfk/"7I}_" [Invalid uncompressed location]

2013-07-12 12:44:41 EDT: KS9N>APDR12,WIDE1-1,WIDE2-1,qAR,KS9N-10:=3949.51NS08453.47W>271/000/A=000845 Listening on 147.465 minus.9 T131

2013-07-12 12:55:20 EDT: KS9N>APDR12,K9APR-5*,WIDE2-1,qAR,W8VFR-9:=3949.56NS08453.42W>000/022/A=000873 Listening on 147.465 minus.9 T131

2013-07-12 12:56:29 EDT: KS9N>APDR12,W8VFR-10,WIDE1,WIDE2-1,qAR,KS9N-10:=3950.43NS08453.34W>358/000/A=000838 Listening on 147.465 minus.9 T131

2013-07-12 12:57:14 EDT: KS9N>APDR12,W8VFR-10,WIDE1,WIDE2-1,qAR,KS9N-10:=3952.34NS08453.31W>336/018/A=000922 Listening on 147.465 minus.9 T131

2013-07-12 12:57:28 EDT: KS9N>APDR12,WIDE1-1,WIDE2-1,qAR,KS9N-10:=3952.33NS08453.48W>253/039/A=000928 Listening on 147.465 minus.9 T131

2013-07-12 12:57:43 EDT: KS9N>APDR12,WIDE1-1,WIDE2-1,qAR,KS9N-10:=3952.31NS08453.68W>262/053/A=000914 Listening on 147.465 minus.9 T131

2013-07-12 12:59:04 EDT: KS9N>APDR12,WIDE1-1,WIDE2-1,qAR,KS9N-10:=3952.16NS08454.86W>259/057/A=000974 Listening on 147.465 minus.9 T131

2013-07-12 12:59:40 EDT: KS9N>APDR12,W8VFR-10,WIDE1,WIDE2-1,qAR,KS9N-10:=3952.04NS08455.70W>284/024/A=000957 Listening on 147.465 minus.9 T131

None of these correlate to what is shown in APRSdroid logs. They all seem to be one beacon behind. I'm not understanding why or how it is doing this. Buffer problem maybe? It would appear to be something caused by the most recent update, because prior to the most recent version, this didn't appear to be happening.

Also notice the "Invalid uncompressed location" and "unsupported packet format" errors from APRS-IS.

Please also notice that it APPEARS that part of a received beacon was "injected" into one of my outgoing packets. 2013-07-12 12:43:09 EDT: KS9N>APDR12,W8VFR-10,WIDE1,WIDE2-1,qAR,W8VFR-9:=3949.51NS08453.47KD8PZV-9>S9TQ4W,KI6SZ-2,WIDE1,WIDE2-1,qAR,W8VFR-9:oPcqTfk/"7I}_" [Invalid uncompressed location]

Notice how this looks like my packet with "KD8PZV"'s packet attached? ^This is straight from aprs.fi

Please let me know if you need additional information to diagnose this. It seems very strange to me, so I don't know off hand what additional information would be useful.

Thanks,

samwathegreat

samwathegreat commented 11 years ago

More code shows up on aprs-is than what shows here. When I copied/pasted it here it changed some of the code from the invalid packets into ascii. If you could go to aprs.fi and click "raw" and enter "KS9N", you'll see the invalid packets in their entirety.

Having read the KISS protocol write-up at http://www.ka9q.net/papers/kiss.html I'm beginning to think maybe APRSdroid isn't sending the appropriate "frame end" commands, but I haven't a clue. The fact that it appears another packet got tacked on to the end of one of mine leads me to think there could be a buffer issue.

Another strange thing I noticed is that the packet that seems to have gotten transmitted along with mine (the 12:43:09 packet) --the one from KD9PZV-- Checking aprsdroid's logs shows that APRSdroid doesn't think it ever heard that packet. It does, however, show that it heard other packets from KD9PZV. Is there a possibility that received data could make its way into the TX buffer?

samwathegreat commented 11 years ago

More Info:

I took another drive today, and did not experience the erroneous packet issue. Each time APRSdroid sent a packet, the info heard from the digipeater matched it exactly.

Also, upon further research, I've determined that the packets APRSdroid sent have NO correlation to previous packets (I tried to see if the erroneous packet heard from the digipeater matched a previous beacon, and it did NOT. The data was completely unique.

At this point, I'm thinking that perhaps there was more than one instance of APRSdroid running. I've noticed that APRSdroid does NOT exit cleanly. If you have been using APRSdroid for a bit, checking the log, settings, viewing map, etc., you can press the back arrow DOZENS of times before APRSdroid finally exits! I've also noticed that it seems to cache beacon data well beyond the timeframe set in the settings.

For instance, I have "SHOW LAST 2 HOURS" set. But if I press back arrow a few times, it will bring back up an OLD map (hours or even days old) with stations that were heard many hours or even days ago.

Is it possible that somehow there were 2 instances of APRSdroid tracker running? This is as close as I can come to an explanation of what happened. It's the only theory I can come up with that would explain why the digi heard completely different information than what APRSdroid thought it sent.

If this is the case, what would your recommended solution be? Obviously, I need to make sure I press back arrow enough times to completely exit aprsdroid when I'm finished with it.....an "EXIT" button may be helpful.

I'm happy that the problem did not re-occur, but I fear that it will happen again. Any Suggestions?

samwathegreat commented 11 years ago

The issue re-occured today. Before I began tracking, I made absolutely sure that I had completely exited APRSdroid.

The tracker worked normally for most of my trip (drove a little over an hour from home), but upon return, I'm seeing the same issue.

Once again, the map looks like a spiderweb showing me bouncing back and forth between new and old locations. Here are two more examples:

2013-07-15 12:07:21 received KS9N>APDR12,KS9N-10,WIDE1,WIDE2-1:=3949.63NS08454.31W>082/030/A=000813 En-route to Indy. Catch me on 146.520

2013-07-15 12:07:18 Bluetooth OK KS9N>APDR12,WIDE1-1,WIDE2-1:=3949.51NS08453.47W>239/005/A=000832 En-route to Indy. Catch me on 146.520

Once again, the data heard from the digipeater is completely different, and APRSdroid is NOT sending what it says it is. Even the altitude is different!

Another example:

2013-07-15 12:04:32 received KS9N>APDR12,KS9N-10,WIDE1,WIDE2-1:=3951.81NS08455.24W>132/039/A=000921 En-route to Indy. Catch me on 146.520

2013-07-15 12:04:29 Bluetooth OK KS9N>APDR12,WIDE1-1,WIDE2-1:=3949.62NS08454.47W>128/009/A=000815 En-route to Indy. Catch me on 146.520

Same deal = completely different data.

I think this would seem to confirm that it was NOT a fluke, and there is some kind of bug causing this unpredictable behavior.

Anyone have any insight?

samwathegreat commented 11 years ago

The issue continues to occur. I realize that my descriptions may be lacking, so I took some screenshots that I think may help identify the problem. The photos show the "spiderweb" effect that is happening, and also a screenshot of a mismatched digipeated packet. Happily, the object beacons work fine as you can see. screenshot_2013-07-20-01-17-18 screenshot_2013-07-20-01-17-49 screenshot_2013-07-20-01-17-42 screenshot_2013-07-20-01-17-32 screenshot_2013-07-20-01-16-49 screenshot_2013-07-20-00-54-48

samwathegreat commented 11 years ago

Attached is a view from APRS.fi of the same trip shown above. Notice that the packets seem to be sent normally. Hope this helps.

samwathegreat maptrack

ge0rg commented 11 years ago

Thank you very much for the extensive report. I am looking into it and trying to reproduce. I'll update the issue as soon as I have figured out the cause.

samwathegreat commented 11 years ago

Hi Georg,

I have some more data for you. It does now appear that this is some kind of buffer issue.

Consider this: I'm using a KISS TNC with digital carrier detect. Like all KISS TNCs, it has a transmit buffer. If the channel is not clear, it will hold the packet in buffer until channel goes clear. Have a look at these screenshots:

screenshot_2013-07-27-13-11-37 screenshot_2013-07-27-13-11-02

Notice that at 00:11:55, aprsdroid thought it sent a packet, but no reply was heard from digipeater. Then, at 00:12:56 aprsdroid tried to send another packet, but what was heard back from the digipeater was EXACTLY what was sent at 00:11:55.

This leaves me with 2 possibilities: seems that either aprsdroid is "unaware" of the kiss TNC buffer, and therefore doesn't realize that the TNC may buffer the packet until the channel clears, which may or may not be before the next packet goes out.

Alternatively, if this isn't it, I'm wondering if it is possible that aprsdroid is "forgetting" to send a kiss "frame end" coommand? As I understand it, a kiss frame is sent with a leading frame end character, and another frame end character is sent afterwards. Is it possible that aprsdroid sends [frame end] [packet] but leaves out the trailing frame end? If so, then the next packet sent (which has a [frame end] at the beginning) could cause the previously qeued packet to be sent? I'm no Dev, so this is all conjecture. Just trying to figure it all out.

samwathegreat

ge0rg commented 11 years ago

Can you verify that your radio transmits immediately after APRSdroid sends the packet?

The code from APRSdroid KISS TNC looks correct:

    os.write(Kiss.FEND)
    os.write(Kiss.CMD_DATA)
    os.write(p.toAX25Frame())
    os.write(Kiss.FEND)
    os.flush()

Other than that, APRSdroid does not know anything about the TNC's buffer. It just sends its packets via BT whenever these are generated.

samwathegreat commented 11 years ago

Hi ge0rg,

I have confirmed that my radio transmits as soon as the channel clears after having been sent a packet.

In order to avoid collisions, most TNCs use a carrier detect to wait until the channel is clear before transmitting. Most digipeaters (especially higher altitude digis) have to wait even longer before digipeating packets because they hear so many packets. Such is the nature of packet radio, especially APRS, where everyone uses a single shared frequency.

What I'm getting at here, is that say my radio has to wait 10 seconds before it can send the packet - by this time, I've made another turn, and another packet is queued. When the radio keys, both queued packets get sent out. Most digis will discard the 2nd packet as a duplicate, and digipeat the first packet. Now my map shows that I jumped from my current location to the last location that was digipeated.

Alternatively, say my radio transmits the packet instantly. The digipeater is hearing all kinds of packets, and so doesn't get an opportunity to transmit for several seconds. By then, aprsdroid has sent a new location, so when it hears the older packet from the digipeater, the location is again, reverted.

This is nothing special about my setup - all TNCs have a buffer of some size, and all TNCs SHOULD wait for channel to clear before transmitting. In areas where APRS is congested this becomes a real problem.

The solution seems rather simple to me - why not have APRSdroid NOT plot received location data to the map for the callsign/ssid combo that is being used by aprsdroid. I would still like to see the received packets show up in the log for logging purposes, but NOT have the received packets for own callsign plotted. When aprsdroid "sends" a packet, it plots the location on the map, and since any particular callsign/ssid combo should ONLY be used on one device at a time, I don't see how this could have any negative effects. Certianly, it would stop the "spiderweb" effect from happening.

This was a trip I made today. Notice the two "triangle" shapes? This was caused by the reverted packet issue described - I did NOT drive around in a triangle.

screenshot_2013-07-30-15-49-27

samwathegreat

ge0rg commented 11 years ago

Thank you for the analysis. I've created a build that should not fully parse digipeated packets any more. Instead, the user can get a notification about them.

Please test and let me know if it helps and/or if there are other side effects.

samwathegreat commented 11 years ago

Hi ge0rg,

So far, so good. This version seems to have eliminated the problem.

I'm not too fond of the new larger font size for the OSM map labels (map labels [city names, etc] are now about 2x previous size), but that is very minor and if you prefer the labels larger, I'm fine with it.

I will test for a couple more days, and if everything seems fine, I'll comment & close this issue.

Again, I really appreciate your attentiveness with getting these bugs fixed. At this point, I'm hopefully optimistic that APRSdroid is now 100% fully functional in TNC / OSM mode.

ge0rg commented 11 years ago

That is great to hear. Let me know if you find more problems. I hope to be able to make a 1.2.3 release soon, otherwise.

Regarding the font size, I noticed that reading the city names when having the phone in a car dock is almost impossible, and then I added a scaling factor according to the phone's DPI...

samwathegreat commented 11 years ago

Having tested this version for several days, I feel that it is ready for release. I found no unexpected side effects, and everything seems to work well!

I'd say you're ready to release 1.2.3!

That being said, this issue is now closed.