ghbolivar / android-on-freerunner

Automatically exported from code.google.com/p/android-on-freerunner
0 stars 0 forks source link

Bad GPS timestamps #139

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Open any GPS application of your choice (tested: GPS Status, OSM Tracker)
2. Get a fix and let the device receive GPS signals for a while
3. Look at the time reported (in GPS Status, watch the screen; in OSM
Tracker, trace for a while and then examine the GPX file)

What is the expected output? What do you see instead?
On screen, I would expect either the GPS time (UTC with a few seconds of
offset, due to the occasional leap second inserted in UTC but not in GPS
time) or local time. (Test were taken in Central Europe during DST, so
local time is 2h ahead of UTC.) Instead I see that the timestamp is several
hours off and occasionally skips forward and back again, short of an hour:
Around 19:15, GSM Status would display times around 15:15, then suddenly
skip to 16:15, then back again. At around the same time the next day, OSM
tracker would record timestamps around 13:15, then suddenly jump forward
(57 minutes in one case, 56 in another). So apparently GPS Status purports
to display local time while OSM Tracker records UTC or GPS time; under that
assumption the time zone correction seems OK. However, both apps are
experiencing the same incorrect time offset and skips. Needless to say, I
can't use my Freerunner for mapping that way as I have no reliable
timestamps for geotagging photos.

What version of the product are you using? On what operating system?
AoF 0.2.0 Release Candidate 1

Please provide any additional information below.
Tests were conducted last week, in northern Italy, at roughly 9°E. Hence
proximity to a time zone border or to a DST switchover date can be ruled
out as sources - it looks a lot as if the GPS timestamps are just plain
botched. Suspicion: either something is trying to correct when there is
nothing to correct, or some coding error in the GPS layer is inadvertently
messing with the timestamps.

Original issue reported on code.google.com by mich...@vonglasow.com on 24 Apr 2010 at 7:04

GoogleCodeExporter commented 9 years ago
I guess we need to fix time-setting bug #67 first...

Original comment by skliarie@gmail.com on 24 Apr 2010 at 10:04

GoogleCodeExporter commented 9 years ago
well... GPS applications are supposed to take the time reported by the GPS 
receiver,
not system time (because there is no guarantee that the system time is anywhere 
near
accurate). That should be fairly simple but even here something seems to be 
going
wrong. Unless the apps are introducing some error (which I consider unlikely 
since
they're both experiencing the same problems), I would say this is a fault in 
the GPS
components in Android - which should not do any filtering, conversions or 
calculation
but just report what it gets from the GPS receiver, and already here something 
seems
to be going wrong. I would say we can't fix bug #67 (at least not by fetching 
time
from GPS) without fixing the GPS timestamps.

And by the way: the local time on my Freerunner display has no problems at all, 
other
than that I have to set it manually and it is somewhat off because of that.

Original comment by mich...@vonglasow.com on 25 Apr 2010 at 3:27

GoogleCodeExporter commented 9 years ago
another thing I have noticed is that the time "skips" seem to occur mainly in 
areas
with poor GPS reception - the other day I was tracking in inner city streets for
about two hours and, when opening the GPX file in JOSM, it looked like a spider 
web
until I limited the maximum length for connection lines between GPX points.

Original comment by mich...@vonglasow.com on 3 May 2010 at 9:34

GoogleCodeExporter commented 9 years ago
looking at it again after some time... afaik the OS talks to the GPS receiver 
using NMEA. NMEA is based on messages which are repeated and each of which 
supplies a different kind of information (the position itself, satellites in 
view, signal and fix quality etc.).

Timestamps are included with more than one message, but unfortunately the 
format differs - some messages include milliseconds while others don't. (The 
gpsd website contains a huge rant on that.)

Given that areas with bad reception are particularly prone to incorrect GPS 
time I would conclude that Android uses multiple NMEA messages to get the time 
but with a "preference" for certain messages. In areas with bad reception, some 
messages get omitted (or supply no timestamp), hence other messages lower in 
the preference list are used. Apparently the code that processes the messages 
returns corrupted timestamps for at least one of the messages lower down in the 
list - which would explain the "skips" and their frequency in poor reception 
areas.

Original comment by mich...@vonglasow.com on 24 Jul 2010 at 2:19

GoogleCodeExporter commented 9 years ago
I did some tests last night and also had a look at the code (gps_freerunner.c 
in freerunner_platform_hardware_hw is where it's at).

Local time was around 1:15 AM (CEST, i.e. UTC +02:00); the GPS status clock 
flipped between 21:15 and 22:15. That is, it was 3 to 4 hours behind local 
time; 2 hours behind local time would have been correct.

The code contains yet another example of a workaround for the deadly sin the 
designers of the C library made: there is a gmtime function which converts 
epoch times to a structure, the latter expressed in UTC, and there is 
localtime, which does the same but additionally converts to local time. For the 
opposite way there is mktime, which takes a structure, interpreting its members 
as LOCAL time, and converts it back into epoch time - thus it's the inverse of 
localtime. But there is no such counterpart for gmtime, which would allow 
converting a structure containing UTC to epoch time. Just what were they 
thinking...

The code compensates for this by using mktime for backward conversion, 
calculating the UTC offset of the local time zone and adding that to the result.

However... line 279 calculates the time zone offset as:

    r->utc_diff = time_utc - time_local;

Hence in my case this would have been -02:00 (which is the opposite of how time 
zones are usually expressed).

When processing a timestamp, the members of a tm structure are filled and the 
time is calculated in line 328 as

    fix_time = mktime( &tm ) + r->utc_diff;

Let's break this down with my example: At 01:15 local time, the tm structure 
gets 23:15. Since it interprets that as local time, mktime returns the epoch 
time for 21:15 UTC. Then the previously calculated offset (-02:00) is added and 
we get 19:15 UTC.

If we assume that GPS Status now uses localtime on that timestamp, it would 
display 21:15, just what we observed before.

That still doesn't explain why the timestamp occasionally flips 1 hour forward 
and then back again, I suppose we have both lice and fleas and there's another 
error somewhere.

I do see that the time zone offset is calculated only once when the receiver is 
initialized - this may become a problem if the time zone changes, e.g. when 
using the GPS receiver during DST switchover. It would be better to recalculate 
the offset every time we get a timestamp. Still not the nicest way as race 
conditions are possible, but at the very worst we'd get one botched timestamp 
and the next second we'd be OK again.

I'll work out a fix for what I've found so far...

Original comment by mich...@vonglasow.com on 25 Jul 2010 at 2:03

GoogleCodeExporter commented 9 years ago
as for the flipping forward and backward and my previous suspicion about 
different NMEA messages: the code in question extracts the timestamps from GGA, 
GLL, RMC and ZDA messages. For the first three it considers the timestamp only 
if a valid fix is reported; ZDA is a specific message for supplying time and 
receiving a ZDA with valid data means the receiver has the accurate time.

The GGA and GLL messages contain timestamps with second precision, i.e. 123519 
for 12:35:19 UTC. RMC and ZDA have millisecond precision, e.g. 225446.33 for 
22:54:46 and 330 milliseconds.

That means that we may encounter two different timestamp formats, depending on 
the messages from which they were extracted, though both are largely similar. 
The code processes them from left to right using fixed widths, so any 
differences would affect only the seconds.

But... nmea_reader_update_time handles a special case in case we don't have a 
date from GPS yet: in that case it calls now() and converts the result into a 
tm structure, from which it then obtains the year/month/day values. The very 
same tm structure is then used to calculate the timestamp.

The year/month/day/hour/minute/second members are set correctly but the 
tm_isdst member never gets set explicitly. That member specifies if the tm 
structure is to be interpreted as standard or daylight savings time. In time 
zones that have DST, the difference is one hour - which would explain the extra 
hour of leap. It's only a guess, but using an uninitialized (and thus random) 
value for this is not very nice...

Original comment by mich...@vonglasow.com on 25 Jul 2010 at 5:00

GoogleCodeExporter commented 9 years ago
Here is the patch for what I found so far. I also cleaned up some stuff in 
nmea_reader_update_utc_diff: rather than calculating epoch time manually from 
the tm structure, I feed it through mktime again. Reason 1: the old code had at 
least one bug in that it did not consider leap years, which may cause problems 
on certain days. Reason 2: the result calculated here is for correcting values 
obtained through mktime, hence consistently using mktime gives us a greater 
chance of any bugs or misunderstandings canceling each other out.

Unfortunately I don't have a build environment, so I need to rely on someone to 
merge and compile the whole thing for me. I will be more than happy to test it 
then and share my results.

Original comment by mich...@vonglasow.com on 25 Jul 2010 at 5:30

Attachments:

GoogleCodeExporter commented 9 years ago
Michael,

Thanks for doing the digging and elaborating on the bug. I can try to build 
your stuff. I understood you're patch is against AoF 0.2.0 RC1. Any additional 
instructions I should take into account?

I suggest we continue this on the mailinglist.

Niels.

Original comment by niels.he...@gmail.com on 26 Jul 2010 at 9:17

GoogleCodeExporter commented 9 years ago
I created the patch against a git clone of the cupcake branch created this 
weekend, but I see the code has not been touched in 7 months, hence 0.2.0 RC1 
should have the same code inside. Altogether it should be quite 
straightforward, apply the patch and build in the same way RC1 was built.

As for the mailing list, I'm not signed up (I try to keep mailing list 
memberships low to avoid bursting my inbox), so I suggest keep information in 
this ticket. I suppose it's not much more anyway, and it helps us keep all 
relevant information in one place.

Original comment by mich...@vonglasow.com on 26 Jul 2010 at 4:55

GoogleCodeExporter commented 9 years ago
Michael,

I've built your code and created a tar archive containing the files you can use 
to deploy to your AoF (files attached).

Please check if this resolves the issue you've found.

During the git apply procedure, I got an whitespace error warning back. Maybe 
you can beautify your code to get rid of this?

If we can get things to work, we can provide the patch to Jim Ancona (in cc) 
for code review and approval.

Keep us posted how it turns out.

Niels.

Original comment by niels.he...@gmail.com on 28 Jul 2010 at 7:34

Attachments:

GoogleCodeExporter commented 9 years ago
Just dropped in the new library and took a quick glance at the results - they 
look promising. GPS Status now reports current local time and I haven't 
experienced any flipping of timestamps.

However, since the error was a sporadic one, I would like to rule out the 
possibility that I was just lucky during the quick tests I made and try it over 
a longer period of time. As I'm going on a weekend trip, I'll just track the 
trip and analyze the timestamps I get from the trace; if they are OK, we can 
consider the issue fixed.

As for the beautification, I have already done my best to keep the code as 
beautiful as possible - what kind of whitespace error is it? My first guess 
would be inconsistent use of tabs and the corresponding number of 
whitespaces... unfortunately, many editors don't care about that. I'll take a 
closer look at it.

Original comment by mich...@vonglasow.com on 28 Jul 2010 at 5:44

GoogleCodeExporter commented 9 years ago
OK, I'm back from my trip and it looks like the patch fixes both bugs - the 
offset and the occasional leaps. I also had a look at the whitespace - indeed 
the auto-indent feature in Notepad++ replaces whitespaces with tabs. Here's a 
beautified (but otherwise unchanged) patch. Enjoy!

Original comment by mich...@vonglasow.com on 1 Aug 2010 at 7:15

Attachments:

GoogleCodeExporter commented 9 years ago
Michael

Thanks for the fixing and testing.

Jim,

Could you review the patch and (if OK) submit the patch to the Freerunner GPS 
hardware code? The same patch should be applicable to cupcake and master.

Thanks,

Niels.

Original comment by niels.he...@gmail.com on 2 Aug 2010 at 8:19

GoogleCodeExporter commented 9 years ago
Patch applied to cupcake and master. Thanks for your contribution! 

Once the fix is verified in a build, we can close this issue.

Original comment by scarhill on 2 Aug 2010 at 4:46

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
I do not think this works in its entirety.
With a bit of time on my hand I have written a new version of the once popular 
YGPS app, which you can download at 
http://www.yunnanexplorer.com/download/androidapps/. This now works on cupcake.

What I see is a frequent toggle of the date in the GPS reported time. I am 
using a freerunner where I have not set the date after reflashing, so the 
device date is 1 Jan 2000. I have not investigated this in detail, but I 
believe the problem is that the date is taken from the time function (which 
will give device time) in certain instances and no adjusted to GPS date. This 
will of course only show up when the device date is no set to the correct date.

The app mentioned above (you can also find the GPL licensed source code there) 
will show the problem.

Original comment by LudwigBr...@gmail.com on 17 Aug 2010 at 8:45

GoogleCodeExporter commented 9 years ago
@Ludwig: can you provide some sample timestamps? (Maybe a short GPX trace...) 
What exactly changes? Just the date part or the entire timestamp? Always by the 
same amount, back and forth?

As far as I remember, in fact the first fixes might not have a date yet (I 
think some NMEA sentences provide just the time but not the date). I've seen 
similar behavior in other GPS devices, using local time until a full GPS 
timestamp is available.

I will check to rule out any possibilities of local date being used after it 
has been obtained from GPS; we also might want to make sure we don't send a 
valid timestamp until we've received also the date from GPS. This may increase 
the time until we get a valid timestamp but would ensure we get correct 
information in any case.

Original comment by mich...@vonglasow.com on 17 Aug 2010 at 6:31

GoogleCodeExporter commented 9 years ago
I think I found the problem, a patch is attached.

The issue was that the code reads beyond the end of the year token in the 
following code: 
-    r->utc_year = str2int(tok_y.p, tok_y.end+4);
(.end is the end of the token, so .end + 4 points four chars beyond the end.)

This should be:
+    r->utc_year = str2int(tok_y.p, tok_y.p+4);
(or it could possibly be .end, but I have not tested that)

The result of this is that str2int goes into fail mode, returning -1, which 
means it is treated as uninitialized in the next round, which means it will be 
set from time(), which has the (possibly wrong) device time and date.

I rebuilt with the patch attach and the issue goes away. I tested with an 
additional logging entry that tested when year was changed and ran for a while. 
The error did not show up in log again.

I think the bug is obvious and I do not think the patch has any side effects. 

Patch should be applied to cupcake and master.

Original comment by LudwigBr...@gmail.com on 18 Aug 2010 at 4:17

Attachments:

GoogleCodeExporter commented 9 years ago
Hi Michael / Ludwig,

Thanks for doing the digging on this topic.

I've applied the patch to my local branches and it compiles without any issue.

Question: should we apply both patches 
0001-fix-for-date-toggle-by-reading-beyond-end-of-year-fi.patch by Ludwig and 
android-on-freerunner_139.patch by Michael? Or is the last patch overruling the 
first?

Thanks for letting us know.

Niels.

Original comment by niels.he...@gmail.com on 18 Aug 2010 at 7:35

GoogleCodeExporter commented 9 years ago
Both patches. I started out from a version with Michael's patch already 
applied. They are complimentary, fixing different problems.
Ludwig

Original comment by LudwigBr...@gmail.com on 19 Aug 2010 at 12:29

GoogleCodeExporter commented 9 years ago
Jim,

Could you review the patch by Ludwig and (if OK) apply to master and cupcake?

Thanks,

Niels.

Original comment by niels.he...@gmail.com on 19 Aug 2010 at 7:06

GoogleCodeExporter commented 9 years ago
Applied Ludwig's patch to cupcake and master.

Original comment by scarhill on 21 Aug 2010 at 10:30

GoogleCodeExporter commented 9 years ago
sorry for joining back in so late, I was on the road for a couple of days...

Seems Ludwig has found yet another bug, but there are some things in the NMEA 
code that I'm still not quite happy with.

First, we give out timestamps even when we have only a partial timestamp (that 
is, hh:mm:ss but no date) from the GPS and substitute the local system date. 
While Ludwig's case (system clock nowhere near correct) may look like a highly 
unlikely special case, the same situation may lead to problems when the system 
clock is only slightly off and we get our first fix around midnight - we may 
end up with an incorrect date. This may cause problems e.g. when we sync 
against GPS time (as I suggested in ticket #67, waiting for the first GPS fix 
and using its timestamp - the one timestamp we use for synchronization may be 
faulty and we'd end up with an error of 24 hours rather then just a few 
minutes).

Second, the lack of a mktime function capable of processing UTC input may still 
have some side effects: we are working on time differences, which is correct in 
most cases but may introduce errors in the immediate surroundings of a 
DST/Standard Time switchover, resulting in timestamps being one hour (or 
whatever the DST difference is) ahead or behind.

My solution to the first is not to return a date until we have received a full 
GPS timestamp (may reduce startup time, typically by one second, but I don't 
think this is very painful to the user).

For the second one, I have seen a in implementation of a mkgmtime function, 
which converts a tm struct in UTC to a timestamp - basically mktime without any 
time zone related processing. I would add that to the code and use it; there 
are just two caveats:

Caveat 1: if the definition of time_t changes (likely as we approach the Y2K38 
problem), we will need to adapt mkgmtime accordingly. Still I suppose the risk 
is calculable.

Caveat 2: the code I'm referring to is originally from the Lynx browser and is 
thus under the GPL. If we include this in our code, it would cause the entire 
GPS driver code would fall under the GPL rather than the Apache license. For me 
personally, that's fine (and I guess most people here would agree), but I would 
still like some opinions on this.

Original comment by mich...@vonglasow.com on 22 Aug 2010 at 1:01

GoogleCodeExporter commented 9 years ago
Ludwig / Michael,

We really appreciate the analysis and contributions. We're very happy to see 
that members of the community start submitting patches to the source code.

You guys obviously know what you're talking about and have the skills to fix 
bugs. We therefore consider to give you commit rights to the source code. The 
main constraint is that you have a local build environment and that you test 
your patches before submitting / committing.

You can get in touch with Jim Ancona to do the configurations for you. In case 
you want to discuss, feel free to mail or talk on IRC.

Original comment by niels.he...@gmail.com on 22 Aug 2010 at 8:03

GoogleCodeExporter commented 9 years ago
oh yes, another thing I noticed recently: the reported speed is roughly twice 
the actual speed. The RMC sentence returns speed in knots (nautical miles per 
hour, 1 knot equals slightly more than .5 m/sec), so I suppose the unit 
conversion is missing and we're misinterpreting 10 knots as 10 m/sec. I'll take 
a look also at that (it's not like this is the first GPS project to trip over 
that sort of thing...)

Original comment by mich...@vonglasow.com on 29 Aug 2010 at 3:22

GoogleCodeExporter commented 9 years ago
Update: I have produced a first improved version in which I fixed the errors I 
noticed:

- Timestamps are now calculated using mkgmtime and are thus immune to any 
timezone quirks.
- nmea_reader_update_time will not set the time unless a valid date has been 
set first. No system time will be used as a substitute for missing date/time 
information.
- Only RMC and ZDA messages are used to update timestamps: these supply 
fractional seconds and a date. (ZDA is rarely ever sent by GPS receivers, not 
sure about the Freerunner, but RMC is *the* GPS message. See the rant on the 
gpsd home page for more information.)
- Timestamps from GGA and GLL messages are ignored: They do not contain a date 
and using them may cause system time to jump back by 24 hours and forward again 
around midnight. I should add some extra code to ignore these messages 
altogether until we got the first valid timestamp, else weird things may 
happen...
- I also introduced conversion for the speed, as NMEA uses knots and Android 
internally uses meters per second.
- Fractional seconds are still ignored but I'm planning to include them as well 
- juts discarding the fractional part introduces an extra error of .5 +/- .5 
seconds, that is, up to one second.

A quick test on my balcony looked OK (meaning I got a reasonable time), though 
I could not test the speed for correctness.

Original comment by mich...@vonglasow.com on 29 Aug 2010 at 7:02

GoogleCodeExporter commented 9 years ago
Added the code to ignore GGA and GGL until the first timestamp has been 
received. So far, so good.

But when I try to add millisecond precision to the fix time by changing

    r->fix.timestamp = (long long)fix_time * 1000;

to

    r->fix.timestamp = (long long)((fix_time * 1000) + (int)(fmod(seconds, 1) * 1000.0));

all of a sudden I get a weird time stamp - instead of 23:00 I get something 
shortly after 13:00. (Unfortunately the GPS Status app does not show the date, 
so I can't reliably tell the numeric error.)

I've tried various ways of adding the fractional second part to the time stamp 
but they all have the same result - timestamps which are off by roughly the 
same amount. What black magic did I fail to apply here?

Original comment by mich...@vonglasow.com on 29 Aug 2010 at 9:13

GoogleCodeExporter commented 9 years ago
for the impatient, here's the state of my work so far (I'm not set up for 
commit access yet, will take care of that in the next days).

Should work but is not fully tested yet: Timestamps look OK and should in 
theory be correct (we're now entirely free from system clock time), but doesn't 
process fractional seconds yet. Speed should work in theory but is untested.

The archive contains the source patch as well as the updated library (for 
immediate replacement on a Freerunner setup).

Input on the millisecond issue is still welcome.

Original comment by mich...@vonglasow.com on 29 Aug 2010 at 10:04

Attachments:

GoogleCodeExporter commented 9 years ago
The millisecond issue is solved: since time is parsed from the NMEA string and 
then processed further, I dropped the black mathemagic altogether and instead 
parse the seconds into two integer variables, one for seconds and one for 
milliseconds. The seconds go into the tm structure for mkgmtime, fix time is 
then returned as the result of mkgmtime, multiplied by 1000, plus milliseconds 
as previously parsed. Tests look OK.

The last remaining issue might be that we may receive GGA or GLL messages 
before the first RMC or ZDA, which will fill most fields with valid data before 
we have a timestamp. If that's a problem, I am sure to find that out during my 
efforts to fix #67 (as the planned app uses the timestamp of the first GPS 
fix). If this behavior is an issue, then the only solution is to ignore any 
messages (other than RMC and ZDA) until we have at least one valid timestamp.

As soon as I'm set up, I'll commit my changes.

Original comment by mich...@vonglasow.com on 7 Sep 2010 at 9:18

GoogleCodeExporter commented 9 years ago
Changes are committed to cupcake and master.

One issue remains, though it has no effect on execution (other than wasting a 
few CPU cycles): the utc_diff member of the NmeaReader structure.

Once needed to convert local time (obtained from UTC hhmmss) to UTC,  
gps_freerunner.c now does not use this member any more. Same about the 
nmea_reader_update_utc_diff() function, whose only purpose is to set utc_diff. 
They don't really do any harm, other than being a few superfluous lines of code 
which take up space and CPU cycles and may confuse the reader.

Can we remove the utc_diff member from the structure and the corresponding 
function from the code? Or is the NmeaReader structure exported to callers of 
libfreerunner_gps.so, some of which might get confused if we removed a member 
in the middle of the structure?

Original comment by mich...@vonglasow.com on 10 Sep 2010 at 10:10

GoogleCodeExporter commented 9 years ago
We may have another bug in the GPS layer somewhere...

Software used:
Cupcake 0.2.0 RC2
OSMTracker for Android 0.5.1
GPS Status

Other devices: 
Canon Digital Ixus 70 (digital camera)
GeeksPhone One (other Android phone)
Mitac Mio 168 RS (PDA with GPS)

When I go mapping, I first fire up GPS Status, get a fix, then take a photo of 
the display (showing GPS time) with my camera. Then I start a GPS track using 
OSM Tracker and at the same time take photos. 

At home, I georeference pictures using JOSM and the "drift" between the time 
shown on the GPS status display and the EXIF timestamp of the corresponding 
image.

If I don't touch the clock of the camera, it's normal for the drift to increase 
or decrease by, say, a second per week. This is the case for the Geeksphone as 
well as for the Mio. Also, the drift should be independent of the GPS device 
used.

With the Freerunner, however, I notice that the drift may differ by something 
in the 20-second range from the drift encountered with the Geeksphone. On 
another day, the drift is the same for both.

Also with the Freerunner, sometimes I sync the time as described above, and 
when I examine the photos, they appear somewhat further "down the road" than 
where I took them.

All of this seems like the GPS time reported by the Freerunner occasionally 
skips back (and forward again) by a few seconds.

I have attached an extract from a problematic track, together with screenshots 
from JOSM visualizing it: In the "dots" file, it looks just like a normal 
track. But in the "linked" file, you will see the line bouncing back and forth 
in a certain region - indicating that the points were not stored in the correct 
order.

Looking at the GPX file, you will see a 14-second range (from 6:43:40 to 
6:43:54) with *two* positions per second. (OSMTracker stores positions to a 
SQLite db first, then exports to GPX from that DB - most likely sorting points 
by their timestamp.)

This phenomenon can be explained by GPS time skipping back by 14 seconds (after 
6:43:54, time seems to have gone back to 6:43:40, causing the 14-second 
timespan to appear twice.)

Not sure what the issue is this time... it might be that we're flipping between 
GPS time and UTC - the offset currently is 15 seconds (since the introduction 
of the last leap second on Dec 31, 2008), which fits the pattern (after 
6:43:54, rather than switching to 6:43:55, the clock goes to 6:43:40, which is 
just that 15-second difference). I'll look into it, but any input is welcome...

Original comment by mich...@vonglasow.com on 1 Mar 2011 at 9:39

Attachments:

GoogleCodeExporter commented 9 years ago
Did some initial research, and the document at

http://gpsd.berlios.de/NMEA.txt

(in the Dates and times section) mentions that the current GPS/UTC time offset 
gets transmitted roughly one per 20 seconds, and a cold-booted GPS may have 
incorrect time until this information arrives.

Since UTC occasionally gets leap seconds, it is somewhat "slower" than GPS 
time, in other words, GPS time is ahead of UTC. So a receiver suddenly 
switching from GPS time to UTC would jump back - the very behavior we're seeing 
above. This would occur at most 20 seconds into the track. (My first trackpoint 
was at 6:41, two minutes before the pattern shown above occurred.)

So this seems to be the issue - we need to find out whether our GPS receiver is 
reporting UTC or GPS time and correct accordingly.

Original comment by mich...@vonglasow.com on 1 Mar 2011 at 10:02

GoogleCodeExporter commented 9 years ago
Digging more deeply into this... things may get ugly.

Apparently there is no NMEA message reporting the current difference between 
UTC/GPS time.

The documents on the NMEA protocol that I could find all talk of UTC 
timestamps, which I take to mean that the GPS receiver should correct for the 
drift internally and return neat UTC time stamps, such that the application 
would never see a "raw" GPS timestamp or need to deal with the offset - unless 
the receiver has just cold-booted, obtained a fix but has not yet seen the 
leap-second correction.

I have also looked at the documentation of the ANTARIS chip used in the 
Freerunner - no mention of leap second correction, not even in the proprietary 
messages.

The proprietary u-blox protocol provides a way to report back leap seconds and 
a flag showing if they are already known. If we can get the chip to talk binary 
to us (and process the information), we could examine the value and store the 
leap seconds, once we got them. On each subsequent startup, we could check 
again and manually "correct" timestamps with the previously saved leap seconds 
as long as the receiver does not have them. The main issue is to communicate in 
binary...

Original comment by mich...@vonglasow.com on 2 Mar 2011 at 12:28

GoogleCodeExporter commented 9 years ago
Had a look at gpsd, which features a full u-blox protocol driver. The GPS can 
be configured to output both UBX and NMEA at the same time. Since each format 
has its own sync characters to start each frame ($ for NMEA, 0xB5 0x62 for 
UBX), it's possible to tell which message format is coming down the wire. We 
just need to examine the NAV-TIMEGPS (0x01 0x20) message, which tells us 
whether the receiver already knows the leap seconds, and how many they are.

Plan:

- When we get the leap seconds for the first time, we write them to a file )and 
additionally keep them in memory) and set a flag.
- When the leap seconds change (because a new leap second was added), we update 
the file and memory information.
- On subsequent startups, when the leap seconds are not known (yet), we read 
information from the file (without setting the flag) and correct timestamps 
manually.
- As soon as we get the leap seconds again, we set the flag and stop applying 
the correction.

Thus the 15-second leap will manifest itself only once, when we start up the 
unit for the first time. If we start up the unit for the first time after a new 
leap second got added, we will get similar behavior (but the leap is smaller, 
equal to the number of leap seconds since the unit was last turned on). Also, 
if the receiver is on "during" a leap second, it may not get the update for up 
to 20 minutes and will then leap back by one second - this is a limitation of 
GPS itself.

Altogether, if you use your GPS regularly (at least once every 18 months or 
so), and if you leave it on for 20 minutes when using it for the very first 
time, this should make the problem near-negligible.

Original comment by mich...@vonglasow.com on 2 Mar 2011 at 8:35