mendhak / gpslogger

:satellite: Lightweight GPS Logging Application For Android.
https://gpslogger.app

Corrupt KML files #575

Closed danielfaust closed 7 years ago

danielfaust commented 7 years ago

I don't know what's wrong here, but all my KML exports starting from gps--2017-06-19--mon--10-07--88 onwards are corrupted, that file included. File 2017-06-18--sun--20-52--88 is working.

The file from Monday for example ends in

</Placemark></Document></kml>
ent></kml>

The next file from Tuesday contains this segment

<when>2017-06-20T19:15:54.000Z</when>
<gx:coord>0.0000 0.0000 479.0</gx:coord>
</
<when>2017-06-20T19:15:57.000Z</when>
<gx:coord>0.0000 0.0000 481.0</gx:coord>

That's about 49 corrupted files, every single file from that day on, so this is very consistent. I only noticed it while trying to import one into a Mock Provider.
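To see which files in a folder are affected, a well-formedness check with a standard XML parser is enough. The following is a minimal sketch, not part of GPSLogger; the function name `find_malformed_kml` is made up for illustration, and it only checks that each file parses as XML, not that it is valid KML.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_malformed_kml(folder):
    """Return (file name, parser message) for every .kml file that fails to parse."""
    broken = []
    for path in sorted(Path(folder).glob("*.kml")):
        try:
            ET.parse(str(path))
        except ET.ParseError as err:
            # The message includes the line and column where parsing stopped.
            broken.append((path.name, str(err)))
    return broken
```

Calling `find_malformed_kml` on the folder the files were copied to returns one entry per broken file, which makes it easy to confirm whether the corruption starts on a particular date.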

Either my flash storage or memory is getting corrupted, or there is a bug in the code. I checked for relevant commits, but couldn't find any.

Can someone possibly confirm this?

mendhak commented 7 years ago

I don't think your storage is corrupted. When you see this problem, is it usually in files with annotations?

Also, it might depend on how you're grabbing files off the phone; a file copied via USB may not be the latest version because of the way Android Media Scanner works. You can try a local in-phone viewer (I use QuickEdit) and compare it with wherever the file is being uploaded to.

danielfaust commented 7 years ago

Yes, all files from the 19th on contain an annotation (KML files only contain the last inserted annotation; maybe this could be improved). Files prior to the 19th don't contain annotations. On the 18th I made a Tasker task which automatically inserts geofencing information via annotations.

Here is an example of the first line which contains an annotation. The problem occurs later in the file, never in this first line.

<?xml version="1.0" encoding="UTF-8"?><kml xmlns="http://www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2" xmlns:kml="http://www.opengis.net/kml/2.2" xmlns:atom="http://www.w3.org/2005/Atom"><Document><name>2017-07-30T07:51:49.986Z</name><Placemark><name>Home - xx.latxxxxxxxxxx - xx.lngxxxxxxxxxx - 35.0 - 2017-07-30T10:30:20.865+0200 - Inside</name><Point><coordinates>xx.latxxxxx,xx.lngxxxxx,479.0</coordinates></Point></Placemark>

It is neither the storage nor the transfer. It only affects the KML files. I copied them via USB, via integrated FTP and via FolderSync (FTP), always together with the CSV file and a JSON and OGG file which I first move into the GPSLogger folder before syncing. All files are OK except the KML one.

I looked at the code in the repo, and saw that the KML file-writer opens the file, seeks to some point and inserts the new data. Possibly something is wrong with the seek offset or something like that.
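As an illustration of how a wrong seek offset could produce this kind of damage, here is a minimal sketch. It is not GPSLogger's code: the helper names and the in-memory buffer are assumptions, and only the closing text is taken from the files quoted in this thread. The writer appends a point by seeking back from the end of the file and rewriting the closing tags; if the offset is too small, the start of the old closing tag is left in place.

```python
import io

# Closing text as it appears at the end of the files quoted in this thread.
CLOSING = b"</gx:Track>\n</Placemark></Document></kml>"


def append_point(buf, point, offset_from_end):
    """Overwrite the tail of buf, starting offset_from_end bytes before its end."""
    buf.seek(-offset_from_end, io.SEEK_END)
    buf.write(point + CLOSING)


def demo_offset(offset_from_end):
    """Append one point to a tiny track using the given offset and return the result."""
    buf = io.BytesIO(b"<gx:Track>\n<when>t1</when>\n" + CLOSING)
    append_point(buf, b"<when>t2</when>\n", offset_from_end)
    return buf.getvalue().decode()
```

With `demo_offset(len(CLOSING))` the old closing text is fully overwritten and the result is well-formed. With `demo_offset(len(CLOSING) - 7)` the first seven bytes of the old closing tag survive, so the output contains a stray `</gx:Tr` in front of the new point.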

I didn't search for the error in an on-device text editor, because I noticed the problem when I tried to import the file into that GPS Mock Provider. It wasn't able to import that file.

Here is the ending of a file

<when>2017-08-07T18:35:34.000Z</when>
<gx:coord>redacted</gx:coord>

<when>2017-08-07T18:35:36.000Z</when>
<gx:coord>redacted</gx:coord>

<when>2017-08-07T18:35:38.000Z</when>
<gx:coord>redacted</gx:coord>

<when>2017-08-07T18:35:39.000Z</when>
<gx:coord>redacted</gx:coord>
</gx:Tr                                                      <-- Error
<when>2017-08-07T18:35:40.000Z</when>
<gx:coord>redacted</gx:coord>

<when>2017-08-07T18:35:42.000Z</when>
<gx:coord>redacted</gx:coord>

<when>2017-08-07T18:35:44.000Z</when>
<gx:coord>redacted</gx:coord>

<when>2017-08-07T18:35:46.000Z</when>
<gx:coord>redacted</gx:coord>

<when>2017-08-07T18:35:48.000Z</when>
<gx:coord>redacted</gx:coord>
</gx:Track>
</Placemark></Document></kml>

Considering that the timestamps increase correctly (I'm sampling at 1 second), you can assume that no data is lost:

<when>2017-08-07T18:35:38.000Z</when>
<when>2017-08-07T18:35:39.000Z</when>
here
<when>2017-08-07T18:35:40.000Z</when>

The fragment </gx:Tr is the beginning of the </gx:Track> closing tag, which belongs at the end of the file.

Consider investigating what happens when first a long annotation is inserted which then gets "replaced" by a shorter one.
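The long-then-short scenario is easy to reproduce outside the app. The sketch below is an assumption about one possible mechanism, not a description of GPSLogger's writer: a file is rewritten from the start with shorter content, and unless it is truncated afterwards, the tail of the old content remains. The helper names and the sample documents are made up for illustration.

```python
import os
import tempfile


def rewrite_from_start(path, content, truncate):
    """Overwrite path from offset 0; optionally cut off whatever lies beyond the new content."""
    with open(path, "r+b") as f:
        f.write(content)
        if truncate:
            f.truncate()  # drops everything after the current write position


def demo_rewrite(truncate):
    """Write a long document, rewrite it with a shorter one, and return the file content."""
    long_doc = b"<Document><name>a long annotation</name></Document></kml>"
    short_doc = b"<Document><name>short</name></Document></kml>"
    fd, path = tempfile.mkstemp(suffix=".kml")
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(long_doc)
        rewrite_from_start(path, short_doc, truncate)
        with open(path, "rb") as f:
            return f.read().decode()
    finally:
        os.remove(path)
```

Without truncation the file ends in `</kml>ument></kml>`, the same pattern as the stray `ent></kml>` after the closing tag in the Monday file. With truncation the leftover bytes are gone.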

I will deactivate the automatic annotations today to see if the file still gets corrupted.

danielfaust commented 7 years ago

I recorded a ride without the annotations, and the file is not corrupt, for the first time since 2017-06-18. Very, very likely an annotation issue.

mendhak commented 7 years ago

Ah, finally I found it. I think I had also noticed that annotations were being replaced. Sorry about this; it looks like you have lost a lot of annotations. KML file annotations are the hardest because the annotation needs to be added near the beginning of the file, whereas with the other formats it goes right at the end. I've put in a new way of writing annotations, but it has to go via a tmp file. Not ideal, but it should be cleaner.
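A temp-file approach to inserting text near the beginning of a file can look roughly like the following. This is a hedged sketch of the general technique, not the code from the commit: the function name `add_annotation` is hypothetical, and placing the annotation just before the first `<Placemark` is an assumption based on the file excerpt quoted earlier in this thread.

```python
import os
import tempfile


def add_annotation(path, placemark):
    """Insert placemark (bytes) before the first <Placemark in path, via a temp file."""
    with open(path, "rb") as f:
        original = f.read()
    cut = original.find(b"<Placemark")
    if cut == -1:
        raise ValueError("no <Placemark> element found")
    # Write the complete new document next to the original, then swap it in.
    folder = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=folder, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(original[:cut] + placemark + original[cut:])
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise
```

Because the new file is written out in full before `os.replace` swaps it in, the original is never partially overwritten, and earlier annotations are kept instead of being replaced. The cost is that the whole file is copied for every annotation.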

Here is an APK if you want to test it. I'll also push the commit.

APK Signature Checksum

danielfaust commented 7 years ago

When I started using GPSLogger I compared all the file formats. I noticed the "only the last annotation is saved in the KML file" problem back then, but I didn't think it was a bug, just something not fully implemented. Since then I have only generated that file to drop it into Google Earth; the important file to me was the CSV. I have all the annotations in there, so no worries.

The temp file approach is a bit unexpected.

Why don't you store all events in a database and then give the loggers access to it for exporting when recording stops, instead of writing directly into the files? That could also be more energy efficient during recording. Somewhere (Play Store?) I read that the NMEA logger gets really sluggish after recording for a long time, so maybe that logger has to read the entire file each time (I don't know about FileWriter internals in append mode). A database would also allow logging more than one annotation per location fetch.

I'm logging about 4 samples per second (1x GPS, 3x Bluetooth devices) into a database, and a one-hour ride causes absolutely no issues. Each sample is one row in a table with the columns "session-id", "timestamp", "type" and "json". The type is something like "gps", "ble" or "inf", where "inf" refers to JSON events such as "start", "stop", "annotation" and "ble-device-(dis)?connected". The json column holds the sample as a JSON string of approx. 110 characters with lots of metadata, like source, timestamp and timezone offset. I store JSON because it is ultimately my target format, but GPSLogger could do this much, much more efficiently because it knows exactly which fields will get stored. An exported JSON file containing all the logged data has about 12000 items and is 1.3 MB in size. To put that into perspective, I'm also recording an OGG file of the ride that ends up being 28 MB. The battery goes from 100% to 70% in that hour, all while my app also attempts to keep a websocket connection open to my server and pushes the Bluetooth device data (heart rate sensor, cadence and speed sensor, temperature + humidity + brightness sensor), but not the GPS data, over this link. So that is GPSLogger + my app (GPS + BLE + WebSocket) + audio.
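The table layout described above could be sketched with SQLite as follows. This is only an illustration of the idea, not code from either app: the table name `samples`, the function names, and the underscore in `session_id` (in place of "session-id") are assumptions.

```python
import json
import sqlite3


def open_log(db_path=":memory:"):
    """Open the database and create the samples table if it does not exist yet."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        " session_id TEXT NOT NULL,"
        " timestamp TEXT NOT NULL,"
        " type TEXT NOT NULL,"  # e.g. "gps", "ble" or "inf"
        " json TEXT NOT NULL)"
    )
    return conn


def log_sample(conn, session_id, timestamp, sample_type, payload):
    """Store one sample as a row, with the payload serialized to a JSON string."""
    conn.execute(
        "INSERT INTO samples VALUES (?, ?, ?, ?)",
        (session_id, timestamp, sample_type, json.dumps(payload)),
    )
    conn.commit()


def export_session(conn, session_id):
    """Return all samples of a session in insertion order; rows stay in the database."""
    rows = conn.execute(
        "SELECT timestamp, type, json FROM samples"
        " WHERE session_id = ? ORDER BY rowid",
        (session_id,),
    )
    return [{"timestamp": ts, "type": kind, **json.loads(blob)} for ts, kind, blob in rows]
```

Since `export_session` only reads, it can be called while recording is still in progress, and several annotations for the same location are simply several rows of type "inf".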

GPSLogger is kind of a backup for me now in case my app crashes, so that I have at least some data.

When I record only with GPSLogger, the battery goes from 100% to 85-80% (1 second interval, storing into CSV and KML). My app then also keeps a websocket connection open, but tries to push data over it only once a minute, versus 3 times a second when streaming Bluetooth data (which I'm doing only for experimenting).

I am really satisfied with the database performance. With the great architecture GPSLogger has (EventBus, which I didn't know existed before I looked at GPSLogger's code), it would be trivial to start implementing this, experiment with it, and replace the current writers once it is all done.

The only drawback is that you actually have to start an export in order to generate the files, so you can't just pull a copy of a CSV file while still recording. On the other hand, you can start an export while recording and keep the data in the database until the recording actually stops.

jesperdj commented 7 years ago

I have the same bug occurring on my Nexus 6P.

I log in two formats: gpx and kml. During the day I'm travelling and sometimes I add a placemark. In the evening I upload my gpx and kml files to Google Drive and then I notice that my kml file is corrupt in exactly the same way as danielfaust describes: there are wrong, half XML tags somewhere in the middle of the file.

I hope you can release a fix for this soon.

mendhak commented 7 years ago

v89 has been released to the market and should appear in a few hours.

Yes, I've thought of the DB aspect before. It does allow for extra new features too, but I don't want to go down that route. Introducing it still won't help much with battery saving, or rather that comparison goes away, since the app would have to export to file frequently anyway to cater to uploaders and people who sync the folder themselves, so the app would then be writing to both the DB and disk. I think of it less from a file point of view and more from a release point of view: a database makes upgrades and rollbacks much more difficult, especially with schema changes.

Except for KML, the other file writes tend to be pretty normal: the code inserts at a specific position and writes the file's closing strings, and they work well with or without annotations. However, KMLs are a bit... special... and the resulting structure isn't conducive to annotations.

Anyway, I'm closing this now, but reopen it if the KML problem is still around.

mendhak commented 7 years ago

--- Closing as part of issues cleanup, market release, inactivity or inevitability. Issue can be reopened if needed, or a new issue created.