steve8x8 / geotoad

Geocaching query tool written in Ruby

gpx file broken - by "Emoji" #266

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
[Please include the relevant parts of your command line, if applicable.
Don't send your password though!]
1. Search for caches within a distance, with output set to GPX
2. Transfer to etrex20
3. Not all caches are displayed
4. Transfer to Garmin Basecamp
5. Garmin Basecamp reports a broken file

What is the expected output? What do you see instead?
[If possible, include the last ~10--20 lines of verbose output.]

What version of the product are you using? On what operating system?
[Did you check you're using the latest version?]
Using the latest featured version on WinXP with Garmin Basecamp 4.1.2 and a 
connected etrex20.
Stepping back to version 3.15 solved the problem.

Please provide any additional information below.

Original issue reported on code.google.com by dirk.spi...@gmx.de on 6 Apr 2013 at 2:42

GoogleCodeExporter commented 9 years ago
Unable to reproduce.

- No idea of the command line you used.
- Which caches are missing? Only higher cache IDs?
- There is no single version 3.15 - moreover, 3.15 is a completely different 
thread.
- Have you passed the GPX files through gpsbabel for verification? If an error 
is found, which WID is affected? Can you reproduce the problem with a single 
WID query?
- What are the differences between the GPX files written by the two GeoToad 
versions?

Original comment by Steve8x8 on 8 Apr 2013 at 8:54

GoogleCodeExporter commented 9 years ago
"gpx file broken"
Same problem after updating to 3.16.5,
using the same settings as before.
The GPX file no longer works in EasyGPS as it did before.

Original comment by MrNorm...@gmail.com on 10 Apr 2013 at 9:14

GoogleCodeExporter commented 9 years ago
Still unable to reproduce with the information (or rather, lack thereof) you 
are willing to give, about both the error and what you've been doing.
If you were me, what would you do next?

Original comment by Steve8x8 on 11 Apr 2013 at 2:02

GoogleCodeExporter commented 9 years ago
Sorry for the late reply.
I will provide more information this coming weekend.
Maybe I can send the GPX files.

Original comment by dirk.spi...@gmx.de on 11 Apr 2013 at 5:15

GoogleCodeExporter commented 9 years ago
I attached the GPX file generated with version 3.16.5, which works neither 
with Garmin Basecamp nor with GPSBabel.
I did not use any command line; I just started the geotoad program and chose the 
coordinates and distance.
BR
Dirk

Original comment by dirk.spi...@gmx.de on 13 Apr 2013 at 8:32

Attachments:

GoogleCodeExporter commented 9 years ago
and this is the file using geotoad 3.16.0 which is usable

Original comment by dirk.spi...@gmx.de on 13 Apr 2013 at 8:44

Attachments:

GoogleCodeExporter commented 9 years ago
EasyGPS 4.18 will not open the file 5Days35mora.gpx created with Geotoad 
3.16.5

Error text:
The Open command could not be completed.  The file "5Days35mora.gpx" could not 
be opened.  This XML file contains one or more errors.
C:\Dokumente und Einstellungen\Martin\Desktop\Geotoad_caches\5Days35mora.gpx
      <groundspeak:text encoded="False">Under 

BUT EasyGPS 4.18 opens the file "5Days35mora2.gpx" without any error

Original comment by MrNorm...@gmail.com on 15 Apr 2013 at 10:58

Attachments:

GoogleCodeExporter commented 9 years ago
5Days35mora2.gpx was created with geotoad 3.16.4

Original comment by MrNorm...@gmail.com on 15 Apr 2013 at 11:01

GoogleCodeExporter commented 9 years ago
Passing the files through gpsbabel reveals the basic cause of the problem:

5Days35mora2.gpx:  OK
5Days35mora.gpx: GPX: XML parse error at line 64 of '5Days35mora.gpx' : 
reference to invalid character number
      <groundspeak:text encoded="False">Under Sveriges dövas sportfiskeförbunds årsstämma då loggades den ��</groundspeak:text>
gt_Rehrstieg_52_21147_Hamburg-y5.0.gpx:  OK
gt_Rehrstieg_52_21147_Hamburg-y5.0.gpx.BAD: GPX: XML parse error at line 15330 
of 'gt_Rehrstieg_52_21147_Hamburg-y5.0.gpx.BAD' : reference to invalid 
character number
      <groundspeak:text encoded="False">TFTC��</groundspeak:text>

&#xD83D;&#xDExx; are "UTF-16 surrogates" for so-called "Emoji" characters - for details 
see Issue 262. UTF-16 support has been around since early March...
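
To locate such references without gpsbabel, a small scan along the following lines may help. This is only a sketch, not GeoToad code: the method name `surrogate_refs` and the sample string are made up for illustration. It reports every numeric character reference, hex or decimal, whose value lies in the surrogate range U+D800..U+DFFF, which XML 1.0 does not allow as a character.

```ruby
# Matches numeric character references, hex (&#xD83D;) or decimal (&#55357;).
CHAR_REF = /&#(?:x([0-9a-f]+)|(\d+));/i

# Hypothetical helper: returns [line_number, reference, code_point] for each
# reference whose value is a UTF-16 surrogate (U+D800..U+DFFF).
def surrogate_refs(text)
  hits = []
  text.each_line.with_index(1) do |line, lineno|
    line.scan(CHAR_REF) do |hex, dec|
      code = hex ? hex.to_i(16) : dec.to_i
      hits << [lineno, Regexp.last_match(0), code] if (0xD800..0xDFFF).cover?(code)
    end
  end
  hits
end

# Made-up sample resembling the failing log entries.
sample = "<groundspeak:text encoded=\"False\">TFTC &#xD83D;&#xDE09;</groundspeak:text>\n"
surrogate_refs(sample).each do |lineno, ref, code|
  puts format('line %d: %s (U+%04X)', lineno, ref, code)
end
```

Running it over the text of a GPX file (for example the result of `File.read`) should list the same lines that gpsbabel complains about.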

What surprises me is that Issue 262 was first reported *after* the release 
of 3.16.5, and there's no obvious code change between 3.16.4 and 3.16.5 related 
to this. 
Also the huge difference in file sizes isn't explained... too bad there seems 
to be no "verbose" output.

I will try to reproduce the issue on my Linux machine (now that I have an idea 
of the parameters used - even with the TUI, there's a full command line being 
reported right before running the query - look for the line following the text 
"To use this query in the future, type:". Makes things a lot easier.) Since 
Windows in some contexts behaves differently, I may fail.

Can both of you in the meantime give 3.17.7 a go? It has some Emoji stuff in, 
and for Windows comes with a newer Ruby version.

Original comment by Steve8x8 on 17 Apr 2013 at 8:24

GoogleCodeExporter commented 9 years ago
A fix (same as for Issue 262 by the way) has been committed to trunk, and will 
be in the upcoming 3.16.6 release which is planned for next week.
As a temporary workaround, one may edit the GPX files affected, e.g. 
    sed -e 's~\&#xD8..;\&#x....;~(*)~ig' <old.gpx >new.gpx
    (tested with Linux, one might even use the -i flag to edit in-place)
to replace the offending entities with something innocent.
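
For Windows users without sed, roughly the same replacement can be done with the Ruby interpreter that GeoToad needs anyway. The following is a sketch under that assumption, not part of GeoToad; `strip_surrogate_pairs` is a made-up name, and the two file names are passed as arguments just as in the sed example.

```ruby
# Same idea as the sed call: a &#xD8..; reference followed by any
# four-character hex reference, matched case-insensitively.
SURROGATE_PAIR = /&#xD8..;&#x....;/i

# Hypothetical helper: replaces each offending pair with "(*)".
def strip_surrogate_pairs(text)
  text.gsub(SURROGATE_PAIR, '(*)')
end

# Usage (hypothetical script name): ruby strip_pairs.rb old.gpx new.gpx
if __FILE__ == $PROGRAM_NAME && ARGV.length == 2
  old_path, new_path = ARGV
  cleaned = strip_surrogate_pairs(File.read(old_path, encoding: 'UTF-8'))
  File.write(new_path, cleaned)
end
```

Like the sed pattern, this only catches references written in hex; decimal references would pass through unchanged.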

Original comment by Steve8x8 on 17 Apr 2013 at 10:25

GoogleCodeExporter commented 9 years ago
I'm seeing other encodings in my recent GPX files that make gpsbabel fail:

<groundspeak:text encoded="False">Gefunden bei Eiseskälte und trüber Sicht. 
��</groundspeak:text>

From: http://coord.info/GCRQD5

<groundspeak:text encoded="False">Sehr schöner Weg entlang der Weser, vielen 
Dank für den tollen Cache!��</groundspeak:text>

From: http://coord.info/GC3C3MZ

Look at the last two chars. They don't even render correctly in my Firefox/Win 
on the geocaching.com page.

Original comment by magic...@gmail.com on 17 Apr 2013 at 11:56

GoogleCodeExporter commented 9 years ago
Oh my goodness. If there are different representations of the same thing 
around, people will use all of them, and invent an additional one.
Of course, &#55357; is the decimal representation of &#xD83D; (and 56833 maps to xDE01 as 
well as 56841 -> xDE09) - I cannot view all of them, but their (3-byte) UTF-8 
counterparts are well defined. Nevertheless, there's a reason why the deemoji() 
function has a "soft" and a "hard" mode...

Can you check whether a single line just before "if soft" (in 
output.rb:deemoji()) fixes the problem for now? It's

    text.gsub!(/(\&#(\d+);)/) { ($2.to_i < 32768) ? $1 : ('&#x' + $2.to_i.to_s(16).upcase + ';') }

it's terrible, it's almost unreadable, but it seems to do the job...
Got to find a better place for those entity-to-entity conversions though.
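
To see what the one-line `gsub!` above does, here is a more verbose sketch of the same logic. The method name `decimal_refs_to_hex` and the sample string are made up for illustration; the threshold 32768 and the upper-case hex output follow the original line.

```ruby
# Rewrites decimal character references >= 32768 as upper-case hex
# references; smaller ones (e.g. &#228; for "ä") are left untouched.
def decimal_refs_to_hex(text)
  text.gsub(/&#(\d+);/) do
    value = Regexp.last_match(1).to_i
    value < 32768 ? Regexp.last_match(0) : format('&#x%X;', value)
  end
end

puts decimal_refs_to_hex('k&#228;lte &#55357;&#56833;')
# prints: k&#228;lte &#xD83D;&#xDE01;
```

Once the references are in hex form, the existing hex handling in deemoji() can presumably deal with them, which is why the line goes just before "if soft".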

Original comment by Steve8x8 on 17 Apr 2013 at 2:20

GoogleCodeExporter commented 9 years ago
Yepp, that line seems to work for now.

Original comment by magic...@gmail.com on 17 Apr 2013 at 5:14

GoogleCodeExporter commented 9 years ago
Thanks for confirmation. Your tests are greatly appreciated!

I found no better place for this fix, but using the momentum, I replaced the 
hex translations with a general formula (taken from 
http://www.unicode.org/faq/utf_bom.html). Rev 1318 should fix the whole 
issue(s, as Issue 266 will be merged with Issue 262).
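
For reference, the surrogate-to-code-point formula from that Unicode FAQ page can be sketched in Ruby as below. This is not the code from Rev 1318; the method names are made up, and whether GeoToad ends up emitting a single character reference, UTF-8 text, or a placeholder is not stated in this thread.

```ruby
# Constant from the Unicode FAQ's UTF-16 conversion formula.
SURROGATE_OFFSET = 0x10000 - (0xD800 << 10) - 0xDC00

# Combines a lead (high) and trail (low) surrogate into one code point.
def combine_surrogates(lead, trail)
  (lead << 10) + trail + SURROGATE_OFFSET
end

# Hypothetical helper: replaces each hex lead/trail reference pair with a
# single reference to the combined code point, which XML accepts.
def merge_surrogate_refs(text)
  text.gsub(/&#x(D[89AB][0-9A-F]{2});&#x(D[C-F][0-9A-F]{2});/i) do
    lead = Regexp.last_match(1).to_i(16)
    trail = Regexp.last_match(2).to_i(16)
    format('&#x%X;', combine_surrogates(lead, trail))
  end
end

puts merge_surrogate_refs('TFTC &#xD83D;&#xDE09;')
# prints: TFTC &#x1F609;
```

A general formula like this covers every lead/trail combination, so no per-character translation table is needed.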

Original comment by Steve8x8 on 18 Apr 2013 at 1:10

GoogleCodeExporter commented 9 years ago
Merging with Issue 262...

Original comment by Steve8x8 on 18 Apr 2013 at 1:11

GoogleCodeExporter commented 9 years ago
Large attachments removed

Original comment by Steve8x8 on 19 Sep 2013 at 11:02