GPSBabel / gpsbabel

GPSBabel: convert, manipulate, and transfer data from GPS programs or GPS receivers. Open Source and supported on MacOS, Windows, Linux, and more. Pointy clicky GUI or a command line version...
https://www.gpsbabel.org
GNU General Public License v2.0

gpx.h GEOTAG searches for elements that imply a gpx schema violation #1195

Closed tsteven4 closed 10 months ago

tsteven4 commented 10 months ago

Back in 2007 a couple of commits attempted to add different ways to find geocaches in the gpx reader:

  1. e8cc8f6f2 12/6/2007 /gpx/wpt/geocache added 'Ralf Dragon adds support for opencaching.de GPX reads'
  2. 0ac4917ea 2/25/2007 /gpx/wpt/extensions/cache added 'Allow groundspeak namespace in GPX 1.1 or 1.0 format'

Neither of these has a test case.

The current definition of GEOTAG is https://github.com/GPSBabel/gpsbabel/blob/1aa1554c03158d42346ae7adf600ede41a382a4f/gpx.h#L328-L331

{"/gpx/wpt/extensions/cache/" name, {type, true}}, \ implies a schema violation in either gpx 1.0 (extensions element not expected) or gpx 1.1 (cache element not expected).

{"/gpx/wpt/geocache/" name, {type, true}} /* opencaching.de */ implies a schema violation of either gpx 1.0 (geocache element not expected) or gpx 1.1 (geocache element not expected).

The gpx 1.0 and 1.1 schemas use <xsd:any namespace="##other" ... The issue is that the ##other restriction is violated.
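To make the ##other point concrete, here is a minimal sketch of the rule as XML Schema 1.0 defines it: a wildcard declared with namespace="##other" accepts only elements that are in some namespace, and that namespace must differ from the schema's target namespace. The helper name satisfiesOtherWildcard is hypothetical and is not part of GPSBabel. The sketch also assumes an unprefixed geocache or cache element either inherits the GPX default namespace or has no namespace at all, which is what the paths in GEOTAG suggest but a given file could contradict by redeclaring the default namespace.

```cpp
#include <cassert>
#include <string>

// Target namespaces of the published GPX 1.0 and GPX 1.1 schemas.
const std::string kGpx10Namespace = "http://www.topografix.com/GPX/1/0";
const std::string kGpx11Namespace = "http://www.topografix.com/GPX/1/1";

// Hypothetical helper (not a GPSBabel function): returns true when an element
// in namespace `elementNamespace` is acceptable to an
// <xsd:any namespace="##other"> wildcard in a schema whose targetNamespace is
// `targetNamespace`. An empty string stands for "no namespace".
bool satisfiesOtherWildcard(const std::string& elementNamespace,
                            const std::string& targetNamespace)
{
  assert(!targetNamespace.empty());
  // ##other excludes both unqualified elements and the target namespace itself.
  return !elementNamespace.empty() && elementNamespace != targetNamespace;
}
```

Under those assumptions, satisfiesOtherWildcard(kGpx10Namespace, kGpx10Namespace) is false (an unprefixed geocache element inheriting the GPX namespace), while an element in the Groundspeak namespace http://www.groundspeak.com/cache/1/0 passes.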

Today opencaching.de uses gpx 1.0 with groundspeak:cache and oc:cache elements (xmlns:oc="https://github.com/opencaching/gpx-extension-v1"). I don't have any recent examples of caches from geocaching.com, so it is unclear what they are doing now. I don't know that any creator ever used the groundspeak namespace in gpx 1.1.

A test script, log and reference files: test_files.zip (https://github.com/GPSBabel/gpsbabel/files/13114471/test_files.zip)

robertlipe commented 10 months ago

I'm hopefully about to fall asleep, but...

It looks like it's possible to order up samples without licensing issues via something like: https://www.opencaching.de/search.php?showresult=1&expert=0&output=HTML&utf8=1&sort=byname&orderRatingFirst=0&f_userowner=0&f_userfound=0&f_inactive=1&f_disabled=0&f_ignored=1&f_otherPlatforms=0&f_geokrets=0&country=&language=&cachetype=&cachesize=&difficultymin=0&difficultymax=0&terrainmin=0&terrainmax=0&cache_attribs=&cache_attribs_not=7&distance=75&unit=km&ortplz=&waypoint=&searchto=searchbycoords&latNS=N&lat_h=00&lat_min=00.000&lonEW=E&lon_h=000&lon_min=00.000&submit_dist=Search

Groundspeak (the parent company of geocaching.com) remains firmly in GPX 1.0. (They were actually one of the original instigators that worked with Dan on this and I came in after they started sketching GPX.) They're the reason we've defaulted to 1.0 for so long, but it's probably time to default to GPX 1.1 unless we see 1.0 on input, essentially flipping our decision tree as of now. So if you start with multiple GPX 1.0 (w/ groundspeak extensions) you should end with GPX 1.0 with GS extension, but maybe unicsv -> GPX or whatever should go to GPX 1.1 these days. (ISTR either talking about this or meaning to talk to you about this recently and it's been pretty recent that I've flipped on this.)

There was a time in the mid-2000s when the world hated Groundspeak and everyone started their own geocaching sites. Most are now forgotten. (https://navicache.com/, opencaching.com, terrache.com, geocacheuk.com, etc.) It might be nice if we can snag difficulty, terrain, etc. out of an opencaching.de file so we can convert it to unicsv, for example, but if we don't and everything just dumps out to the passthrough extensions, I'm not sure that makes me sad.

It's a market we've not heard a peep from since that commit.

The groundspeak extensions are against 1.0. I think that 1.1 + Groundspeak extensions is officially the null set. Maybe the .de camp modified the groundspeak extensions (there's a pretty obvious mechanical transformation possible) to fit the 1.1 accent; I don't know. I think that some of the rebels DID use a GPX 1.1 + "obvious" translations, but I don't know if any survived the "oh, crap, this isn't fun and we need money" phase of the splinter camps nor if we should care.

LMK the breadth of coverage we need from a contemporary GPX from Groundspeak and I can order one up, greeking the text and coords if necessary for license compliance. I could probably use my own placements or from those of my prolific placer friends and get around that issue if needed. We just need one or two of contemporary server vintage, right?

tsteven4 commented 10 months ago

I was able to download a few recent gpx caches from opencaching.de without agreeing to anything. We may be able to use those as reference files.

If you can get a few current gpx caches from groundspeak that we can use as references that would be very good.

tsteven4 commented 10 months ago

"It might be nice if we can snag difficulty, terrain, etc. out of an opencaching.de file so we can convert it to unicsv"

Without any changes:

gpsbabel -i gpx -f OC15F8D_10.gpx -o unicsv -F -
No,Latitude,Longitude,Name,Description,Symbol,Date,Time,URL,GCID,Type,Container,Terrain,Difficulty,Archived,Available,Last Found,Placer,Placer ID,Hint
1,50.174620,8.687800,"OC15F8D","TB Hotel Storchenwiese","Geocache",2020/04/13,18:00:00,"https://www.opencaching.de/viewcache.php?cacheid=193085",193085,"Traditional Cache","Regular",1.5,1.5,"False","True","2022/12/03 17:00:00","Bullitt25",361790,"Andere Seite, Hecke unten große Dose."

tsteven4 commented 10 months ago

Regarding the groundspeak extension(s), adding support for gpx 1.1 requires multiple changes to the gpx reader hash table; there are a bunch of places where the groundspeak namespace prefix is missing.

Regarding opencaching.de, the elements geocache and hints don't seem to be listed in any xsd they use, and don't appear in the samples I looked at today. They also say "As far as we know, all Geocaching sites export their GPX files in Topografix 1.0 format." on https://github.com/opencaching/gpx-extension-v1/blob/master/all-these-namespaces.md.

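@robertlipe if you can get some recent vintage samples from geocaching.com it would be good to

  • verify they still use gpx 1.0
  • if possible, add a few as references, perhaps replacing our ancient references (that have been manually altered).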

The way this is going I think we should drop support for gpx 1.1 and geocaches. I have my doubts that it ever worked, and it doesn't work now even with valid manually converted gpx 1.1 caches. It is interesting that the comments in xsd files often cite a particular version of gpx, but I don't see any enforcement of that in the schemas (e.g. https://github.com/opencaching/gpx-extension-v1, which we don't support, and http://www.garmin.com/xmlschemas/GpxExtensions/v3, which we do).

GPSBabelDeveloper commented 10 months ago

As expected, you are wise.

I thought I typed a response to this earlier, but don't see it.

To my knowledge, the intersection of GPX 1.1 and Geocaching (sites that I care about that have made themselves known to us at all) is the null set. Geocaching.com uses GPX 1.0 and their extensions are GPX 1.0. Because the geocaching files are read by millions (?) of GPS receivers in the market that'll never get a firmware update - and because of past upheavals even when Groundspeak made 'safe' changes - they are extremely conservative about changing those files.

It's a sucker bet to wager what someone else will do, but given how their pages https://www.geocaching.com/software/default.aspx still speak highly of Palm/OS, Blackberry, and WinCE apps and knowing the huge mess they had on their hands in the past when they updated PQs - both when they weren't great with compatibility and when they did all the right things, but readers that were in device firmware couldn't cope - I just don't see them moving to GPX 1.1 because it gains them nothing and costs them plenty because they can't ever really drop GPX 1.0. I remember they once added a new icon type (groundspeak:type) for an event that happened once a year and because old receivers handled it so badly they had to revert that change.

Trivia: at one point, there were three independent GPX writers at Groundspeak: one for pocket queries (the bulk "give me a thousand" mode), one for downloading the single page you're looking at, and one for the mobile-centric API that's quite garden walled.

That OC link is interesting. Their view of history is, well, it's one interpretation. :-) The Opencaching network has more traction in EU than it does in NA, but it's interesting that we just don't hear much about them - and we don't make their list of software, either. {shrug}

I just proposed https://github.com/GPSBabel/gpsbabel/pull/1196 (I'm not married to the name.)

Just skimming the diffs:

  • It swaps available and archived. The cache is no longer available, so this is intentional.
  • It moves to "groundspeak 1.0.1". This makes more of the game attributes tri-states instead of bools.
  • Groundspeak no longer relies on the implicit 'Z' that was the subject of consternation in GPX (and the subject of an open bug report for us).
  • They no longer write sub-atomic precision for coordinates.
  • HTTPS is just how the web is done these days.
  • They've made changes to user-generated HTML in the tags (another tempest...) to reduce the amount of colloquial/broken HTML that the viewer (often a moderately underpowered ARM-class device) may have to parse and view.
  • They're a Windows house (Seattle company....) so they have DOS carriage returns.

So I don't think Geocaching.com GPX files have -really- changed for us materially in a Very Long Time.

tsteven4 commented 10 months ago

Should we just replace GCGCA8.gpx with the new version, and update the 6 changed output reference files?

robertlipe commented 10 months ago

I'll take it.

Should I minimize deltas (e.g. flip archived and available, roll back the logs) to reflect it as it would have been on the original date, but written by a modern writer, or just enshrine this for the next 20 years?

The other PR that was closed: I don't think it's a coincidence that it used the same HTML tidy as the site itself did.

tsteven4 commented 10 months ago

I found a source for xsd files that may be related to item 1 above. I am not sure any of these schemas are in use any more. They don't appear to be used by geocaching.com.au now. For example, a randomly selected file from https://geocaching.com.au/cache/tp13351.gpx uses gpx 1.0 and groundspeak cache 1.0.1.

https://geocaching.com.au/geocache/

  • https://geocaching.com.au/geocache/1/0/1/geocache.xsd uses the top level element name "geocache".
  • https://geocaching.com.au/geocache/1/1/geocache.xsd uses the top level element name "cache".
  • https://geocaching.com.au/geocache/1/geocache.xsd uses the top level element name "geocache".

In any event, this doesn't change my original point about implying a schema violation. But, the illegal geocache element without a namespace declaration does get parsed today, where a legal one with a namespace declaration does not.
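A rough sketch of why that happens, assuming the reader matches on a path built from each element's name exactly as spelled in the file (prefix:local). The thread suggests this but the sketch does not reproduce the actual GPSBabel reader; tagPath, isRecognized and the gca: prefix are hypothetical names used only for illustration. The three known paths are the shapes the current GEOTAG definition generates for a child element called "name".

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Joins element names, as spelled in the file, into a lookup path such as
// "/gpx/wpt/groundspeak:cache/groundspeak:name".
std::string tagPath(const std::vector<std::string>& qnames)
{
  std::string path;
  for (const auto& q : qnames) {
    assert(!q.empty());
    path += '/';
    path += q;
  }
  return path;
}

// Path shapes produced by the current GEOTAG definition for the child "name".
const std::set<std::string> kKnownPaths = {
  "/gpx/wpt/groundspeak:cache/groundspeak:name",
  "/gpx/wpt/extensions/cache/name",
  "/gpx/wpt/geocache/name",
};

// True when the element stack maps to one of the known paths.
bool isRecognized(const std::vector<std::string>& qnames)
{
  return kKnownPaths.count(tagPath(qnames)) > 0;
}
```

With this model, isRecognized({"gpx", "wpt", "geocache", "name"}) is true even though the unprefixed element implies a schema violation, while a prefixed and therefore schema-legal spelling such as {"gpx", "wpt", "gca:geocache", "gca:name"} is not found.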

GPSBabelDeveloper commented 10 months ago

It's possible that all (most of?) these offshoot sites decided to try to quit wagging the dog and get over their ideological objection to using groundspeak: so that software could actually use their files.

If the proposal is to replace

    #define GEOTAG(type,name) \
      {"/gpx/wpt/groundspeak:cache/groundspeak:" name, {type, true}}, \
      {"/gpx/wpt/extensions/cache/" name, {type, true}}, \
      {"/gpx/wpt/geocache/" name, {type, true}} /* opencaching.de */

with

    #define GEOTAG(type,name) \
      {"/gpx/wpt/groundspeak:cache/groundspeak:" name, {type, true}}

and accept only groundspeak: extensions and thus files that are GPX 1.0 with Groundspeak Geocaching extensions 1.0.1 or 1.0.0, consider it agreed upon. That passes testo, but we don't claim to have samples for all the geocaching spinoffs.
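For readers who want to see the effect of the two GEOTAG variants side by side, here is a small self-contained sketch. The macro bodies are copied from the thread, but everything around them is a simplified stand-in: the TagType, TagInfo and TagEntry types, the flag field (whose real meaning is not modeled here), the GEOTAG_CURRENT / GEOTAG_PROPOSED names and findTag are all hypothetical, and the real gpx.h table uses different types.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified, hypothetical stand-ins for the reader's table types.
enum TagType { tt_cache_name, tt_cache_difficulty };
struct TagInfo { TagType type; bool flag; };
struct TagEntry { const char* path; TagInfo info; };

// Current definition: one entry per accepted path shape (three in total).
#define GEOTAG_CURRENT(type, name) \
  {"/gpx/wpt/groundspeak:cache/groundspeak:" name, {type, true}}, \
  {"/gpx/wpt/extensions/cache/" name, {type, true}}, \
  {"/gpx/wpt/geocache/" name, {type, true}} /* opencaching.de */

// Proposed definition: only the groundspeak: path remains.
#define GEOTAG_PROPOSED(type, name) \
  {"/gpx/wpt/groundspeak:cache/groundspeak:" name, {type, true}}

// Adjacent string literals are concatenated, so each use expands to full paths.
const std::vector<TagEntry> kCurrentTable = {
  GEOTAG_CURRENT(tt_cache_name, "name"),
  GEOTAG_CURRENT(tt_cache_difficulty, "difficulty"),
};
const std::vector<TagEntry> kProposedTable = {
  GEOTAG_PROPOSED(tt_cache_name, "name"),
  GEOTAG_PROPOSED(tt_cache_difficulty, "difficulty"),
};

// Linear lookup of a path; returns nullptr when the path is not in the table.
const TagEntry* findTag(const std::vector<TagEntry>& table, const std::string& path)
{
  assert(!path.empty());
  for (const auto& entry : table) {
    if (path == entry.path) {
      return &entry;
    }
  }
  return nullptr;
}
```

In this sketch the current table holds three entries per tag and the proposed table one, so a lookup of "/gpx/wpt/geocache/name" succeeds only against kCurrentTable, while the groundspeak: paths resolve in both.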
