w3c / sdw

Repository for the Spatial Data on the Web Working Group
https://www.w3.org/2020/sdw/

Spatial Data on the Web - Best Practice 2 - Value and approach for indexing individual Spatial Things #1085

Open MichaelGordon opened 5 years ago

MichaelGordon commented 5 years ago

Both the GNAF Best Practice implementation report and the NRW Best Practice implementation report encountered an issue with Best Practice 2: Make your spatial data indexable by search engines. Each dataset contains millions of Spatial Things. The implementations create machine-readable, indexable data for each Spatial Thing, but the pagination used to make the Spatial Things navigable for humans seems to impair indexability.

In discussions in Lyon there were also some questions about demonstrating the value of making each individual Spatial Thing indexable. However, I believe there was reasonable consensus that there is clear value in allowing users to find a specific Spatial Thing and in allowing machines to create links between datasets containing information about specific Spatial Things. That said, Best Practice 2 could better describe the value of linking between Spatial Things, or between data about the same Spatial Thing.

I believe @cportele is going to look into the sitemaps approach discussed in Best Practice 2. We may also benefit from input on this issue from Ed Parsons.
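To make the sitemaps idea concrete, here is a minimal sketch of how a dataset with millions of Spatial Things could be exposed to crawlers without relying on paginated HTML. It assumes the sitemaps.org protocol limit of 50,000 URLs per sitemap file and groups the files under one sitemap index. The function names, file names, and `example.org` URIs are hypothetical and are not taken from the GNAF or NRW implementations.

```python
from xml.sax.saxutils import escape

# Per-file URL limit defined by the sitemaps.org protocol.
MAX_URLS_PER_SITEMAP = 50_000
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunked(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_sitemap(urls):
    """Return the XML for one sitemap file listing the given URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n'
        f"{entries}</urlset>\n"
    )


def build_sitemap_index(sitemap_urls):
    """Return the XML for a sitemap index pointing at each sitemap file."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n'
        f"{entries}</sitemapindex>\n"
    )


def build_all(thing_uris, base_url):
    """Split the URIs across sitemap files; return (index_xml, files).

    `files` maps a hypothetical file name such as "sitemap-1.xml" to its XML.
    """
    files = {}
    for i, chunk in enumerate(chunked(thing_uris, MAX_URLS_PER_SITEMAP), start=1):
        files[f"sitemap-{i}.xml"] = build_sitemap(chunk)
    index_xml = build_sitemap_index([f"{base_url}/{name}" for name in files])
    return index_xml, files
```

With this layout a crawler only needs the index URL to reach every Spatial Thing page directly, so the human-oriented pagination no longer determines what can be discovered. Whether search engines then actually index all of those pages is a separate question that this sketch does not address.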

cportele commented 5 years ago

INSPIRE has held a workshop on making spatial data discoverable through mainstream search engines.

The linked page contains the program and, for each contribution, the abstract and the slides. For the discussion slots, the slides also contain a summary of the discussion. There are also slides capturing the final discussion (conclusion and next steps).

Here is a summary from my personal notes (I have excluded aspects that are related to the usual metadata/catalog issues):

The workshop gave a helpful overview and had good discussions. There were no really surprising results, but there were new insights on details, and several activities related to the topic are under way. It would be useful to monitor their progress and results, which could also be input to future updates of BP2.

Most of the following is already in BPs 1 and 2, but there are also some new aspects.

Stable identifiers:

Good web pages - which content?

Indexing by search engines:

marqh commented 5 years ago

Hello @cportele

I am interested in:

* Identifiers should be HTTPS URIs

and encouraging a bit more information/discussion about it, if I may. Is this ticket an appropriate place for this?

There is widespread use of HTTP identifiers. Many existing vocabularies are published as HTTP URIs.
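A small sketch of why this matters for linking: as plain strings (and under RDF's IRI equality, which is simple string comparison), an `http://` URI and its `https://` counterpart are different identifiers. The helper below compares two URIs while ignoring that one difference. Treating the two schemes as naming the same thing is an assumption made here for illustration only, not something the Best Practice states, and the `example.org` URIs and function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def scheme_insensitive_key(uri):
    """Return a comparison key that treats http and https as the same scheme.

    Assumption for illustration: the publisher serves the same resource
    under both schemes. Nothing in the URI itself guarantees this.
    """
    parts = urlsplit(uri)
    scheme = "https" if parts.scheme in ("http", "https") else parts.scheme
    return urlunsplit(
        (scheme, parts.netloc.lower(), parts.path, parts.query, parts.fragment)
    )


def same_thing(uri_a, uri_b):
    """True if the two URIs differ at most in http vs https and host case."""
    return scheme_insensitive_key(uri_a) == scheme_insensitive_key(uri_b)
```

A consumer that does not apply some rule like this will see existing HTTP vocabulary terms and newly minted HTTPS identifiers as unrelated, which is one reason a blanket "use HTTPS" statement may need qualification.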

A cursory search reveals some fairly recent debate on the issue, such as

I wonder whether this issue has a degree of nuance which is not well covered by:

Identifiers should be HTTPS URIs

e.g.:

If the target for this statement is a Best Practice guide, I wonder whether some nuanced commentary would be really valuable?

Thank you, marqh

cportele commented 5 years ago

@marqh

Yes, there are nuances not expressed in the statement; this was just from my notes. Let me add the following: