w3c / sdw

Repository for the Spatial Data on the Web Working Group
https://www.w3.org/2020/sdw/
Should we reference the paradox of the Ship of Theseus to highlight there is no rigorous notion of persistent identity? #194

Closed by 6a6d74 7 years ago

6a6d74 commented 8 years ago

Relating to Best Practice 4: Provide stable identifiers for Things (resources) that change over time, Kerry notes in her email:

I wonder whether some reference to the paradox of the Ship of Theseus would be useful here just to highlight the fact that there is no really rigorous notion of persistent identity.

PeterParslow commented 8 years ago

And more subtly, different communities will have different criteria to determine whether a 'thing' is still the same thing, or has changed 'in its essence' - I offer my 2007 example, originally for the IHO: "Lifecycles of a drilling platform" https://docs.google.com/document/d/1Y_-Ens_41Go3T_9VRsPJleiO_g02ABijMLDhTOjFGhA/edit?usp=sharing

liamquin commented 8 years ago

On Wed, 2016-01-06 at 05:45 -0800, Jeremy Tandy wrote:

Relating to Best Practice 4: Provide stable identifiers for Things (resources) that change over time, Kerry notes in her email:

I wonder whether some reference to the paradox of the Ship of Theseus would be useful here just to highlight the fact that there is no really rigorous notion of persistent identity.

You cannot prove the absence of a thing by reference to a counter-example. A glass of milk does not prove there is no such thing as beer. That a glass of milk cannot be preserved in the sunlight for more than a few hours does not prove there is no refrigeration.

Likewise, you cannot prove the absence of a notion or concept, since in the process of defining it sufficiently to show it does not exist you have created it. You can only show that you yourself are not able to define it, and not that no-one else can.

Having said that, I agree that "identity" is a particularly difficult concept and on the whole I side with Heraclitus, who would surely have said that it was not in fact the same Theseus, nor the same ship (we all agree it was not the same sail).

So it's perhaps better to say, "It is difficult to determine when two things are the same in all respects - we have to define exactly what we mean by same (identity), and to understand that sameness really only makes sense for a particular set of measurements or properties in a particular context."

Liam

Liam R. E. Quin liam@w3.org The World Wide Web Consortium (W3C)

dr-shorthair commented 8 years ago

Do we add the 'Rig of Parslow' to the 'Ship of Theseus' in our mythology?

dr-shorthair commented 8 years ago

"It is difficult to determine when two things are the same in all respects"

But that is overshooting. What matters is if they are the same for the purpose under consideration.

PeterParslow commented 8 years ago

Hence the conclusion in the UK's Digital National Framework that the essential policy is that data publishers explicitly publish their life cycle rules. 'Consumers', i.e. people creating links to those objects, can then decide whether to link to the object's persistent identifier, to a particular version of it (& perhaps get notified when new versions are created), or not to link to it at all - it's too different a thing for the purpose in hand. A data publisher should not attempt to guess all the purposes for which someone might like to use/reference their data. But someone creating a link should do so thoughtfully & with relevant information to hand. (Unless, I suppose, their particular use has a low quality threshold)
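As a rough illustration of the three choices described above (link to the persistent identifier, link to a particular version, or do not link at all), here is a minimal Python sketch. The class `PublishedThing`, the function `choose_link`, the `/version/` URI pattern and the `example.org` URIs are all assumptions made for illustration; none of them comes from the Digital National Framework.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PublishedThing:
    persistent_uri: str         # identifies the thing across all of its versions
    versions: Tuple[str, ...]   # version labels, oldest first
    lifecycle_rules: str        # publisher's statement of what counts as "the same thing"

    def version_uri(self, version: str) -> str:
        # Hypothetical URI pattern; a real publisher would define its own.
        if version not in self.versions:
            raise ValueError(f"unknown version: {version}")
        return f"{self.persistent_uri}/version/{version}"


def choose_link(thing: PublishedThing, same_for_my_purpose: bool,
                pin_version: Optional[str] = None) -> Optional[str]:
    """Return the URI a consumer should link to, or None for 'do not link'."""
    if not same_for_my_purpose:
        return None                            # too different a thing for this purpose
    if pin_version is not None:
        return thing.version_uri(pin_version)  # link to one particular version
    return thing.persistent_uri                # link to the thing across versions


# Hypothetical published object with its life cycle rule stated up front.
rig = PublishedThing(
    persistent_uri="http://example.org/id/platform/42",
    versions=("2005", "2007"),
    lifecycle_rules="A new identifier is issued when the platform moves to a new site.",
)

print(choose_link(rig, same_for_my_purpose=True))
print(choose_link(rig, same_for_my_purpose=True, pin_version="2007"))
print(choose_link(rig, same_for_my_purpose=False))
```

The `same_for_my_purpose` flag is deliberately left to the caller: it stands for the judgement the consumer makes after reading the published life cycle rules.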

dr-shorthair commented 8 years ago

+1

6a6d74 commented 7 years ago

@PeterParslow - it's been a long time since you posted your feedback. It may be dusty, but it's not forgotten. Having read through your DNF document about the lifecycle of a drilling rig, I think that I've captured the concerns in rewritten sections of the BP doc:

13.3 Spatial data versioning (http://w3c.github.io/sdw/bp/#bp-dataversioning)

When dealing with change to a spatial thing, you should consider its lifecycle. For example, the extent of Derwent Water in the UK’s Lake District (http://data.os.uk/id/50kGazetteer/72167) may have varied as it is surveyed over the years, but it is still the same lake.

In contrast, Monmouthshire, UK, (http://data.os.uk/id/7000000000025489) provides a counterexample. Although the historic county of Monmouthshire (http://dbpedia.org/resource/Monmouthshire_(historic)) was formed back in 1535, what we currently refer to as Monmouthshire is a unitary authority that was created in 1996 by combining the districts of Monmouth and Llanelly. While there is a long-term historical precedent for a place of that name, the historic county and unitary authority of Monmouthshire and the district of Monmouth are all different spatial things. This is most obvious from looking at the terms used to describe each of them: historic county, unitary authority and district. Each of these spatial things should have its own HTTP URI and, even though the historic county is no longer used, it is important that we can still refer to it.

Essentially, the decision to assign a new identifier in response to change is often a data modelling choice. [DWBP] section 8.9 Data Vocabularies and section 13.5 Spatial Data Vocabularies provide further guidance on the topic of data modelling; determining which concepts and relationships should be used to describe your area of interest.

Best Practice 6: How to describe properties that change over time (http://w3c.github.io/sdw/bp/#desc-changing-properties)

Before deciding which approach to use, data publishers must decide how much change is acceptable before a spatial thing can no longer be considered the same resource. Consider Eddystone Lighthouse for example: the “Eddystone Light” has existed in (more or less) the same place on Eddystone Rocks since 1698. A single HTTP URI (http://dbpedia.org/resource/Eddystone_Lighthouse) is used to identify “the lighthouse on Eddystone rocks” for all that period. The lighthouse's characteristics have changed over that period and could be captured as snapshots. However, each of the four structures that have stood on that site, from Winstanley's Eddystone Lighthouse (the first incarnation) to Douglass' Eddystone Lighthouse (the 4th and current incarnation), is a different spatial thing. Treating these structures as incremental change to a single thing over the entire period from 1698 is not appropriate; one structure replaces another and so each structure should be assigned a unique identifier.
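One way to picture the Eddystone example is as two kinds of resource: the long-lived light, which keeps a single URI, and the successive structures, each of which gets an identifier of its own. The Python sketch below assumes this split. The `Light` and `Structure` classes and the `example.org` base URI are hypothetical, and only the first and fourth structures are named because those are the only ones named in the text.

```python
from dataclasses import dataclass, field
from typing import List

LIGHT_URI = "http://dbpedia.org/resource/Eddystone_Lighthouse"  # URI quoted in the text
STRUCTURE_BASE = "http://example.org/id/eddystone-structure"     # hypothetical base URI


@dataclass
class Structure:
    uri: str
    label: str


@dataclass
class Light:
    uri: str  # stays the same for the whole period since 1698
    structures: List[Structure] = field(default_factory=list)

    def replace_structure(self, label: str) -> Structure:
        # One structure replaces another: mint a new identifier for the
        # new structure and leave the URI of the light untouched.
        number = len(self.structures) + 1
        structure = Structure(uri=f"{STRUCTURE_BASE}/{number}", label=label)
        self.structures.append(structure)
        return structure


light = Light(uri=LIGHT_URI)
for label in [
    "Winstanley's Eddystone Lighthouse",
    "second structure",
    "third structure",
    "Douglass' Eddystone Lighthouse",
]:
    light.replace_structure(label)

for structure in light.structures:
    print(structure.uri, "-", structure.label)
```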

Best Practice 7: Use globally unique persistent HTTP URIs for spatial things (http://w3c.github.io/sdw/bp/#globally-unique-ids)

When reusing an authoritative URI, you must be sure that it identifies the same phenomenon as your subject of interest. For example, it may seem sensible for the Amsterdam Fire Department to reuse the Geonames identifier http://sws.geonames.org/6618987 or Freebase identifier https://g.co/kg/m/02s5hd (now part of Google’s Knowledge Graph) to identify a fire at Anne Frank’s House. However, there is a mismatch between the types of spatial thing being identified: one is a museum or place of cultural heritage, the other is a fire incident. If in doubt, you should create, or mint, your own identifier.
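The "reuse only if it identifies the same kind of thing, otherwise mint" rule can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function `identifier_for`, the plain-string type labels and the `example.org` minting base are hypothetical, and a real check would compare against whatever type information the authoritative source actually publishes.

```python
import uuid

GEONAMES_URI = "http://sws.geonames.org/6618987"  # identifier quoted in the text
MINT_BASE = "http://example.org/id/incident"       # hypothetical minting namespace


def identifier_for(subject_type: str, candidate_uri: str, candidate_type: str) -> str:
    """Reuse the candidate URI only when it identifies the same type of thing."""
    if candidate_type == subject_type:
        return candidate_uri
    # Type mismatch: mint a fresh identifier instead of reusing the candidate.
    return f"{MINT_BASE}/{uuid.uuid4()}"


# Describing the museum itself: the existing identifier fits.
museum_id = identifier_for("museum", GEONAMES_URI, "museum")

# Describing a fire at the museum: the existing identifier is for a different thing.
fire_id = identifier_for("fire incident", GEONAMES_URI, "museum")

print(museum_id)
print(fire_id)
```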

Are we getting there?

PeterParslow commented 7 years ago

Jeremy, Thanks for following up on this, and I believe you’ve captured some of what I was getting at. However, there are two points below where I differ – one I think can be resolved, the other I suspect cannot.

Firstly, you say “the decision to assign a new identifier in response to change is often a data modelling choice”. Perhaps this is terminology regarding roles, but in my opinion, it is the domain expert (not the data modeller) who decides which characteristics of the spatial thing are essential (i.e. change to them would cause change of identity), and which characteristics are not (i.e. change to the characteristic would not generally imply change of identity). The data modeller will probably have to coach the domain expert through the process of deciding, as part of determining the characteristics that will form part of the model (this I guess is harder in an open system, where characteristics can be added later).
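To make the essential versus non-essential distinction concrete, here is a small Python sketch in which each domain declares the characteristics its experts treat as essential, and a change triggers a new identifier only when one of those differs. Everything in it is hypothetical: the domain names, the characteristic names and the platform values are invented for illustration and are not taken from the "Lifecycles of a drilling platform" document.

```python
from typing import Dict, Set

# Hypothetical: the characteristics each domain's experts treat as essential.
ESSENTIAL_BY_DOMAIN: Dict[str, Set[str]] = {
    "marine navigation": {"position"},
    "asset register": {"hull_number"},
}


def needs_new_identifier(domain: str, before: dict, after: dict) -> bool:
    """True when any characteristic essential to this domain has changed."""
    essential = ESSENTIAL_BY_DOMAIN[domain]
    return any(before.get(name) != after.get(name) for name in essential)


# A platform is repainted and towed to a new site (invented values).
before = {"hull_number": "P-17", "position": (57.0, 1.5), "paint": "yellow"}
after = {"hull_number": "P-17", "position": (58.2, 0.9), "paint": "grey"}

for domain in ESSENTIAL_BY_DOMAIN:
    print(domain, needs_new_identifier(domain, before, after))
```

The same change yields a new identifier in one domain and not in the other, which is why the set of essential characteristics has to come from the domain expert and not from the model alone.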

Secondly, the Eddystone Lighthouse example: in the domain of marine navigation, it is the purpose of the light which is paramount. Not all changes to the nature of the light would mean a new identity: as the rules for lights have changed, so have the physical characteristics used to indicate that the light marks a particular kind of hazard.

It’s this “different things to different people” aspect which I believe you capture with your final paragraph about being “sure that it identifies the same phenomenon as your subject of interest”. That was the DNF case for requiring data publishers to include life cycle rules as part of their published ‘feature catalogue’ – detailed description of the dataset / spatial things.

Peter (I notice the email addresses mostly say “noreply” – I’ll see what happens!)

6a6d74 commented 7 years ago

Thanks @PeterParslow ... I'll try to incorporate those clarifications in the next release of the BP doc (the WG should vote to release another draft tomorrow; we're next scheduled to release at the end of Jan after that, which should include this update).

6a6d74 commented 7 years ago

Hi @PeterParslow ... me again :)

Responding to your earlier comments (21 Jan 2016 and 14 Dec 2016), I've modified this statement:

Essentially, the decision to assign a new identifier in response to change is often a data modelling choice.

So that it now reads:

Essentially, the decision to assign a new identifier in response to change depends on how domain experts think about the lifecycle of the spatial thing, which then manifests in a data modelling choice.

... and included:

Data publishers should not attempt to guess all the purposes for which someone might use or reference their data - ending up with a super-complex data model that tries to cover every possible use case. Instead, data publishers should try to help data consumers make informed decisions about the best way to use the data by providing good metadata. When it comes to spatial things, or any resource, that changes over time, it is important to provide metadata about the life cycle of those entities and the resources used to describe them. Given that information, data consumers can make considered choices about which resource they want to link to.

Hopefully that covers all your concerns. A vote to release a new Working Draft is scheduled for Wednesday 8 Feb, and should be published soon after.

I'm closing this ISSUE. If you're not happy, please feel free to reopen.

PeterParslow commented 7 years ago

Looks good to me – thanks / glad to be useful

6a6d74 commented 7 years ago

@PeterParslow - thank you for confirmation.