Open-Telecoms-Data / open-fibre-data-standard

Open Fibre Data Standard
https://open-fibre-data-standard.readthedocs.io

docs/guidance/publication.md: Add guidance on location obfuscation #263

Closed: duncandewhurst closed 1 year ago

duncandewhurst commented 1 year ago

Related issues

Merge checklist

If there are changes to network-schema.json, network-package-schema.json, reference/publication_formats/json.md, reference/publication_formats/geojson.md or guidance/publication.md#how-to-publish-large-networks, update the relevant manually authored examples:

If you used a validation keyword, type or format that is not already used in the schema:

If you added a normative rule that is not encoded in JSON Schema:

If there are changes to examples/geojson/nodes.geojson or examples/geojson/spans.geojson, check and update the data use examples:

duncandewhurst commented 1 year ago

@stevesong please could you check that you're happy with the guidance added in this PR? It is linked from the decide what data to publish section of the publication process guidance.

stevesong commented 1 year ago

Do we want to make more explicit the distinction between what is captured in the standard and what is exported? Operators may choose to capture data in the standard at a high level of detail but export it for club or public use to different levels of precision. I feel like we should get that across in some form, lest operators default to less precision.

duncandewhurst commented 1 year ago

Since that idea of levels of sharing can apply to any field, I suggest that we add a paragraph to decide what data to publish (new content in italics). At the same time, we can fix the extra words in the final sentence:

Decide what data to publish

Bearing in mind your priority use cases, you ought to review the OFDS schema and decide which fields you want to publish.

OFDS is designed for the public disclosure of open data. However, you can also use it to structure data that you want to share only with specific partners and data that you want to keep within your own organisation. As such, this step can involve deciding which fields to make public, which to share with partners and which to keep private.

Most fields in the OFDS schema are optional. However, the more fields you publish, the more useful your data will be.

If you are concerned about disclosing sensitive location data, see how to obfuscate location data.
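As a rough illustration of the public, partner and private split described above, the sketch below filters a single record down to the fields allowed for a given audience. The `AUDIENCE_FIELDS` mapping, the `view_for` helper, the audience names and the sample field names are all assumptions made for this example; they are not defined by OFDS.

```python
# Hypothetical mapping from audience to the fields that audience may see.
# The field names are illustrative only and are not taken from the schema.
AUDIENCE_FIELDS = {
    "public": {"id", "name", "location"},
    "partner": {"id", "name", "location", "status"},
}


def view_for(record, audience):
    """Return a copy of `record` limited to the fields allowed for `audience`.

    The assumed "internal" audience keeps every field.
    """
    if audience == "internal":
        return dict(record)
    allowed = AUDIENCE_FIELDS[audience]
    return {key: value for key, value in record.items() if key in allowed}


node = {
    "id": "1",
    "name": "Example node",
    "location": {"type": "Point", "coordinates": [-0.174, 5.625]},
    "status": "operational",
    "notes": "internal comment",
}

public_node = view_for(node, "public")    # id, name and location only
partner_node = view_for(node, "partner")  # adds status
```

Keeping the full record as the single source and deriving each shared view from it means the internal data never has to lose detail.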

We can then update the text under how to obfuscate location data as follows (new content in italics). At the same time we can correct the erroneous use of a normative keyword (should) on a non-normative documentation page that I introduced in this PR:

How to obfuscate location data

If you’re concerned about disclosing the exact location of fibre infrastructure, you can truncate the coordinates of node locations and span routes in your public or shared data to obfuscate their exact locations, whilst retaining the precise coordinates for use within your own organisation. Before truncating coordinates, you ought to consider what level of accuracy is required to satisfy your priority use cases. You can use the following table as a guide to the relationship between coordinate precision and accuracy:
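A minimal sketch of the truncation described above, assuming that node locations and span routes are held as GeoJSON-style geometry objects with a `coordinates` array. The helper names `truncate` and `truncate_geometry`, the sample coordinates and the choice of three decimal places are illustrative assumptions, not part of the standard.

```python
from decimal import Decimal, ROUND_DOWN


def truncate(value, places):
    """Truncate (not round) a number toward zero to `places` decimal places."""
    quantum = Decimal(1).scaleb(-places)  # places=3 gives Decimal('0.001')
    return float(Decimal(str(value)).quantize(quantum, rounding=ROUND_DOWN))


def _truncate_coords(coords, places):
    """Walk a nested coordinates array and truncate every number in it."""
    if isinstance(coords, (int, float)):
        return truncate(coords, places)
    return [_truncate_coords(item, places) for item in coords]


def truncate_geometry(geometry, places):
    """Return a copy of a GeoJSON-style geometry with truncated coordinates.

    The input is left unchanged, so the precise coordinates remain
    available for internal use.
    """
    return {
        **geometry,
        "coordinates": _truncate_coords(geometry["coordinates"], places),
    }


# Made-up sample data: a node location (Point) and a span route (LineString).
precise_location = {"type": "Point", "coordinates": [-0.17412345, 5.62567891]}
precise_route = {
    "type": "LineString",
    "coordinates": [[-0.17412345, 5.62567891], [-1.62845678, 6.71123456]],
}

shared_location = truncate_geometry(precise_location, 3)
shared_route = truncate_geometry(precise_route, 3)
```

The sketch goes through `Decimal` rather than multiplying and flooring floats, so a value that already has the target number of decimal places is returned unchanged.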

Does that sound good?

stevesong commented 1 year ago

That sounds reasonable @duncandewhurst