theodi / collaborative-data-patterns-catalogue

Website for the catalogue of service design patterns for collaboratively maintained data projects

[PATTERN] Repeated Fields #12

Open ldodds opened 4 years ago

ldodds commented 4 years ago

Name

Repeated Fields (or perhaps something less generic that is more closely tied to the issue of imprecise data)

Problem

Fields in a record may need to hold multiple values, not only because a value legitimately repeats but also to capture imprecise or uncertain data.

Context

The birth date of a historical figure, or the date of construction of a building, may be uncertain. Different sources may provide conflicting evidence for it. Forcing a single value may be unhelpful or cause unnecessary conflict.

Solution

Allow certain fields to have multiple values. Allow users to attach evidence, e.g. to Cite Sources, for individual values.
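A minimal sketch of how such a field might be modelled, assuming a simple in-memory record. The names `FieldValue`, `Record` and `add_value`, the field name `construction_date`, and the sources shown are all hypothetical and only illustrate the idea of pairing each value with its own evidence:

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FieldValue:
    """One candidate value for a field, with the evidence that supports it."""
    value: str
    sources: list[str] = field(default_factory=list)


@dataclass
class Record:
    """A record in which every field holds a list of candidate values."""
    name: str
    fields: dict[str, list[FieldValue]] = field(default_factory=dict)

    def add_value(self, field_name: str, value: str, sources: list[str]) -> None:
        # Append rather than overwrite, so earlier contributions are preserved.
        self.fields.setdefault(field_name, []).append(FieldValue(value, list(sources)))


# Hypothetical example: two sources disagree about when a building was built.
building = Record(name="Example Hall")
building.add_value("construction_date", "1820", ["Hypothetical parish register"])
building.add_value("construction_date", "1823", ["Hypothetical county survey"])
```

Because sources are attached to each value rather than to the field as a whole, a reviewer can see which evidence supports which claim.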

Discussion

Different authoritative sources may contradict each other, particularly in projects that deal with political or historical subjects. For example, two historical sources state different dates for Stalin’s birthday. In other cases the information itself is open to interpretation, as with translations.

Asking someone to choose one ‘truth’ where several may exist can lead to disputes about edits, or can leave someone feeling they cannot contribute at all, meaning their perspective and knowledge are lost.

Allowing multiple values for some fields lets these interpretations be captured and reviewed. However, there is a potential cost to data quality, at least from the perspective of data users who expect a single unambiguous value for every field.
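One possible way to serve data users who expect a single value is to publish a flattened view alongside the full data. The sketch below is an assumption, not part of the pattern as proposed: the `Candidate` type, the `flatten` function and the rule of preferring the value with the most cited sources are hypothetical, and a project might equally choose an editor-assigned preferred value or another rule.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate value and how many sources cite it (hypothetical signal)."""
    value: str
    source_count: int


def flatten(candidates: list[Candidate]) -> str | None:
    """Pick one value for export: the best-supported one, first entry wins ties."""
    if not candidates:
        return None
    # max() returns the first maximal element, so ties keep the earliest entry.
    return max(candidates, key=lambda c: c.source_count).value


# Hypothetical field with two competing values.
chosen = flatten([Candidate("1820", 1), Candidate("1823", 2)])
```

Whatever rule is chosen, it reduces the data for one audience only; the full set of values and their evidence stays available to contributors.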

Your documentation should make clear how and where repeated fields are used.

Related Patterns

Examples

ldodds commented 4 years ago

Note there is a potentially related issue here that @rachelwilson and I have discussed, which is allowing multiple perspectives to coexist in a dataset. An example might be country boundaries or other data that is disputed by different communities. A project might choose to adopt a specific stance, or allow multiple viewpoints to exist alongside one another (e.g. allowing multiple boundaries to exist for the same country).