historical-data / schema

Microdata schema for historical data.
historical-data.org

remove 'historical family' to reduce redundancy and complexity #7

Status: Closed (stoicflame closed 13 years ago)

stoicflame commented 13 years ago

The schema definition for Historical Family is redundant; all the information can be contained in the definition for person and event. The proposal is to remove it.

ninjudd commented 13 years ago

There are cases where HistoricalFamily is necessary to describe relationships accurately. For example, if a father has multiple children from different wives, the only way to distinguish those children is to create separate HistoricalFamily records for each wife and connect them to the father through the family.
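To make the multiple-wives case concrete, here is a minimal sketch in Python. The `Person` and `Family` classes, their fields, and the sample names are all hypothetical stand-ins for illustration; they are not the actual HistoricalPerson or HistoricalFamily definitions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Person:
    # Hypothetical stand-in for a HistoricalPerson record.
    name: str


@dataclass
class Family:
    # Hypothetical stand-in for a HistoricalFamily record:
    # one record per couple, with that couple's children attached.
    father: Person
    mother: Person
    children: List[Person] = field(default_factory=list)


def children_by_mother(families: List[Family], father: Person) -> Dict[str, List[str]]:
    """Group a father's children by the mother named in each family record."""
    return {
        fam.mother.name: [child.name for child in fam.children]
        for fam in families
        if fam.father is father
    }


# Made-up sample data: one father, two wives, three children.
father = Person("John")
families = [
    Family(father, Person("Mary"), [Person("Ann"), Person("Tom")]),
    Family(father, Person("Jane"), [Person("Sam")]),
]

grouped = children_by_mother(families, father)
print(grouped)
```

With only a flat list of children on the father, the grouping by mother would have to be reconstructed from each child's own parent links; the per-couple record carries it directly.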

We should discuss whether this complexity is useful for the scope of historical-data.org though. It really depends on how accurately we want the microdata to reflect the underlying data.

stoicflame commented 13 years ago

> We should discuss whether this complexity is useful for the scope of historical-data.org though. It really depends on how accurately we want the microdata to reflect the underlying data.

Well put.

ninjudd commented 13 years ago

If our main goal is to get rid of duplicate data, then using HistoricalFamily records is actually less redundant than direct relationships between HistoricalPerson records. For a family of 10, using HistoricalFamily would result in 10 edges (one from each member to the family), while using direct relationships between HistoricalPerson records would result in 90 (each of the 10 people linking to the other 9).
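The edge counts can be checked with a short Python sketch. The function names are hypothetical, and the direct-relationship count assumes directed links, with every person pointing at every other family member.

```python
def family_edges(n: int) -> int:
    """Edges when each of n members links once to a shared family record."""
    return n


def direct_edges(n: int) -> int:
    """Directed person-to-person edges when each of n members links to the other n - 1."""
    return n * (n - 1)


for size in (2, 5, 10):
    print(size, family_edges(size), direct_edges(size))
```

For a family of 10 this gives 10 versus 90, so the family-record count grows linearly with family size while the direct-link count grows quadratically.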

That said, we currently only add microdata for relationships from the focus profile on any given page, so we don't encounter this explosion of edges in practice.

RobertGardner commented 13 years ago

> If our main goal is to get rid of duplicate data, then using HistoricalFamily records is actually less redundant

This was my concern when I read the proposal. Before eliminating HistoricalFamily I would like to see sample pages that demonstrate it's more compact and easier to parse/understand. Sometimes you need to denormalize to make it easier to get the information you need.

ninjudd commented 13 years ago

I thought some more about this. HistoricalFamily serves two more purposes that I can think of.

  1. Store information about marriage and divorce events without storing duplicate data on both spouses.
  2. Store adoption information without creating adopted_children, adopted_parents and adopted_siblings fields on HistoricalPerson.
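A rough Python sketch of these two points follows. All class names, field names, and sample values are hypothetical and are not taken from the schema; the point is only that marriage and divorce data live once on the family record, and adoption is a flag on the child's link to the family rather than a set of extra fields on each person.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ChildLink:
    # Hypothetical link from a family to one child; adoption is recorded here.
    child: str
    adopted: bool = False


@dataclass
class Family:
    # Hypothetical family record: marriage and divorce are stored once,
    # not duplicated on each spouse.
    spouses: List[str]
    marriage_date: Optional[str] = None
    divorce_date: Optional[str] = None
    children: List[ChildLink] = field(default_factory=list)


def adopted_children(family: Family) -> List[str]:
    """Return the names of children marked as adopted in this family."""
    return [link.child for link in family.children if link.adopted]


# Made-up sample record.
family = Family(
    spouses=["John", "Mary"],
    marriage_date="1850-06-01",
    children=[ChildLink("Ann"), ChildLink("Sam", adopted=True)],
)

print(family.marriage_date, adopted_children(family))
```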

Again though, the question is: What searches are people going to perform?

Do we want to enable people to search HistoricalFamily records for things like this?

stoicflame commented 13 years ago

> Do we want to enable people to search HistoricalFamily records for things like this?

Those searches seem reasonable to me.

It's becoming clearer to me that for this project, model flexibility takes precedence over model efficiency (i.e. elimination of redundancy). Is that an accurate statement?

stoicflame commented 13 years ago

Closing this issue; we'll keep HistoricalFamily.