FamilySearch / gedcomx

An open data model and an open serialization format for exchanging genealogical data.
http://www.gedcomx.org
Apache License 2.0

Add Place as attribute of RecordCollection #59

Closed ianstiles closed 13 years ago

ianstiles commented 13 years ago

Most collections are scoped within a country, or even a state or county, so the indexing process doesn't record the country, state, or county because it is assumed to be obvious. Going forward this information will be included in the data, but we have a lot of old records to convert.

To help with this process we propose the following:

Make RecordCollection have an optional Place member (the same object that Event has). The member should be optional for those collections that are scoped "World". For collections with a smaller scope, it will help with resolving ambiguous Place names and even be helpful in standardizing Persona Names when the associated record does not have a valid Place.

It also has the added benefit of being useful for finding collections based on geography.
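A minimal sketch of how a collection-level place could help qualify an ambiguous record-level place name. The class `ScopedCollection` and its `qualify` method are hypothetical names made up for illustration, and a plain string stands in for the Place object proposed above.

```java
import java.util.Locale;

// Hypothetical stand-in for a record collection; not the real GEDCOM X class.
final class ScopedCollection {
    // Null means the collection is scoped "World" and has no place.
    private final String place;

    ScopedCollection(String place) {
        this.place = place;
    }

    // Appends the collection's place to a record-level place name that omits it.
    String qualify(String recordPlace) {
        if (place == null) {
            return recordPlace;
        }
        String lowerRecord = recordPlace.toLowerCase(Locale.ROOT);
        if (lowerRecord.endsWith(place.toLowerCase(Locale.ROOT))) {
            return recordPlace; // already carries the collection's scope
        }
        return recordPlace + ", " + place;
    }
}
```

For example, a collection created with `new ScopedCollection("Utah, United States")` would turn the indexed value `"Springfield"` into `"Springfield, Utah, United States"`, while a "World" collection would leave it unchanged.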

stoicflame commented 13 years ago

Seems reasonable. Note that Dublin Core uses the term "coverage" to address this. They also define two kinds of coverage: spatial and temporal.

A few questions:

ianstiles commented 13 years ago

Yes, RecordCollection needs a Date also that uses a range form (e.g. 1920-1930).

Also, RecordCollection needs a default language (String lang) like we used in Record, that way if it isn't specified in the Record we can default back to the collection's.
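The fallback described here could look roughly like the following. This is only a sketch: `LangDefaults` and `effectiveLang` are hypothetical names, not part of the GEDCOM X model.

```java
// Hypothetical helper; the names are illustrative, not part of GEDCOM X.
final class LangDefaults {
    private LangDefaults() {}

    // Returns the record's language if it is set, otherwise the
    // collection's default language (which may itself be null).
    static String effectiveLang(String recordLang, String collectionLang) {
        if (recordLang != null && !recordLang.isEmpty()) {
            return recordLang;
        }
        return collectionLang;
    }
}
```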

As for coverage at the record level, that should be supplied by the associated Event's Date and Place.

carpentermp commented 13 years ago

Note that in SoRD we also had the notion of "Record Type" as part of coverage. This would not have been dealt with in Dublin Core since this is primarily a genealogical notion. Also, SoRD Records had "coverage" in their metadata. It is generally possible to derive the coverage of a record from the record data itself as long as you have a "primary event" where you can get the date and place and EventType (corresponding generally to RecordType). The coverage of Collections and Records is becoming more important now that we are planning features that use it more explicitly. We are adding the ability to filter searches by their coverage.
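As a rough illustration of deriving a record's coverage from its primary event and then filtering on it, here is a sketch. All type and method names (`PrimaryEvent`, `RecordCoverage`, `fromPrimaryEvent`, `filterByRecordType`) are hypothetical, and it assumes the event type maps one-to-one onto the record type, which the comment above says holds only "generally".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration; not the SoRD or GEDCOM X classes.
final class PrimaryEvent {
    final String type;  // stands in for EventType, e.g. "Birth"
    final String place;
    final String date;

    PrimaryEvent(String type, String place, String date) {
        this.type = type;
        this.place = place;
        this.date = date;
    }
}

final class RecordCoverage {
    final String recordType;
    final String spatial;
    final String temporal;

    RecordCoverage(String recordType, String spatial, String temporal) {
        this.recordType = recordType;
        this.spatial = spatial;
        this.temporal = temporal;
    }

    // Derives coverage from the primary event; assumes the event type
    // can be used directly as the record type.
    static RecordCoverage fromPrimaryEvent(PrimaryEvent event) {
        return new RecordCoverage(event.type, event.place, event.date);
    }

    // Returns the coverages whose record type equals the requested one.
    static List<RecordCoverage> filterByRecordType(List<RecordCoverage> all, String recordType) {
        List<RecordCoverage> matches = new ArrayList<>();
        for (RecordCoverage coverage : all) {
            if (recordType.equals(coverage.recordType)) {
                matches.add(coverage);
            }
        }
        return matches;
    }
}
```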

stoicflame commented 13 years ago

I started thinking about modeling this some more, but it doesn't make sense to me to use the same "Place" object to define spatial coverage of a record. The "Place" object has a bunch of stuff that doesn't make sense as part of the definition of the spatial coverage of the record.

Wouldn't just adding a "coverage" element to collection, with strings for "spatial" and "temporal" be enough?
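To make the string-based alternative concrete, here is a minimal sketch. The class name `CollectionCoverage` and the `coversYear` helper are hypothetical, and the "1920-1930" year-range format is an assumption taken from the example earlier in this thread rather than a defined syntax.

```java
// Hypothetical sketch of a string-based coverage element; not the applied model.
final class CollectionCoverage {
    final String spatial;   // e.g. "Utah, United States"
    final String temporal;  // assumed form "1920-1930"; may be null

    CollectionCoverage(String spatial, String temporal) {
        this.spatial = spatial;
        this.temporal = temporal;
    }

    // Returns true only if temporal parses as "startYear-endYear"
    // and the given year falls inside that range (inclusive).
    boolean coversYear(int year) {
        if (temporal == null) {
            return false;
        }
        String[] parts = temporal.split("-");
        if (parts.length != 2) {
            return false;
        }
        try {
            int start = Integer.parseInt(parts[0].trim());
            int end = Integer.parseInt(parts[1].trim());
            return start <= year && year <= end;
        } catch (NumberFormatException e) {
            return false; // free-text value that is not a year range
        }
    }
}
```

The trade-off is visible here: plain strings keep the model small, but any consumer that wants to filter by date has to agree on, and parse, a string convention.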

ianstiles commented 13 years ago

As I see it, org.gedcomx.record.Place and org.gedcomx.record.Date have exactly what we need if it is added as a property to org.gedcomx.record.RecordCollection.

carpentermp commented 13 years ago

I agree with Ryan that "coverage" makes more sense. Also, we need to think about whether or not "record type" belongs as a "coverage" notion as well.

stoicflame commented 13 years ago

> As I see it, org.gedcomx.record.Place and org.gedcomx.record.Date have exactly what we need if it is added as a property to org.gedcomx.record.RecordCollection.

And what is "exactly what we need"? What are your requirements? Why can't they be met with a simple string?

I think that org.gedcomx.record.Place and org.gedcomx.record.Date don't fit very well into the notion of a collection. Those two classes are each instances of "field" which was designed to capture the result of indexing a specific piece of a record. Do you really need the notion of an original, interpreted, and normalized version of place and date? Do you really need the notion of date and place parts? Do you really need the notion of who contributed this date and place? Do you really need the notion of a "label" to the place and date? Do you really need the notion of linking the date and place to their source?

And it doesn't make sense to me to add properties named "date" and "place" to collection. Those properties apply to events (things that happened) and a collection isn't an event. (What exactly is the notion of a "collection's place"? The place it was published?) So unless I misunderstand you, you're requesting the ability to describe the geographic and temporal "coverage" of the records in the collection.

stoicflame commented 13 years ago

> Also, we need to think about whether or not "record type" belongs as a "coverage" notion as well.

I don't disagree that the notion of record type belongs on a collection, but I hesitate to refer to it as part of the "coverage". Dublin Core pretty specifically defines coverage as "time or space". The Dublin Core "coverage" property applies to an instance of "LocationPeriodOrJurisdiction".

But I don't feel too strongly about it. We could define our own notion of coverage that extends Dublin Core's coverage and includes a record type.

carpentermp commented 13 years ago

Yeah, we could just put record type on collection, though it could be considered conceptually another facet of "coverage". I don't have strong feelings about it either.

stoicflame commented 13 years ago

Applied at 789b36e. String properties spatial and temporal were added, as was recordType.