historical-data / schema

Microdata schema for historical data.
historical-data.org
clarify document content metadata model #13

Closed: stoicflame closed this issue 12 years ago

stoicflame commented 12 years ago

Renaming 'documentDate' to 'temporal' and 'location' to 'spatial' to make clear that these properties describe the content of the document. Renaming 'inLanguage' to 'language' to follow the convention for describing the content of the document.

ninjudd commented 12 years ago

Having a field named documentDate does seem awkward to me. How about just renaming that to date? I think date and location are more intuitive in this context than temporal and spatial.

stoicflame commented 12 years ago

I agree that temporal and spatial aren't very intuitive, but they are well-defined by Dublin Core.

The problem with using date and location (or place) is that they are so often misused, and they cause a lot of confusion and frustration among professional researchers. (Trust me, I get frustrated professional researchers who sit me down, pat me on the head, and explain this to me over and over and over again just to make sure I understand.)

Let's take the 1880 US census as an example. It was published originally (around 1880), but it has been republished by multiple records archives all over the world. And then those republications were published again by other providers all over the world. So the scenario is that if you're providing semantic markup for an HTML page (published May 2011) that describes a 2010, London, England republication of a 1980, Boston, MA republication of the 1880, Washington, D.C. original publication of the 1880 U.S. census, what are the values of date and location?

That's why Dublin Core has so many different terms describing the concept of date and place. There's "created", "issued", "modified", "available", "date", "dateAccepted", "dateCopyrighted", "dateSubmitted", "extent", "spatial", "temporal", etc. Each one of those has a specific meaning.

Since what we're trying to describe with these properties is the date and place that the document "pertains to", i.e. "coverage", I suggest we use the terms that are already established among the archive and records communities for defining them. Then we're clear: in the example above, the only thing that stays the same between publication and republication of the document is the "temporal" (1880) and the "spatial" (US) coverage.
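To make the census example concrete, here is a minimal Python sketch. The `Publication` class and its field names are hypothetical, used only for illustration; they are not the schema's property names apart from `temporal` and `spatial`. It assumes each republication is modeled as a copy of the previous edition with new publication details.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Publication:
    # Hypothetical record type for illustration only.
    published_year: int    # when this edition was published
    published_place: str   # where this edition was published
    temporal: str          # period the content pertains to (coverage)
    spatial: str           # place the content pertains to (coverage)


# The chain of editions described in the example above.
original = Publication(1880, "Washington, D.C.", temporal="1880", spatial="US")
boston = replace(original, published_year=1980, published_place="Boston, MA")
london = replace(boston, published_year=2010, published_place="London, England")

editions = [original, boston, london]

# Coverage is identical for every edition; publication details are not.
coverage = {(e.temporal, e.spatial) for e in editions}
publication = {(e.published_year, e.published_place) for e in editions}

print(coverage)
print(len(publication))
```

Every edition carries a different publication date and place, but all three share a single coverage pair, which is why the coverage terms give an unambiguous answer where "date" and "location" do not.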

ninjudd commented 12 years ago

I see. So the fields temporal and spatial would only be used for HistoricalDocument, not for dates and locations on events.

stoicflame commented 12 years ago

> I see. So the fields temporal and spatial would only be used for HistoricalDocument, not for dates and locations on events.

Yes, that's the idea.

For the record, I don't feel too strongly about using the names temporal and spatial. I, too, think they're awkward. But they are also very well-defined and that's what I care most about. Whatever names we choose, we have to be very explicit about what they mean and make sure that we clearly state how they differ from, e.g., an event's date and place.

NatAtGeni commented 12 years ago

I'd rather stick with date and location.

ninjudd commented 12 years ago

I propose we use date and location but make clear in the Description that these should be the date and location when the document was originally created. We can also indicate in the Description field that these correspond to temporal and spatial from Dublin Core.

stoicflame commented 12 years ago

I've applied the recommended naming changes and documentation clarifications. Note that I'm using the plural form to be clear that there can be multiple dates and locations applicable to a document.

What do you think?

NatAtGeni commented 12 years ago

I'm not sure how that would work - how would a document be created at more than one time at more than one location?

ninjudd commented 12 years ago

Is there any value in using creationLocation and creationDate for clarity?

stoicflame commented 12 years ago

> I'm not sure how that would work - how would a document be created at more than one time at more than one location?

Wait, I thought we had established that this isn't about the time and location of the creation of the document, but it's about the time periods and places that are covered by the contents of the document. And a census, for example, can cover all fifty states, i.e. multiple locations. An obituary can cover the entire lifetime of a person, i.e. multiple dates.

stoicflame commented 12 years ago

> Is there any value in using creationLocation and creationDate for clarity?

I don't have a problem adding "creation" or "publication" information to the document, but this information seems less important than the "coverage" of the document. Users are going to be searching for documents about their ancestors that lived in Boston, MA in 1880 more than they're going to be searching for documents that were created in Washington, D.C. in 1882.

No?
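The difference between the two kinds of search can be sketched in Python as follows. Everything here is an assumption made for illustration: the `Doc` class, its field names, the sample records, and the simplification of treating coverage as plain lists of years and place names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    # Hypothetical record type; not part of the schema.
    title: str
    created_year: int
    created_place: str
    temporal: List[int]   # years the content covers (plural by design)
    spatial: List[str]    # places the content covers (plural by design)


# Made-up sample records, for illustration only.
docs = [
    Doc("census schedule", 1882, "Washington, D.C.", [1880], ["Boston, MA"]),
    Doc("obituary", 1931, "Boston, MA", [1860, 1880, 1931], ["Boston, MA", "Salem, MA"]),
    Doc("city directory", 1880, "Boston, MA", [1879], ["Boston, MA"]),
]


def about(doc: Doc, year: int, place: str) -> bool:
    """True if the document's content covers the given year and place."""
    return year in doc.temporal and place in doc.spatial


def created(doc: Doc, year: int, place: str) -> bool:
    """True if the document was created in the given year and place."""
    return doc.created_year == year and doc.created_place == place


by_coverage = [d.title for d in docs if about(d, 1880, "Boston, MA")]
by_creation = [d.title for d in docs if created(d, 1880, "Boston, MA")]

print(by_coverage)
print(by_creation)
```

In this sketch, a search for ancestors in Boston, MA in 1880 needs the coverage query; the creation query returns a document whose content is about a different year.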

ninjudd commented 12 years ago

Ah. I misunderstood. My comment above was:

> I propose we use date and location but make clear in the Description that these should be the date and location when the document was originally created.

But now I see that is not what spatial and temporal mean in Dublin Core.

ninjudd commented 12 years ago

Agreed. Users are much more likely to search for coverage than creation.

NatAtGeni commented 12 years ago

Sorry, I missed that the definition of that had changed.

stoicflame commented 12 years ago

So... am I okay applying this change? Do we need more discussion?

ninjudd commented 12 years ago

Yes.

ninjudd commented 12 years ago

That statement was ambiguous. Yes, I think you can apply it.