Closed stoicflame closed 13 years ago
Enhancing my comments on issue #72, I think there might be a misunderstanding of what the record model is designed to do. The GEDCOM X record model is designed to be another type of evidence that is a peer to other types of "non-conclusionary" evidence such as images, web pages, and physical artifacts. It's designed to model indexed record data. It's designed to be the output of indexing generic record data. It's not designed to be the "leaves" of the n-tiered model; it's designed to be cited as (another type of) evidence for those leaves.
The record model is not designed to model generic genealogical evidence; it's designed to be one specific type of evidence. So to use the word "evidence" to describe the record model would be inaccurate and misleading.
I think "Record" and "Field" are not "unfortunate" or "confusing" names at all. "Record" is the generic term used to define a set of data that was extracted directly from a generic record. There are many different types of records (census, birth, probate, military), and the "record" was designed to be flexible enough to model all of them. "Field" is a generic term that is used to define a "piece" of that record, such as a bounding box for an image. Yes, "field" is very much akin to a field of a form. This seems pretty accurate to me, since indexers are usually presented a UI that has a set of "fields" that they are to index. I'm not sure what's confusing about this; developers and users alike should understand "field" as such.
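To make the "record" and "field" idea concrete, here is a minimal sketch of the shape being described. It is only an illustration: the class names, attribute names, and the `values_for` helper are hypothetical and are not taken from the GEDCOM X record model itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Field:
    """One extracted piece of a record (hypothetical shape, not the GEDCOM X API)."""
    label: str   # what the indexer was asked to fill in, e.g. "name"
    value: str   # the text as extracted from the source
    # Optional region of the source image: (x, y, width, height) in pixels.
    bounding_box: Optional[Tuple[int, int, int, int]] = None


@dataclass
class Record:
    """Data extracted from one source record, such as a census entry."""
    record_type: str  # e.g. "census", "birth", "probate", "military"
    fields: List[Field] = field(default_factory=list)

    def values_for(self, label: str) -> List[str]:
        """Return every extracted value whose field carries the given label."""
        return [f.value for f in self.fields if f.label == label]


# Illustrative data only; the name and coordinates are made up.
census_entry = Record(
    record_type="census",
    fields=[
        Field("name", "John Smith", bounding_box=(120, 340, 260, 28)),
        Field("age", "34"),
    ],
)
print(census_entry.values_for("name"))
```

The same `Record` type holds any kind of source record because the type is just a label and the fields are an open list, which is roughly the flexibility described above.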
I don't think there is confusion about what GEDCOM X calls records. I would define the GEDCOM X record as digitally extracted information from a physical item of genealogically significant evidence, and many of those physical items of evidence in the genealogical context are called records. Where you call the process of converting from a physical artifact to a digital record the "indexing" process, I call it the "extraction" process. For me "indexing" implies you are only digitizing enough of the physical artifact to be able to search for it or summarize it in some sense; for me the "extraction" process implies you are digitizing everything you possibly can from the artifact that fits into a full model of genealogical data. Since the GEDCOM X indexing process does squeeze every Persona, every Event, every Relationship, and so on, that it can from the physical artifact, no matter whether you call it indexing or extracting you are getting all the juice you can from it. Which I think is fabulous and marvelous.
My only point was that the words "record" and "field" are highly overloaded in the computer field. When one hears the term "record" one usually assumes a record in a computer database, and when one hears the word "field" one usually assumes a subpart of a computer record or a component of a structured data type. So my point was only that there are some possible confusions one must anticipate with those two terms, not any disagreement about the concepts at all.
(And of course, by digitizing I only mean converting the information on a physical artifact into a form of structured data that adheres to a data model and can be processed by algorithms.)
I would define the GEDCOM X record as digitally extracted information from a physical item of genealogically significant evidence, and many of those physical items of evidence in the genealogical context are called records.
That sounds like a pretty accurate definition to me, too.
For me "indexing" implies you are only digitizing enough of the physical artifact to be able to search for it or summarize it in some sense; for me the "extraction" process implies you are digitizing everything you possibly can from the artifact that fits into a full model of genealogical data.
Fair enough. I can use the term "extraction".
My only point was that the words "record" and "field" are highly overloaded in the computer field.
Maybe so. But as a software engineer, I don't know when the last time I used "record" was. What is a "record" in the computer field, anyway? I have used "field" to describe fields on a form or fields on a data type, but it's always used in context.
We are starting to beat the dead horse into the ground here, especially as I am content enough with the names Record and Field to be used for actual object classes in the GEDCOM X Record model.
But I am surprised that you don't understand my reasons for bringing up this issue, and especially that you don't see the confusing overloading caused by GEDCOM X's use of the Record and Field objects.
The word Record has come to mean any self-contained data structure in almost any context. Certainly all rows in relational database tables are conventionally called records. All objects stored in hierarchical and network databases are called records. The top level elements in many XML files are called records, generally because once that XML file is read by an application, those top level elements will be stored in a database as records.
You are making me feel very old!!!
Certainly all rows in relational database tables are conventionally called records.
Fair enough.
All objects stored in hierarchical and network databases are called records.
Okay.
The top level elements in many XML files are called records, generally because once that XML file is read by an application, those top level elements will be stored in a database as records.
Hmm... maybe not so much in this case. I've never heard the word "record" used to refer to a top-level XML element. It's always just "element" or "root element" in my experience.
You are making me feel very old!!!
Oh no! I certainly hope not. I have no idea how old you are. All I know is that you've got a fantastic talent for articulating genealogical data models and your input is extremely valuable.
The following commentary was submitted by @ttwetmore, and I'm opening it up for discussion, my comments to follow: