eml-constraint changes needed

mbjones commented 7 years ago

Author Name: Matt Jones (Matt Jones) Original Redmine Issue: 486, https://projects.ecoinformatics.org/ecoinfo/issues/486 Original Date: 2002-05-01 Original Assignee: Matt Jones

Changes as decided upon at the Sevilleta EML meeting, April 24-25, 2002: Responsible: David, Peter, Matt

1) Change entityType into separate elements for each constraint type so that the various types are enforceable. Base this on the ASU model. For example, there will be a "foreignKey" element that defines the content model for foreign keys, indicating which fields are required.

I added Peter and myself to this to make sure we got it right.

mbjones commented 7 years ago

Original Redmine Comment Author Name: David Blankman (David Blankman) Original Date: 2002-05-15T21:38:52Z

Here is a 99% complete version. I still need to document "constraint name" and "constraint description".

Renamed the "foreignKey" element to "foreignKeyRelationship". It seemed to me that what was being expressed was not a description of the foreign key but rather the relationship between two entities.

Changed the name of "mandatory" to "existence", since existence is the database industry standard term. Changed its type from boolean to string with a restriction/enumeration of "mandatory" or "optional".

Added a "relationshipType" element of type string with a restriction/enumeration of "identifying" or "non-identifying".

I also changed cardinality to a choice between a simple element, with an enumerated list of all of the possible cardinalities except 1 to exactly N, and a complex element "cardinality1toN", which is a sequence of "parentOccurances" and "childOccurances".

Here is an example of the output of ER Studio's relationship report:

Relationship Name: FK_CustomerCustomerDemo
Relationship Type: Identifying
Parent Entity: CustomerDemographics
Child Entity: CustomerCustomerDemo
Cardinality: One To Zero or More
Existence: Mandatory
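
As a rough illustration of the model described above, the report fields could be mapped onto enumerated types along the following lines. This is only a sketch in Python: the names `ForeignKeyRelationship`, `RelationshipType`, `Existence`, and `from_report` are hypothetical stand-ins for the schema elements, not part of EML, and the cardinality is left as free text rather than an enumeration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Mapping


class RelationshipType(Enum):
    IDENTIFYING = "identifying"
    NON_IDENTIFYING = "non-identifying"


class Existence(Enum):
    MANDATORY = "mandatory"
    OPTIONAL = "optional"


@dataclass(frozen=True)
class ForeignKeyRelationship:
    """Hypothetical in-memory mirror of the proposed element."""
    name: str
    relationship_type: RelationshipType
    parent_entity: str
    child_entity: str
    cardinality: str  # free text here; the schema would enumerate allowed values
    existence: Existence


def from_report(fields: Mapping[str, str]) -> ForeignKeyRelationship:
    """Build a relationship from report fields.

    Enum lookup by value raises ValueError for anything outside the
    enumeration, which is the enforcement the string restriction provides.
    """
    return ForeignKeyRelationship(
        name=fields["Relationship Name"],
        relationship_type=RelationshipType(fields["Relationship Type"].lower()),
        parent_entity=fields["Parent Entity"],
        child_entity=fields["Child Entity"],
        cardinality=fields["Cardinality"],
        existence=Existence(fields["Existence"].lower()),
    )


# Values taken from the ER Studio report above.
REPORT = {
    "Relationship Name": "FK_CustomerCustomerDemo",
    "Relationship Type": "Identifying",
    "Parent Entity": "CustomerDemographics",
    "Child Entity": "CustomerCustomerDemo",
    "Cardinality": "One To Zero or More",
    "Existence": "Mandatory",
}

relationship = from_report(REPORT)
```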

mbjones commented 7 years ago

Original Redmine Comment Author Name: Peter McCartney (Peter McCartney) Original Date: 2002-05-16T23:38:42Z

I've looked at the attachment and have talked with David as well. Here are some comments:

1) I'm not sure it clarifies anything to change the name foreignKey to foreignKeyRelationship. We've already had debate online about whether it's appropriate to use one element to describe all kinds of relationships, regardless of how they are implemented, but we've been working under the assumption that all relational information between two entities will be under constraint/foreignKey.

2) There's still some ambiguity about how relational pointers are being handled: entities are referred to by an id, but their attributes are referred to by their names. While idref pointers are superior for processing, I'm concerned about the readability of the XML files. Should we be doing both names and pointers? Perhaps if I understood idref better I wouldn't be asking this.

3) Lists of attributes participating in keys should have explicit sequence_id attributes. I don't think we should rely on the order of entry to match up fields.

4) The information in "existence", "cardinality1toN/parentOccurences", and "cardinality" is redundant. Wouldn't it be simpler to drop both existence and cardinality and rename "cardinality1toN" to "cardinality"? Define an enumeration of 0,1 for parentOccurences and set childOccurences to integer. If childOccurences is missing, then we can assume it is unrestricted.

5) The module is designed to be linked at the dataset level. That is convenient for foreignKeys since we want to look them up by either participating entity. It seems awkward to have to use a relational pointer for the remaining constraints, as they act only on the one entity anyway and could be nested right there. Six of one, half a dozen of the other, but I just thought I'd mention it for the record.

6) What about indexes? Are we doing anything there?
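
To make point 3 concrete, here is a minimal Python sketch of matching key fields by an explicit sequence id instead of by order of entry. The function name `pair_key_attributes` and the attribute names in the example are hypothetical; nothing here is taken from the schema itself.

```python
from typing import List, Tuple

# (sequence_id, attribute name); a hypothetical shape for a key's attribute list
KeyAttribute = Tuple[int, str]


def pair_key_attributes(
    parent_key: List[KeyAttribute], child_key: List[KeyAttribute]
) -> List[Tuple[str, str]]:
    """Pair parent and child key attributes that share a sequence id."""
    parent_by_seq = dict(parent_key)
    child_by_seq = dict(child_key)
    # dict() silently drops repeats, so a length mismatch means a duplicate id.
    if len(parent_by_seq) != len(parent_key) or len(child_by_seq) != len(child_key):
        raise ValueError("duplicate sequence id within a key")
    if parent_by_seq.keys() != child_by_seq.keys():
        raise ValueError("parent and child keys use different sequence ids")
    return [(parent_by_seq[seq], child_by_seq[seq]) for seq in sorted(parent_by_seq)]


# The child attributes are listed in a different order than the parent's,
# yet the pairing is still unambiguous.
pairs = pair_key_attributes(
    [(1, "site_id"), (2, "plot_id")],
    [(2, "plot"), (1, "site")],
)
```

With order of entry alone, the example above would have matched site_id to plot; the explicit ids remove that dependence.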

mbjones commented 7 years ago

Original Redmine Comment Author Name: Matt Jones (Matt Jones) Original Date: 2002-05-17T15:33:44Z

1) I agree with Peter: let's call it foreignKey.

2) The root element should be "constraint", not "eml-constraint", so you'll need to fold "identifier" into constraint.

3) You need entityid in primarykey, uniquekey, and check as well as foreignKey, so you might as well make it part of ConstraintDefinitionBase. However, the packaging changes will likely cause a modification to this, which I'll take care of as part of the packaging bug changes for all modules.

4) As far as pointers go, if we were to choose idref to point among the attribute names, the only requirement is that it points to an id that is unique within the document. So as long as two different attributes did not share a name (which is probably a bad assumption), we could use the name as an idref pointer. This stuff will all get resolved in the packaging discussion, which I'm going to launch today with a proposal.

5) As Peter said, existence is identical to parentOccurences. I like his proposed changes (eliminate existence and cardinality1toN, create an enumeration for parentOccurences, set childOccurences to integer).

6) I've noted the desire to nest constraint under entity. I'll address this in the packaging note.

7) Indexes. They are not really constraints, but we could list them here as an additional type if desired. No need for some purist philosophy as long as it isn't confusing. What would we need: a tag with the same content model as uniqueKey, right?
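
The uniqueness requirement in point 4 can be checked mechanically. The following Python sketch, with a hypothetical helper `duplicate_ids` and made-up attribute names, reports which names could not serve as ids because they occur more than once in the document.

```python
from collections import Counter
from typing import Iterable, List


def duplicate_ids(candidate_ids: Iterable[str]) -> List[str]:
    """Return the values that occur more than once, sorted for stable output."""
    counts = Counter(candidate_ids)
    return sorted(name for name, count in counts.items() if count > 1)


# Attribute names gathered from two hypothetical entities; both have a "date".
attribute_names = ["site_id", "date", "temperature", "plot_id", "date"]
clashes = duplicate_ids(attribute_names)
```

An empty result would mean the names are safe to use as idref targets; any entry in the list is a name that two attributes share.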

mbjones commented 7 years ago

Original Redmine Comment Author Name: Peter McCartney (Peter McCartney) Original Date: 2002-05-17T16:57:03Z

We at one time made a separate model for indexes, as they are also redundant with, but different from, keys. Keys can be indexed or not in some databases, and indexes can be built on field combinations that don't produce a key. The information is typically returned by most reverse engineering tools, but we put it on hold because knowledge of indexes doesn't really affect your ability to construct a query; it just affects performance.

I think if we turn to EML as a data design language, we may want finer granularity in this whole area but for now would leave index to a future module. At most, I might consider putting a boolean attribute called indexed under each key as a hint to the potential performance of a query involving that key.

mbjones commented 7 years ago

Original Redmine Comment Author Name: David Blankman (David Blankman) Original Date: 2002-05-24T18:15:18Z

Removed the "existence" element. Deleted the "Cardinality" element. Renamed "Cardinality 1 to N" to "Cardinality". Created a new type "CardinalityChildOccurancesType" as a union between a simple xs:integer and an xs:string restricted to the text "many". Put entityID in ConstraintDefinitionBase. Renamed "check" to "checkConstraint".
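
For readers less familiar with schema unions, the child-occurrences type described here accepts either an integer or the literal text "many". A minimal Python sketch of the equivalent check follows. The function names are hypothetical, the regular expression only approximates the xs:integer lexical form, and the 0/1 restriction on the parent side is assumed from the proposal earlier in this thread.

```python
import re
from typing import Union

# Approximation of the xs:integer lexical form: optional sign, then digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def parse_child_occurrences(text: str) -> Union[int, str]:
    """Accept an integer or the literal 'many', mirroring the union type."""
    value = text.strip()
    if value == "many":
        return "many"
    if _INTEGER.fullmatch(value):
        return int(value)
    raise ValueError(f"expected an integer or 'many', got {text!r}")


def parse_parent_occurrences(text: str) -> int:
    """Accept only 0 or 1 (assumed enumeration for the parent side)."""
    value = text.strip()
    if value not in ("0", "1"):
        raise ValueError(f"expected 0 or 1, got {text!r}")
    return int(value)
```

Under this reading, a relationship such as "One To Zero or More" would carry a parent value of 1 and a child value of "many".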

mbjones commented 7 years ago

Original Redmine Comment Author Name: Matt Jones (Matt Jones) Original Date: 2002-05-24T19:29:07Z

David -- the element "ConstraintBase" is not needed. I eliminated it, and checked in the modified version to CVS.

mbjones commented 7 years ago

Original Redmine Comment Author Name: Matt Jones (Matt Jones) Original Date: 2002-06-14T01:56:39Z

Changes completed, including the re-addition of the not NULL constraint. Checked into CVS.

mbjones commented 7 years ago

Original Redmine Comment Author Name: Owen Eddins (Owen Eddins) Original Date: 2002-06-27T16:25:08Z

I'm passing along the following comments from Tim Bergsma, the data manager at Kellogg Biological Station in Michigan. He made them in an eml-dev email. I'm posting them to Bugzilla just to make sure they don't fall through the cracks.

Spelling: parentOccurances and childOccurances should be -Occurences.

mbjones commented 7 years ago

Original Redmine Comment Author Name: Matt Jones (Matt Jones) Original Date: 2002-08-30T15:07:05Z

Fixed spelling issues. RESOLVED FIXED.

mbjones commented 7 years ago

Original Redmine Comment Author Name: Redmine Admin (Redmine Admin) Original Date: 2013-03-27T21:14:26Z

Original Bugzilla ID was 486

NCEAS / eml

eml-constraint changes needed #51