NCEAS / z-test-issues

Test issue imports from redmine

eml-constraint overlaps with packaging concepts #256

Open mbjones opened 7 years ago

mbjones commented 7 years ago

Author Name: Matt Jones (Matt Jones)
Original Redmine Issue: 428, https://projects.ecoinformatics.org/ecoinfo/issues/428
Original Date: 2002-02-14
Original Assignee: Matt Jones


The current incarnation of eml-constraint allows the enumeration and definition of integrity constraints that apply to entities. These are currently drawn from the relational model, including UNIQUE, PRIMARY KEY, FOREIGN KEY, and CHECK constraints. It may also be extended to include other types of relationships between entities that are not part of the relational model.

The "triple" element allows us to create arbitrary relationships between identifiable objects in EML, and is used for associating data with metadata and for grouping metadata and data objects together as a "package". This usage is very similar to the relational model, in that it allows us to define 3-valued tuples in a graph structure. Constraints between entities could conceivably be modeled using this infrastructure, probably with some modifications to the concept of a "relationship".
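To make the overlap concrete, here is a minimal sketch of how a key constraint might be written as 3-valued tuples alongside a packaging relationship. It is only an illustration: the `Triple` type, the identifiers, and the relationship names (`isMetadataFor`, `isPrimaryKeyOf`, `isForeignKeyTo`) are hypothetical and are not taken from the EML schemas.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A (subject, relationship, object) statement between identifiable objects."""
    subject: str
    relationship: str
    object: str


# Hypothetical identifiers and relationship names, for illustration only.
triples = [
    # Packaging: a metadata document describes a data object.
    Triple("metadata.1", "isMetadataFor", "data.1"),
    # Constraints expressed with the same structure.
    Triple("sites.site_id", "isPrimaryKeyOf", "sites"),
    Triple("samples.site_id", "isForeignKeyTo", "sites.site_id"),
]


def related(all_triples, subject):
    """Return (relationship, object) pairs whose subject matches."""
    return [(t.relationship, t.object) for t in all_triples if t.subject == subject]


print(related(triples, "samples.site_id"))
```

Key constraints fit this shape fairly naturally. A CHECK constraint, which carries an expression rather than a second object, probably would not without changing what a "relationship" can hold.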

So the question arises: should we try to develop a unified approach to the specification of constraints and the specification of packages? It might be more elegant, but possibly at the cost of simplicity and ease of use. My gut feeling is that this is not something we should pursue, but I would like to hear other people's reasons for or against it.

mbjones commented 7 years ago

Original Redmine Comment
Author Name: Peter McCartney (Peter McCartney)
Original Date: 2002-02-15T17:24:33Z


I hope I'm using this right.

We grappled with this dilemma and at one point took the indecisive solution of having both a relation.xsd, which was based on ER Studio's data model for describing relations between entities, and a constraint.xsd based on the existing constraint.xsd. Foreign keys are both relationships and constraints, so this wasn't a very desirable solution. To be consistent with EML, we've basically dropped the relation module and were planning on using constraint, since that's where things were going. The only other choice is to reduce constraint to only checks that reference the table's own fields and put all relational information in relation.

We also nested constraint under the table so that we don't need to rely on some kind of pointer to locate all the constraints that affect that table. One still needs a pointer, however, to look up the referenced table. This makes it very fast to find all the tables that this table depends on, but a little more work to find all the ones that depend on it.
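The lookup trade-off described above can be sketched roughly as follows. This is an assumption-laden illustration, not the actual schema: the table names, the dictionary layout, and the `references` key are all hypothetical stand-ins for constraints nested under each table.

```python
# Hypothetical dataset: each table carries its own nested constraints.
tables = {
    "sites": {
        "constraints": [
            {"type": "PRIMARY KEY", "columns": ["site_id"]},
        ],
    },
    "samples": {
        "constraints": [
            {"type": "PRIMARY KEY", "columns": ["sample_id"]},
            # The "references" value is the pointer to the other table.
            {"type": "FOREIGN KEY", "columns": ["site_id"], "references": "sites"},
        ],
    },
}


def depends_on(all_tables, name):
    """Tables this table depends on: read directly from its own constraints."""
    return [
        c["references"]
        for c in all_tables[name]["constraints"]
        if c["type"] == "FOREIGN KEY"
    ]


def dependents_of(all_tables, name):
    """Tables that depend on this one: requires scanning every other table."""
    return [
        other
        for other, table in all_tables.items()
        if any(c.get("references") == name for c in table["constraints"])
    ]


print(depends_on(tables, "samples"))
print(dependents_of(tables, "sites"))
```

The forward lookup touches only one table's constraints, while the reverse lookup walks the whole dataset, which is the "little more work" mentioned above.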

I don't have a strong preference between extending constraint and re-adding relation.xsd, other than to wind up with only one place where I scan to find all relationships. We should ask ourselves which question you are more likely to ask when using a dataset: how were inserts, updates, and deletes to this table constrained, or how does this table join with other tables in the dataset? I think the former. Another question we should ask is whether the way we do this affects the potential use of EML in the future as a data modeling (as opposed to metadata) language.

At Matt's request, I'm trying to create a merged version of this and other modules for CVS that shows the differences that I outlined in the lengthy email I sent to the LTER discuss list, but if you want to quickly see how we envisioned these two modules as of December, you can look at:

http://ces.asu.edu/bdi/subjects/metadata/december2001/dataset/