Closed by rwshuka 7 years ago
The most convincing argument here is, I believe, the enumerated value lists. I believe there is room to compromise there and still be a closed standard. I'll have to look at how we would deal with that. The most obvious would be to enumerate only purely structural elements where no other value would make any sense. That is the direction Michael is pushing us on Species/Breed. Other standards handle this by allowing alternate code or original text values as alternatives or in addition to the standard value list.
As to the less sophisticated business partners, I know this is a challenge but I believe it makes more sense to help them through dealing with the closed standard because then they are set. If we have N different variations on an open standard, we really have only partly solved the O(N^2) problem.
We got very bogged down in this important issue during our discussions in Denver. As a first step towards resolution, I want to state that I do not believe this has to be "one size fits all" -- for example, we might decide that Species will be an enumerated list but Breed is a free-text field.
Taking Species and Breed separately:
I see these options:
I would vote for option 1.
I see these options:
I vote for option 4 for Breed as I don't think it is "important enough" to enough people to go the enumerations route; I would however propose we maintain a mapping dictionary for common breeds and species to establish an "opt-in" approach to coding this data. In other words, we can say that "AN"=>Angus without enumerating it. This still allows the originator of an XML message to provide anything they want when recording details of an exotic/unusual breed.
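To make the "opt-in" idea concrete, here is a minimal sketch of how a receiver might use such a mapping dictionary. Only the "AN"=>Angus pair comes from the comment above; the `BREED_CODES` and `describe_breed` names are hypothetical and not part of any agreed schema.

```python
from typing import Tuple

# Hypothetical opt-in dictionary; only "AN" => Angus is taken from the discussion.
BREED_CODES = {
    "AN": "Angus",
}


def describe_breed(value: str) -> Tuple[str, bool]:
    """Return (display_name, recognized).

    Known codes are expanded. Anything else is passed through unchanged,
    so an originator can still send an exotic or unusual breed as free text.
    """
    cleaned = value.strip()
    key = cleaned.upper()
    if key in BREED_CODES:
        return BREED_CODES[key], True
    return cleaned, False


print(describe_breed("AN"))
print(describe_breed("Ankole-Watusi"))
```

The receiver never rejects a value here; the boolean only tells downstream logic whether the breed was one of the commonly coded ones.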
This is a major issue and we need to see input from all parties.... thanks to @rwshuka and @mkm1879 who have already weighed in with detailed thoughts...
I have thought from the beginning that this would eventually prove to be the hardest part of this standard. It is a very important decision.
Michael's suggestion fits more or less with an approach I've floated over the last year or so: that taxonomy be split into two fields, "True structured taxonomy" (what we are calling "species" here) and "Additional taxonomy detail" or something like that. For this to work right, everything needed to make system logic work has to be in the first field. For ADT this might be problematic unless we have a "species" that includes all the dairy cattle breeds and another with all the non-dairy breeds.
This is NOT an easy design decision. Everyone please think deeply about this.
Take a look at this for extending enumerations: http://www.ibm.com/developerworks/library/x-extenum/
A suggestion: Take a look at solution 3 in the above referenced document.
Instead of using a regex to distinguish extended values with an 'x:' prefix, we could use a regex that specifies a URI. (Note: xsd:anyURI could not be used here, because it is not a string.)
URIs themselves are scheme based; you can even say they are simply modeled data types. This way any two or more partners can extend our enumerations with unique partner-defined URIs (by restricting the regex). The resulting instance documents will validate against our base schema, yet the same instance document may or may not validate against a partner's extended schema.
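As a rough illustration of that behavior, the sketch below mimics the union of "enumerated value OR pattern-matched string" in Python rather than in XSD. The species values, the URI pattern, and the `partner.example` prefix are all made up for the example; the real value list and pattern would have to be agreed on.

```python
import re

# Hypothetical base enumeration; the real value list is still under discussion.
BASE_SPECIES = {"Cattle", "Horse", "Swine"}

# Base schema: any string shaped like "scheme:rest" counts as an extension value.
BASE_EXTENSION = re.compile(r"[A-Za-z][A-Za-z0-9+.\-]*:\S+")

# Partner schema: the same idea, restricted to one partner-owned prefix (made up).
PARTNER_EXTENSION = re.compile(r"http://partner\.example/species/\S+")


def is_valid(value, extension_pattern):
    """Accept a standard enumerated value or an extension matching the pattern."""
    return value in BASE_SPECIES or extension_pattern.fullmatch(value) is not None


for value in ["Cattle", "http://partner.example/species/Yak",
              "urn:x-other:species:Yak", "Yak"]:
    print(value, is_valid(value, BASE_EXTENSION), is_valid(value, PARTNER_EXTENSION))
```

A value from some other partner's URI space passes the base check but fails the restricted one, while a bare free-text value such as "Yak" fails both.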
This has been stagnant for a number of years and will be closed on 2 Sept unless objections are raised. Any specific amendments should be opened as separate issues.
First of all I apologize for my lack of participation in the standards development process to date. I had scheduling conflicts for the first couple meetings, then I was out for a couple weeks due to personal issues and I've been playing catch-up for the past week. Thank you for allowing Kaylen to stand in for me in my absence.
One of the conversations that I appear to have missed along the way was a discussion regarding whether to develop an open or a closed standard. Maybe that ship has already sailed, so this may be a moot point, but I wanted to at least put our position out there on the chance that it may not be too late for consideration.
Specifically I want to discuss the issue of enumerated values ("restriction" definitions in the xsd). GlobalVetLINK is in favor of a more open standard with companion documentation that enumerates "preferred" values.
My assumption is that our desire would be to define a standard that would be as widely adopted in the industry as possible. Thus we want to define a standard that doesn't discourage parties from implementing it. Our experience has been that, while it places more of a burden on us as a company to validate the data that we receive, openness encourages adoption of a standard.
Our experience has been that a closed standard discourages adoption in the following ways:
I believe I understand the reasons behind the desire for a closed standard, but our practice has been to find other ways to address those issues. I assume that one of the main objectives of having a closed standard is to control the quality of the data. Another possible way to address this goal would be to provide a "test suite" or possibly even a test web site where parties could upload files to determine their level of compliance. An example of this would be the Java programming language. In essence, Java is defined by a published specification rather than by a single implementation. Several vendors provide implementations of that specification, and there are programs available that test and report on the level of compliance of each implementation.
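For what it's worth, the core of such a compliance check could be quite small. The sketch below assumes records have already been parsed into dictionaries; the "Species" field name, the preferred values, and the `compliance_report` function are placeholders for illustration, not anything the group has agreed on.

```python
# Hypothetical "preferred" value lists taken from the companion documentation.
PREFERRED = {
    "Species": {"Cattle", "Horse", "Swine"},
}


def compliance_report(records):
    """Count, per field, how many submitted values are on the preferred list.

    Non-preferred values are reported, not rejected, which is the point of
    pairing an open standard with a compliance check.
    """
    report = {}
    for field, preferred in PREFERRED.items():
        values = [r[field] for r in records if field in r]
        report[field] = {
            "total": len(values),
            "preferred": sum(1 for v in values if v in preferred),
            "non_preferred_values": sorted({v for v in values if v not in preferred}),
        }
    return report


sample = [{"Species": "Cattle"}, {"Species": "Yak"}, {"Species": "Horse"}, {}]
print(compliance_report(sample))
```

A test web site would essentially run this kind of report over an uploaded file and show the submitter which values fall outside the preferred lists.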
Once again, I'm sorry for this late entry into the discussion, and if we are already beyond the point of considering a more open standard then I apologize.