tracefirst / usaha_committee

XML schema for electronic CVIs

InternalRefNum In Accession and Tests #12

Closed mkm1879 closed 10 years ago

mkm1879 commented 10 years ago

Why is this not ID and IDREF so that the schema would enforce it?

mmcgrath commented 10 years ago

That's evidently beyond our XML schema expertise! If there's a way to enforce this integrity within the schema, we don't know how to do it.

Our approach would be for consumers of XML eCVIs to perform this consistency check before processing the XML file.

If you can advise on how to achieve this within the schema, I'd strongly support that approach.
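
For illustration, the consumer-side consistency check described above could look something like the following minimal Python sketch. The element names `Accession` and `Test`, the `eCVI` root, and the use of an `InternalRefNum` attribute on both elements are assumptions made for this example; the actual eCVI schema may structure them differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: element and attribute names are assumptions,
# not taken from the real eCVI schema.
SAMPLE = """<eCVI>
  <Accession InternalRefNum="0"/>
  <Accession InternalRefNum="1"/>
  <Test InternalRefNum="1"/>
  <Test InternalRefNum="2"/>
</eCVI>"""


def dangling_test_refs(xml_text):
    """Return the InternalRefNum of every Test that has no matching Accession."""
    root = ET.fromstring(xml_text)
    # Collect the reference numbers declared by Accession elements.
    accessions = {a.get("InternalRefNum") for a in root.iter("Accession")}
    # Keep only the Test references that do not appear in that set.
    return [
        t.get("InternalRefNum")
        for t in root.iter("Test")
        if t.get("InternalRefNum") not in accessions
    ]


print(dangling_test_refs(SAMPLE))  # ['2']
```

Every consumer would have to repeat a check like this, which is the cost of not enforcing the link in the schema itself.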

mkm1879 commented 10 years ago

IDREF is just a string that is a legal NCName and that must match an ID in the same document. This allows the schema validator to check that every reference to an accession points to an actual accession.

The only issue with switching would be that integers are not NCNames. I added a letter in front of the numbers in my sample document and it works.

I don't know how to post example files, and this needs complete files to make sense. See IDREF_Example.xml and IDREF_Example.xsd on our Google Drive. I have included an intentional error: a Test referencing "x2" when the two Accessions are "x0" and "x1".

https://docs.google.com/file/d/0B1tTjQdoxnViOEFSVHNmVm85VHc/edit?usp=sharing
https://docs.google.com/file/d/0B1tTjQdoxnVia0pWeVByWXBaU3M/edit?usp=sharing

mmcgrath commented 10 years ago

We looked at this early on and worked out the ID/IDREF thing -- however, we were stumped by the inability to have numeric references....

I'm torn between "it's best to have as much validating logic in the schema as possible" and "having numeric IDs might be a 'must have' for some participants..."

mkm1879 commented 10 years ago

What our LIMS does is use a "formatted accession number" that is really just a lab constant prefixed to a numeric accession number. They didn't do that for ID/IDREF but for human readability, but something like that could work. I've always thought that the ability to validate against a schema was the next best thing after compiler warnings. But just as lots of programmers turn off warnings, I've found frightfully few labs that use validation seriously. So electing integers because they are easier to autoincrement may be the right choice. Let's see what others say, if anything.

mmcgrath commented 10 years ago

7/17 - @mmcgrath and @sglCO agreed that doing validation in the schema is the superior approach.

More input needed - this is an important decision

scottrydberg commented 10 years ago

Mike and Michael, this is a great conversation, but maybe a little too deep in the technical writing for the average non-tech person. I think I understand the point of what is being said here. It would be up to the eCVI developer to put language or checks into the program to validate these accession numbers, correct? I don't know how many types of accession numbers there are out there in the lab world. Could it not just be an open field that accepts whatever type of data is placed in it, either numeric or alphabetic? I don't know if it is our place to validate information (accession #) from labs. Maybe I am off base here and am not understanding the conversation.

mkm1879 commented 10 years ago

As an internal reference number, this field exists only to make sure each Test can be tied to its Accession. If the value used has any additional meaning, that is up to the eCVI implementation. Making it an ID/IDREF allows us to enforce that every Test has a matching Accession. The only real restriction is that the ID used is a valid name (starts with a letter, no spaces, etc.).
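
To illustrate the restriction, here is a minimal Python sketch that checks whether a value is usable as an ID and prefixes a letter to an integer, as suggested earlier in this thread. The regular expression is a simplified, ASCII-only approximation of the XML NCName rule (the real rule also allows many non-ASCII characters), and the function names and the default prefix `x` are hypothetical.

```python
import re

# Simplified, ASCII-only approximation of an NCName: starts with a letter or
# underscore, followed by letters, digits, '.', '-' or '_'. No spaces, no colons.
_ASCII_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_ascii_ncname(value):
    """Return True if value fits the simplified NCName pattern above."""
    return _ASCII_NCNAME.fullmatch(value) is not None


def to_xml_id(number, prefix="x"):
    """Build an ID-compatible string by prefixing a numeric reference."""
    candidate = f"{prefix}{number}"
    if not is_ascii_ncname(candidate):
        raise ValueError(f"{candidate!r} is not usable as an ID")
    return candidate


print(is_ascii_ncname("12"))   # False: starts with a digit
print(is_ascii_ncname("x12"))  # True
print(to_xml_id(12))           # x12
```

A lab-specific constant could serve as the prefix instead of `x`, provided the combined value still fits the pattern.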

mmcgrath commented 10 years ago

I think one of the next things we need to do is take one or two real CVIs and start to model them in XML that binds to our schema. That will help in a lot of ways, including providing an example of the ID/IDREF concept.

mkm1879 commented 10 years ago

I agree. We need to try mapping as many dissimilar source CVIs as possible to the schema so we can identify any issues, not just with ID/IDREF but also with cardinality, nesting, etc. Should this be its own issue?

mmcgrath commented 10 years ago

Based on the general feeling that it is better to have the schema do as much validation as possible, I have gone ahead and made this change.

mmcgrath commented 10 years ago

No further comments in last 3 weeks - am closing this