AAVLD-USAHA-ITStandards / eCVI

eCVI Data Exchange Standard (Starting with version 2)
Invalid XML files - How should these be addressed? #70

Closed StaceySchwabenlander closed 1 year ago

StaceySchwabenlander commented 3 years ago

Good afternoon. MN has had some recent discussions with our CVI processing software developers regarding issues we are seeing with invalid XML files entering our system. I suspect every state animal health official office is in a similar position of occasionally receiving invalid XML files sent by eCVI providers.

I feel the eCVI provider should ultimately be held responsible for confirming that XML files are valid BEFORE they are sent from its system and, if they are not, for making the necessary corrections so that they validate before they are sent.

I bring this up for discussion to see whether our working group would have any role in this.

I believe this issue is likely not something we can build into the XML standard (i.e., a requirement that the XML be validated before it is sent). I have a feeling this is something that may need to be addressed by NASAHO's eCVI Standards Subcommittee (for those providers who have undergone review and been found consistent with the standards that subcommittee developed). This would leave only the eCVI providers that haven't undergone review, which SAHOs could then choose to reject as official and stop receiving eCVIs from (if they have the ability/authority to do this).

mkm1879 commented 3 years ago

My take is that if it doesn't validate, it isn't an AAVLD/USAHA standard data file. Maybe we need wording to the effect, "Every eCVI sent will be a valid AAVLD/USAHA eCVI XML document."

Early on, this committee elected to punt on the enforcement part. The standard stands on its own legs as a machine processable test of validity. An XML document is valid or it isn't.

StaceySchwabenlander commented 3 years ago

@mkm1879 That was part of the discussion we had this morning - that an XML file is valid or it is not. There is no middle ground.

In other words, if the file doesn't validate, it isn't being sent according to the schema this group created. Does this indicate, then, that this group wouldn't have authority to do anything? Am I interpreting your thoughts correctly?

mmcgrath commented 3 years ago

In the strongest terms I can muster, I agree with @mkm1879 - "if it doesn't validate, it isn't an AAVLD/USAHA standard data file"

Anyone who has the ability to generate XML has the ability to verify that the generated XML is valid against a schema. Recipients of invalid XML should have a clear conscience about simply not accepting it because the sender should have verified it before sending.

StaceySchwabenlander commented 3 years ago

If we, as a working group, tried to build in a 'requirement' that those who elect to send eCVI data using the schema validate the XML BEFORE it is sent, could this inadvertently result in eCVI providers no longer utilizing the schema (i.e. would it scare providers away)? I admit I don't know what it would take for a provider to build in everything needed to ensure veterinarians can't enter something goofy that would cause the XML to not validate.

Clearly we all want the schema to be used and the resulting XML files to always validate. How do we get there without scaring away providers or veterinarians transitioning to eCVIs?

I would be curious to hear from some providers in the working group on how difficult it would be to ensure all XML is validated before it is sent.

rmunger commented 3 years ago

If it's not validated then what's the point of having it? What is considered using the XML schema if software providers are allowed to add values that are not allowed? What good is a message that has garbage data and cannot be imported into a receiving system which is expecting the standard? I was under the impression there was a review board that "approved" providers of electronic ICVIs, if the review does not take into account the proper use and validation of the schema, which then ensures messages are consumable by other systems, what is the purpose of the review?

It is not difficult to add a validation before the message is sent. It is more difficult to add validation at the time of data entry. We take a blended approach and validate some items at the time of entry, making some fields required or by dropdown values and then other data elements are validated at the time of completion/issue, which then triggers submission.
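As a rough illustration of the blended approach described above, the sketch below checks dropdown-style fields at entry time and runs a completeness check at issue time, submitting only when no problems remain. The field names, the allowed species values, and the `submit` callback are all hypothetical and are not taken from the eCVI schema or from any provider's system.

```python
# Hypothetical dropdown values; a real form would derive these from the schema.
ALLOWED_SPECIES = {"cattle", "horse", "swine"}

# Hypothetical fields that must be present before a certificate is issued.
REQUIRED_AT_ISSUE = ("veterinarian", "origin_state", "destination_state", "species")


def check_field_at_entry(name, value):
    """Entry-time check: return an error message, or None if the value is acceptable."""
    if name == "species" and value not in ALLOWED_SPECIES:
        return f"species must be one of {sorted(ALLOWED_SPECIES)}"
    return None


def check_at_issue(record):
    """Issue-time check: return a list of problems; an empty list means OK to submit."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_AT_ISSUE
        if not record.get(name)
    ]
    # Re-run the entry-time checks on every non-empty value as a safety net.
    for name, value in record.items():
        if value:
            error = check_field_at_entry(name, value)
            if error:
                problems.append(error)
    return problems


def issue(record, submit):
    """Submit the record only if it passes every check; return the problems found."""
    problems = check_at_issue(record)
    if not problems:
        submit(record)
    return problems
```

In this arrangement nothing reaches `submit` unless the issue-time check passes, which is the point at which a provider could also run full schema validation on the generated XML.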

Randy D. Munger, DVM Mobile Information and Animal Disease Traceability USDA APHIS VS Strategy & Policy Center For Informatics 2150 Centre Ave. Bldg. B Mail Stop 2E6 Fort Collins, CO 80526 Office 970-494-7339 Mobile 970-217-1432

StaceySchwabenlander commented 3 years ago

@rmunger Thank you for weighing in on the difficulty (or lack thereof, as it sounds) of validating XML before it is sent. This is helpful to know.

There is a NASAHO subcommittee that reviews eCVI providers that voluntarily request review. This allows the provider to earn NASAHO support that their eCVI/XML is consistent with NASAHO standards, and they are provided a logo they can display on the eCVI (if they elect) to show they have been reviewed and found consistent with these standards. As of today, this review is not nationally required (though some individual states require it), but (speaking for myself) I am hopeful it will eventually be required across the board.

One of the many standards the subcommittee reviews against is that the XML from each eCVI needs to be compliant with the XML schema. The subcommittee only has access to some sample XML which is used for the review. The subcommittee doesn't have access to all coding used by the provider to ferret out if there is a possible way a non-validated XML can get through the cracks.

I think the question comes down to where the responsibility for confirming that XML is valid needs to lie. If it should be validated before being sent (if this is a simple task, as indicated by @rmunger, though I expect other providers may have additional helpful insights if they elect to weigh in), then does this working group have a role in this, or would it fall to the NASAHO subcommittee mentioned above? I think this will be helpful to discuss on our next call.

jconlon commented 2 years ago

I concur with @mkm1879 and @mmcgrath: a document that does not validate is simply not 'a valid document'. It 'Should' be rejected at the receiving end. If the receiver accepts invalid documents and then tries to 'guess' at and fix the document to ease the burden on the sender, it will end up breaking the chain of validity/responsibility/accountability. This practice does more harm to the standard than strict enforcement.

Regarding ease of validation - XML validation is trivial and built into most web and native client front-end and back-end language libraries and tools. There are even many online tools for submitting XML for validation. Here is one: XML Validator
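As a minimal sketch of how little code a first-pass check takes, the Python standard library can confirm that a document is at least well-formed XML. This is only the first step: checking validity against the eCVI XSD needs a schema-aware validator, which Python's standard library does not include (a third-party package such as lxml is one option). The element names in the usage lines are illustrative only.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return the parser's error message, or None if the text is well-formed XML.

    Well-formed is weaker than valid: a document can parse cleanly and still
    violate the schema, so this does not replace XSD validation.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Usage: the second document has an unclosed element and fails to parse.
ok = well_formedness_error("<eCVI><Animal/></eCVI>")
bad = well_formedness_error("<eCVI><Animal></eCVI>")
```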

MichaelJRussell commented 2 years ago

@mmcgrath:

> In the strongest terms I can muster, I agree with @mkm1879 - "if it doesn't validate, it isn't an AAVLD/USAHA standard data file"
>
> Anyone who has the ability to generate XML has the ability to verify that the generated XML is valid against a schema. Recipients of invalid XML should have a clear conscience about simply not accepting it because the sender should have verified it before sending.

I agree completely. I have zero expectation that, if a poorly formed (not conforming to the standard) eCVI exits our system, the receiving system has any obligation to do anything other than immediately reject it.

@StaceySchwabenlander:

> Clearly we all want the schema to be used and the resulting XML files to always validate. How do we get there without scaring away providers or veterinarians transitioning to eCVIs?
>
> I would be curious to hear from some providers in the working group on how difficult it would be to ensure all XML is validated before it is sent.

Great question. Others have pointed out that validating before sending is not a burden, and I agree. Where some complications arise is in the validation of certain data points, which has to take place before the CVI is finalized, not at the time of transmission. However, that's simply a matter of understanding what needs to be validated to the specification; i.e., a date must not only be a valid (real) date, but must also fall within the allowed range as specified in the schema.
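The date example can be sketched as a two-step check: first that the text is a real calendar date, then that it falls inside an allowed range. The bounds are passed in as parameters here because the actual limits come from the schema; the values in the usage lines are placeholders, not the schema's.

```python
from datetime import date


def check_date(text, earliest, latest):
    """Return an error message for an unacceptable date, or None if it passes."""
    try:
        value = date.fromisoformat(text)  # rejects impossible dates such as Feb 30
    except ValueError:
        return "not a real calendar date"
    if not (earliest <= value <= latest):
        return f"outside allowed range {earliest}..{latest}"
    return None


# Usage with placeholder bounds (assumed for illustration only).
lower, upper = date(2000, 1, 1), date(2030, 12, 31)
check_date("2021-07-14", lower, upper)  # passes both steps
check_date("2021-02-30", lower, upper)  # fails the first step
check_date("1899-01-01", lower, upper)  # real date, but out of range
```

Running this kind of check before the CVI is finalized lets the veterinarian correct the value, rather than discovering the problem only when the XML fails validation.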

I think it's an interesting question of whether or not invalid XML should be sent at all, but that feels like a process detail to be worked out between eCVI providers and CVI database vendors. It's not immediately obvious what problem would be solved by not sending a single invalid eCVI versus sending one and having it rejected.

SusanCulpDVM commented 2 years ago

@StaceySchwabenlander this is a new issue since our last meeting, so we will spend some time discussing this today. I am not sure that the eCVI Data Standards Subcommittee can (or should) be the entity to enforce this issue so I look forward to the discussion.

ryanscholzdvm commented 2 years ago

I agree that it is the responsibility of eCVI vendors to ensure that data leaving their systems is valid, but I am concerned about the path that this discussion appears to be taking. It is not within the purview of this committee to determine what should be “rejected”. I think we also run the risk of conflating two things: declining to process a CVI as an eCVI (and instead processing it manually), and rejecting a CVI altogether. I have no problem with a database declining to process the XML of an eCVI and instead requiring that it be processed as a manual/paper CVI, but I do not believe it is for this group to say whether that should or should not be done. Our job is to set the standard, not to decide how it is enforced.

That being said, I think that we are missing the forest for the trees a bit on the larger topic. At the end of the day, even if a data file is not valid, that does not impact the primary purpose of the transmission: a CVI. I know of no legal basis to reject an otherwise valid CVI document just because the data file that is included (which at this point is a luxury, not a requirement) does not validate. Ultimately, the decision of whether to accept a CVI is one between the SAHOs of the issuing and receiving states, based on their state laws and the corresponding federal laws, not one for this group or any vendors. Until there is a legal requirement to transmit CVIs with valid XML data, the most that can be done to remedy a single CVI with invalid data is to process it manually.
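The distinction drawn above, between declining to import the XML and rejecting the CVI itself, can be sketched as a routing decision on the receiving side. This is a hypothetical illustration, not a description of any state's system: the `is_valid` callback stands in for whatever schema validation the receiver uses, and the type and field names are made up.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Route(Enum):
    AUTOMATED_IMPORT = "automated import"
    MANUAL_PROCESSING = "manual processing"


@dataclass
class Decision:
    route: Route
    # Always False in this sketch: invalid XML changes how the CVI is
    # processed, not whether the CVI is accepted. Acceptance is a separate
    # decision for the SAHOs.
    cvi_rejected: bool


def route_ecvi(xml_text: str, is_valid: Callable[[str], bool]) -> Decision:
    """Send valid XML to automated import and everything else to manual handling."""
    if is_valid(xml_text):
        return Decision(Route.AUTOMATED_IMPORT, cvi_rejected=False)
    return Decision(Route.MANUAL_PROCESSING, cvi_rejected=False)
```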

mkm1879 commented 2 years ago

On today's call, it was noted that the data portion of the XML is not the iCVI. That is carried in the renderable (PDF). As long as that PDF is delivered to the state animal health authority, the CVI is valid. But if the PDF is carried in the invalid XML, it hasn't really been delivered. eCVI providers should ensure that, if the XML comes out invalid, there exists some alternative delivery mechanism acceptable to the SAHOs.

SusanCulpDVM commented 1 year ago

This issue was discussed at the November 30, 2022 meeting of the eCVI Data Standards Workgroup. Since this is outside of the scope of this Workgroup, the consensus was to close this issue and refer it to the NASAHO eCVI Approval Committee.