scandihealth / lpr3-docs

https://scandihealth.github.io/lpr3-docs/
MIT License
11 stars 7 forks source link

PRODUCTION: ConstraintViolationException: could not execute batch #322

Open Remcovanderwerf opened 5 years ago

Remcovanderwerf commented 5 years ago

Hi,

Currently we are receiving the folllowing error response from LPR3 production on most of our submissions: javax.persistence.PersistenceException: org.hibernate.exception.ConstraintViolationException: could not execute batch

<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
    <soap:Header>
        <Action xmlns="http://www.w3.org/2005/08/addressing">urn:ihe:iti:xds-b:2007:DocumentRepository_PortType:DocumentRepository_ProvideAndRegisterDocumentSet-b:Fault:EJBException</Action>
        <MessageID xmlns="http://www.w3.org/2005/08/addressing">urn:uuid:6502cd60-559a-4d19-b272-eb72f03ab850</MessageID>
        <To xmlns="http://www.w3.org/2005/08/addressing">http://www.w3.org/2005/08/addressing/anonymous</To>
        <RelatesTo xmlns="http://www.w3.org/2005/08/addressing">urn:uuid:cb5c1c50-a5fc-4171-a8bd-3fdb1f17e181</RelatesTo>
    </soap:Header>
    <soap:Body>
        <soap:Fault>
            <soap:Code>
                <soap:Value>soap:Receiver</soap:Value>
            </soap:Code>
            <soap:Reason>
                <soap:Text xml:lang="en">javax.persistence.PersistenceException: org.hibernate.exception.ConstraintViolationException: could not execute batch</soap:Text>
            </soap:Reason>
        </soap:Fault>
    </soap:Body>
</soap:Envelope>

Possibly related to #318 and #306 but not confirmed.

I send an email to tnorgard@dxc.com with some example CDA's (anonymized) for investigation. submission time(s): 08-03-2019 (resending every minute) endpoint: PatientHealthcareReportingService

Please pickup with priority since it is witholding our production submissions.

Kind regards,

Remco van der Werf Enovation, the Netherlands Contractor: Region Syddanmark

TueCN commented 5 years ago

I have reproduced the error from the provided CDAs and I am investigating the cause.

Remcovanderwerf commented 5 years ago

Hi @TueCN ,

any news regarding the above? We would like to continue our production submissions.

TueCN commented 5 years ago

Yes I have identified a scenario that causes the error:

  1. Send a submission with an episode of care, encounter and procedure
  2. Send a "set replacement" (<relatedDocument typeCode="RPLC">) submission that deletes the encounter and moves the procedure somewhere else (for example to the eoc).

I am working on a fix.

binntho commented 5 years ago

I'm getting precisely the same fault, since Thursday March 7. The scenario is fairly simple: Submission of an episode of care with three procedures linked with three encounters. I can email anonymized examples if requested.

TueCN commented 5 years ago

Hi @binntho, please read our guidelines on CONTRIBUTING.md#identify-your-reporting-authority.

If you have an example that reproduces the issue which does not involve a "set replacement", then yes please send it to me on tnorgard@dxc.com

binntho commented 5 years ago

Sorry, I'm working for Aver og Lauritzen ApS, journal system GANGLION.

TueCN commented 5 years ago

1. During investigation of this bug, several related scenarios has also been identified to cause the SOAP fault: javax.persistence.PersistenceException: org.hibernate.exception.ConstraintViolationException: could not execute batch. All scenarios (so far) involves deleting X via "set replacement" and moving its Y to Z, which means:

  1. X, Y and (possibly) Z exists in the registry
  2. Send a "set replacement" (<relatedDocument typeCode="RPLC">) submission that does not contain X, but contains Y and Z, where Y now belongs to Z instead of X.

Examples:

2. Several other bugs (same root cause) were also discovered during this investigation:

Examples:

A fix that eliminates the root cause is being iterated upon, but currently our regression tests show new consequential errors from this fix, so development is still ongoing.

We are currently also in the process of writing more systematic tests to cover all possible related scenarios which will also be part of the regression testing.

binntho commented 5 years ago

Hello, I do not agree with the claim that "All scenarios (so far) involves deleting X via "set replacement" and moving its Y to Z" - I have been attempting to report episodes of care for several patients, an example of which has been sent to TueCN. All my attempts on behalf of a psychiatric clinic result in the same error "javax.persistence.PersistenceException: org.hibernate.exception.ConstraintViolationException: could not execute batch"

None of my submissions involve replacing or moving anything. I have successfully made submissions on behalf of another client (eye hospital), but not a single one on behalf of the psychiatric clinic.

I've also tried submitting on a patient that has never had a submission in LPR2 (some of the other patients may have had such submissions) and still I'm recieving the same error. My client is getting rather worried that she will not be able to make submissions.

I've sent another example to TueCN via email.

I work for Aver og Lauritzen, with the patient journal system GANGLION.

TueCN commented 5 years ago

@binntho I have not had a chance to look at your example yet as the main focus currently is getting Region Syddanmark back on track with their submissions to production.

The example scenario sent to me by OP (original poster, i.e. @Remcovanderwerf) involved set replacement and that is why the investigation has so far been primarily into that type of scenarios.

As I have stated on https://github.com/scandihealth/lpr3-docs/issues/318#issuecomment-470853015, the could not execute batch error is a category of errors, not a specific error. So it is possible your scenario leads to a different bug than the one originally reported on this issue.

@binntho please create a new issue for your scenario description and resulting error, as the bug involving set replacement has to be hotfixed in production ASAP (and thus I will likely not have time to also investigate your scenario before we push the fix to production).

TueCN commented 5 years ago

We are currently considering whether reusing an II (root/extension pair) of a deleted/to be deleted element after/during document replacement or nullify act submission, should be allowed or prevented in general.

@Remcovanderwerf is it intentional you use this strategy during document replacement? If you generate a new extension for the new procedure element you should not face this issue. However this means that the two procedures will be completely unrelated of course.

Do you reuse the extension because you expect the history of the deleted procedure and new procedure to be linked? This can be accomplished by instead replacing the procedure in a "document append" submission. The append should contain the new version of the procedure and a replacement reference to the existing procedure.

DaveSS commented 5 years ago

To chime in with input from a different organization, we definitely make use of both the “document replacement” and “document append with nullify acts” as part of our overall LPR3 submission strategy. In all cases, we expect to use the same ID for an instance of a concept over its life cycle; whether that information was submitted to LPR, revised, or changed to reference a new concept should not determine the ID. For example, if we have an encounter for a patient, there is a single ID to represent that encounter which is sent in the LPR3 document. Whether we update information on that encounter in a “document append” with a replacement reference to the encounter or we just use “document replace” to send the new updates for the encounter information shouldn’t affect the ID of the encounter. This would be a large breaking change to the document submission methods that have been tested and used over the past year and it is alarming that this is being proposed for a system currently in production use.

jonigkeit commented 5 years ago

@DaveSS I can assure you that the semantics of the interface are not about to be changed.

jonigkeit commented 5 years ago

@DaveSS I can assure you that the semantics of the interface are not about to be changed.

TueCN commented 5 years ago

Thank you for the input Dave. I am not sure what you mean by "whether that information was [...] changed to reference a new concept should not determine the ID". LPR cannot know what you regard as an instance of a concept on your side. LPR can only act on the information provided and guarantee that the ID you provide will be the unique ID for that document element throughout its lifecycle in LPR. The question is, can you call it the same element when you are not using a replacement reference, but instead deleting it and recreating it? Is it not reasonable to say its lifecycle ended when it was deleted? If yes, then another question is whether it should be allowed to create a new element with the same ID as another deleted element has, i.e. reusing a dead ID.

In any case, the problem right now is that "document replacement" functionality is inconsistent. Sometimes it allows reuse of ID and sometimes it does not. It seems to work fine when the replaced element is "reused" without changing any references to other elements, but instead just has its own information updated. It seems to go horribly wrong when the "reused" element is at the same time updated to belong to a different parent element (see my examples from https://github.com/scandihealth/lpr3-docs/issues/322#issuecomment-472645718) .

This would be a large breaking change to the document submission methods that have been tested and used over the past year

Yes preventing "reuse" of IDs would be a large breaking change, and if the decision is to not support it, then maybe a better solution would be to just leave the functionality as it is (with the current behavior in effect). I think the current behavior has been in place in test for a very long time. It must somehow be a non-issue for you then.

The situation is that LPR was not developed with this reuse in mind.

For example, currently deletion of an episode of care is implemented as a "soft delete" which means that the EoC itself and all its sub elements (e.g. encounters, those encounters' condition observations, those condition observation's observation organizers, etc.) are preserved in our "active" dataset and just marked as deleted. Deletion of any other element is implemented as a "hard delete" which moves the element to a separate "history" dataset. The "history" dataset is only ever accessed if a previous version of an EoC is displayed in the GUI. This difference was made because it was a requirement that users can search for and view deleted EoCs in the GUI, and we anticipated that most users of this function would only want to view the latest version of the deleted EoC. By keeping the deleted EoC and its subelements in the "active" dataset it is much faster to search for and view.

Since we use the extension as a unique ID, it is not possible for us to at the same time have e.g. an encounter that belongs to a "soft deleted" EoC and another encounter with the same ID that belong to another EoC.

TL;DR: We can currently only support a limited reuse of IDs. We have already partly implemented a hotfix that increases the number of supported reuse scenarios, but we cannot reach 100% scenario coverage without having to rework large parts of the system.

EDIT Typing makes you think. After writing out this post I realized that the problem with reusing ids from elements of a "soft deleted" EoC is maybe not a problem. One could consider it acceptable that the reused element is brought back to life and removed from the dead EoC (if one is fine with updating a deleted EoC). It sounds like everybody wants to reuse IDs and I will continue to chip away at the problem.

TueCN commented 5 years ago

We are currently working on a solution that will consistently allow moving an element from one parent to another during document replacement.

Note: This means that you, as currently, cannot nullify an EoC ID ("appending the binder") and then in a later append message create an EoC with the nullified ID.


Note: The following inconsistencies has been found, which are deemed out of scope for this particular issue:

  1. It is possible to create an element with a previously deleted ID, except when that ID is for a deleted EoC or one of the deleted EoC's subelements, regardless whether the EoC was explicitly nullified or implicitly removed via document replacement
    • It should probably not be possible to create elements with a previously deleted ID , except after a document replacement ("empty the binder") where it then should be possible to use any previously deleted IDs (even EoC IDs)
  2. It is possible to replace (update) elements that was previously deleted, if they were not explicitly deleted through a nullify act for that particular element. That is, elements indirectly deleted through an EoC deletion can be replaced (which updates the deleted EoC)
    • It should never be possible to use a replacement reference to elements that has previously been deleted
  3. It is not possible in an append message to move an element from a parent being nullified in that same message, to another parent, using a replacement reference, except when the parent being nullified is an EoC.
    • It should always be possible to use a replacement reference to move an element before it is implicitly deleted by a nullify act.

The intended behavior of these inconsistencies may be included in the solution, but only those that are not imposing more restrictions than already exists. If anyone has objections to the proposed intended behavior they should speak up now.

The above uses the "binder" metaphor, where a CDA set is seen as a binder (ringbind) that can have pages (CDA documents) inserted on top of existing pages, or the binder can be emptied of all pages.

Remcovanderwerf commented 5 years ago

We are currently considering whether reusing an II (root/extension pair) of a deleted/to be deleted element after/during document replacement or nullify act submission, should be allowed or prevented in general.

@Remcovanderwerf is it intentional you use this strategy during document replacement? If you generate a new extension for the new procedure element you should not face this issue. However this means that the two procedures will be completely unrelated of course.

Do you reuse the extension because you expect the history of the deleted procedure and new procedure to be linked? This can be accomplished by instead replacing the procedure in a "document append" submission. The append should contain the new version of the procedure and a replacement reference to the existing procedure.

Hi @TueCN,

my apologies for the delay in my answer. To answer the question above (I had to ask the region myself); "We do not reuse element id's. When we send the same id for a procedure is it because it’s the same procedure, even though it might have been rearranged. (moved from an encounter to a EoC)"

Does this answer your question?

Anyhow; the implementation strategy of Region Syddanmark is only replacing full EoC's. Not appending, not nullifying.

DaveSS commented 5 years ago

We, like others, are currently encountering issues when removing and/or resending elements that we did not expect based on the LPR3 CDA documentation and workflows we tested. However, GitHub does not seem the appropriate venue for proposing and vetting major changes to the basic behavior of the database receiving LPR3 documents. With regards to the comments you made above, we would like to express our understanding of how we should be able to interact with the database according to the current published documentation:

  1. When removing an element from DXC, it does not necessarily mean that the LPR3 object represented by that element was deleted in the EMR. The UUIDs documented in the LPR3 specifications are tied to LPR3 objects (for example, a specific kontakt that occurred for a patient) so removing an element with that UUID from DXC doesn’t change the fact that the object (kontakt) with that UUID may still exist in the EMR. We could have submitted an element to DXC, and some time later, a user decided that it was not reportable, so we need to remove that element from DXC. If the reportability of that element had been changed in error or if it was later determined to be reportable, we would expect to be able to re-send that element with the same UUID (because it is the same LPR3 object). The fact that we previously sent that element and removed it from DXC is irrelevant; the element is still identified with the same UUID. Since DXC is a repository for reporting LPR3 information, we would expect that it would support LPR3 elements to be sent, removed, and resent without requiring a system reporting the data to artificially delete and recreate clinical information in order to satisfy the database design of the repository.
  2. How we submit/remove that element to/from DXC should not change the behavior of future submissions/removals of an element. DXC provided documentation on multiple methods that sending systems could use to report data but did not state that the method used to remove an element will affect how the database processes that removal. The idea that removing an element using the nullify functionality (which is the method available in the append methodology), works differently than removing that same element by excluding it from a new document submission (which is the method available in the replace methodology or by “emptying the binder”) would be a major undocumented change to how sending systems need to interact with the DXC system.
  3. The type of element submitted/removed to/from DXC should not factor into the behavior of future submissions/removals of that element. Making the treatment of EoC elements different from the treatment of other element types is not indicated anywhere in documentation would affect how systems interacting with DXC are designed.

Based on these general expectations, the intended behavior detailed in points 2 and 3 of post https://github.com/scandihealth/lpr3-docs/issues/322#issuecomment-473267136 seem to fall in line with our expectations, but the intended behavior of point 1 seems very worrisome. DXC is not the originator of the LPR3 data, and it seems like DXC should accept valid LPR3 data being submitted according to the published standard. The ID of an element is intrinsic to the LPR3 object, not the DXC database model.

TueCN commented 5 years ago

@Remcovanderwerf yes it answers my question. You consider the elements to be logically the same (like @DaveSS does). As stated earlier, we are working on a solution that will consistently allow moving an element from one parent to another during document replacement.

TueCN commented 5 years ago

@DaveSS I can not think of a better forum to put ideas forward and solicit feedback. I am very much for an open, honest, and transparent discussion of problems and solutions. Maybe no major game changing decision should happen based on GitHub comments, but ideas can be formed and taken to other fora for vetting.

The following is my remarks to your bullet points:

  1. I can see where you are coming from, and understand your scenario. Technically your scenario can be fulfilled using document replacement to wipe out previous nullify acts.
  2. I think the problem is that we in LPR regrettably have not precisely and definitively documented how nullify/document replacement will work. At this point in the integration project, different parties may have different expectations because of the lack of documentation. However, I believe the current (inconsistent) functionality has been in effect for a long time on the test environment so I do not think it is fair to lay the blame for this schism solely with LPR.
  3. Agreed. The intention is to completely remove this inconsistency, but as stated earlier, it will require us to rework significant parts of the system and therefore that fix may not be included in our first coming hotfix.

Regarding your last remarks:

the intended behavior of point 1 seems very worrisome. [...] DXC should accept valid LPR3 data being submitted according to the published standard

The problem is that (to my knowledge) there is no published standard. Even when accepting that the ID is intrinsic to the logical object and not the registry object, I see no description of how nullify should work. One could argue, using the "binder" metaphor, that a nullify act must be respected by later addendum documents, until such a time when the binder is emptied.

The only documentation I can find regarding the semantics of nullify is found on https://scandihealth.github.io/lpr3-docs/aspects/index.html#rules-for-updating-documents:

Once an entry is nullified, it is permanently removed.

And then it is up for discussion what permanently means. Ultimately it is up to SDS to decide whether such an enhancement as you suggest should be implemented.

However, based on my current understanding of the technical limitations in LPR to change this behavior, I do not believe it will take considerably more effort to implement your suggested change. I will need time to discuss this issue internally with my team to see if we can think of any unintended consequences of such a change.

TueCN commented 5 years ago

@DaveSS I have had a chance to discuss your general expectations and remarks to bullet 1 with the team.

Regarding your point 1: The functionality you call for, (that it should be possible to report an element with the ID of a previously deleted element), would require an enhancement of the service. Such an enhancement request has already previously been raised with SDS by other parties (Region Syddanmark, I believe), and you are free raise it as well.

LPR does not support "undo deleting" a previous explicitly or implicitly nullified reporting element, regardless of the method of deletion (nullify act or document replacement). This conclusion also addresses your concerns of divergence stated in your point 2.

Regarding your point 3: There may currently be some situations where it is possible to create an element with the same ID as one that has previously been deleted, but those are unintended and technically bugs. However these bugs do not seem to have negative effects on the system and will be left in place, for now.


This solution being worked on for @Remcovanderwerf's issue will fix the errors happening when "moving" a reporting element

ThorkildFriis commented 5 years ago

We have seen a similar error, except that the error message is "Transaction Rollback" We are investigating whether the sequence of submitting is the same, or if it is covered by the cases that were mentioned earlier in this thread. We handled it for now by extending @Remcovanderwerf 's solution to also cover this error and remove the messages with this error as well.

TueCN commented 5 years ago

RESOLUTION

To assist you in testing and structuring your test plans, the following is an excerpt of internal DXC test cases that were developed as part of testing this solution. The names should be self explanatory.

TommyRegSyd commented 5 years ago

Working (in Region Syddanmark) on testing the update addressing this issue, we encountered the following (the two involved files sent to tnorgard@dxc.com):

The scenario is "deletion of an encounter, while moving a procedure from the encounter to the episode of care".

In version 1, the procedure CE68CA74-5240-472B-9287-485CE8EED345 references the encounter 7F3CC3BC-5F45-4DA2-928C-B40F4FF8729D.

Result: A reponse file with a business error is received.

In version 2, the encounter is not present but the procedure still is, and now references the containing episode of care.

Result: No reponse file is received and a soap error similar to the original happens. The error text in this case says: "...ConstraintViolationException: could not execute statement" instead of "...ConstraintViolationException: could not execute batch"

We have tested other similar cases where the result was as expected: The encounter is deleted and the procedure moved to the episode of care. But for some reason not in this case.

madswillads commented 5 years ago

We (Sundhedsplatformen) have tested the change in TEST and are experiencing these two blocking errors on all submissions:

april 1. 14:32: soap:Server>org.hibernate.PropertyAccessException: Could not set field value [Mand] value by reflection : [class com.dxc.lpr.ejb3.persist.lpr3.ForLoebsElementEntity.patientKoen] setter of com.dxc.lpr.ejb3.persist.lpr3ForloebsElementEntity.patientKoen

april 1. 23:03: A prior Document (DXC) record contact could not be marked successful because it was already marked as successful Document ID 25507 Document Version a57b6456-54c1-11e9-8ef9-a1058ff62db0

error1 error2

TueCN commented 5 years ago

Hi @madswillads

The error org.hibernate.PropertyAccessException is caused by #285 (note: it is only appearing in TEST)

The error text A prior document (DXS) record contact could not be marked successful because it was already marked as successful is coming from Sundhedsplatformen, not LPR. I don't really know what is meant by the text, and I cannot see what could case a Fatal error from your screenshot alone. Maybe it is saying that the document has already been submitted? Could you please double check that your test case to see if you are submitting the same document twice? Also please provide the input XML needed to reproduce this error.

madswillads commented 5 years ago

OK thanks The prior document..... error might come from a document versioning error we are investigating elsewhere

TommyRegSyd commented 5 years ago

Working (in Region Syddanmark) on testing the update addressing this issue, we encountered the following (the two involved files sent to tnorgard@dxc.com):

The scenario is "deletion of an encounter, while moving a procedure from the encounter to the episode of care".

In version 1, the procedure CE68CA74-5240-472B-9287-485CE8EED345 references the encounter 7F3CC3BC-5F45-4DA2-928C-B40F4FF8729D.

Result: A reponse file with a business error is received.

In version 2, the encounter is not present but the procedure still is, and now references the containing episode of care.

Result: No reponse file is received and a soap error similar to the original happens. The error text in this case says: "...ConstraintViolationException: could not execute statement" instead of "...ConstraintViolationException: could not execute batch"

We have tested other similar cases where the result was as expected: The encounter is deleted and the procedure moved to the episode of care. But for some reason not in this case.

Are there any news on this issue?

We haven't encountered the SOAP error in the test-environment for any new scenarios since the above, but we also haven't been able to test extensively.

We would like to know if a new update is coming before the current test-version is deployed to production and if there is a timeline for this.

TueCN commented 5 years ago

Hi Tommy

I have tested your scenario using the provided XML and is able to reproduce the ConstraintViolationException.

This exception occurs in in some cases during document replacement when "moving" an element that has validation issues and simultaneously deleting the element's previous parent.

Thank you for the findings! I will work on this right away. Hopefully it can be resolved quickly and a new version deployed to TEST.

TueCN commented 5 years ago

UPDATED RESOLUTION Fixed several scenarios involving "moving" an element that has validation issues and simultaneously deleting the element's previous parent are now working as intended.

However, there are still 3 (fairly complex) scenarios that each result in a SOAP fault:

A resolution for these errors has not been found yet, and reporting authorities are encouraged to use a work around: Split the single complex submission into 2 separate submissions:

  1. "move" the element
  2. delete the old parent
TueCN commented 5 years ago

Updated resolution has been deployed to TEST