Closed Michiel-s closed 3 years ago
In addition to my post above: the reason why we are having this issue is because we have users that submit very large xml files for validation that can take up very long. Some files are >200MB (uncompressed) and can contain up to 100k of items (e.g. product catalogue items) that need to be validated with 10+ rules.
Hi @Michiel-s I added a quick test (see the previous commit) and my output looks like this, which seems to be okay:
5 [main] INFO com.helger.schematron.supplementary.Issue119Test - Spawn task
35 [main] INFO com.helger.schematron.supplementary.Issue119Test - Wait
35 [pool-1-thread-1] INFO com.helger.schematron.supplementary.Issue119Test - Loading Schematron
65 [pool-1-thread-1] INFO com.helger.schematron.supplementary.Issue119Test - Applying Schematron
1048 [main] INFO com.helger.schematron.supplementary.Issue119Test - Cancel
1048 [main] INFO com.helger.schematron.supplementary.Issue119Test - Done
What Schematron implementation are you using???
I haven't run your test yet, but what we see is that the thread from the pool is not aborted. The main tread indeed reports that it executed the future.cancel
correctly, which send an interrupt signal to the thread. But the validation process running in that thread isn't stopped.
We use com.helger.schematron.xslt.SchematronResourceSCH
and are still on v5.6. Maybe we should first migrate to V6.x and see if that solves the issue already.
Ah okay. You are saying because I quit my application, the other thread is stopped by accident, but when I would be waiting in the main thread I would see, that the other thread still runs. Is that what you're saying?
Ah okay. You are saying because I quit my application, the other thread is stopped by accident, but when I would be waiting in the main thread I would see, that the other thread still runs. Is that what you're saying?
I think so yes.
Okay yes, I can confirm that:
7 [main] INFO com.helger.schematron.supplementary.Issue119Test - Spawn task
38 [main] INFO com.helger.schematron.supplementary.Issue119Test - Wait
38 [pool-1-thread-1] INFO com.helger.schematron.supplementary.Issue119Test - Loading Schematron
68 [pool-1-thread-1] INFO com.helger.schematron.supplementary.Issue119Test - Applying Schematron
1056 [main] INFO com.helger.schematron.supplementary.Issue119Test - Cancel
2298 [pool-1-thread-1] INFO com.helger.schematron.AbstractSchematronResource - Read XML resource [cpPath=/issues/github119/ubl-tc434-example1.xml; urlResolved=true; URL=file:/C:/dev/git/ph-schematron/ph-schematron-xslt/target/test-classes/issues/github119/ubl-tc434-example1.xml]
2357 [pool-1-thread-1] INFO com.helger.schematron.supplementary.Issue119Test - Finished applying Schematron
4060 [main] INFO com.helger.schematron.supplementary.Issue119Test - Done
Effectively that would mean, that more or less all the APIs would need to declare throws InterruptedException
which would be a heavy change and would effect all the regular use cases.
What do you think about an UncheckedInterruptedException
to be less intrusive?
I pushed a 6.2.4-SNAPSHOT
version to Maven Central Snapshots. It would be great if you could verify if it works for you as well. The output of the Issue119Test
is now:
5 [main] INFO com.helger.schematron.supplementary.Issue119Test - Spawn task
35 [main] INFO com.helger.schematron.supplementary.Issue119Test - Wait
35 [pool-1-thread-1] INFO com.helger.schematron.supplementary.Issue119Test - Loading Schematron
64 [pool-1-thread-1] INFO com.helger.schematron.supplementary.Issue119Test - Applying Schematron
1074 [main] INFO com.helger.schematron.supplementary.Issue119Test - Cancel
1792 [pool-1-thread-1] INFO com.helger.schematron.supplementary.Issue119Test - Applying Schematron was successfully interrupted - Interrupted Schematron compilation: after XSLT step 1
4084 [main] INFO com.helger.schematron.supplementary.Issue119Test - Done
We'll do that. I'll let you know
@Michiel-s so how does it look like with your tests? If it is to your needs, I will release 6.2.4 asap
Hi @phax, my colleague implemented and tested the snapshot last Friday and reported back to me that our unit tests indicate that the timeout works. Not sure if that includes the thread killig. He just went on a 2 week holiday break so I can't verify at this moment. But based on his comments, and also on your own observations, I would say let's release. If that is ok with you.
Thx very much for the fix. We really appreciate it. Please let me know if we can return any favor.
@Michiel-s Thanks for the swift response - I will just build a 6.2.4 release build, and if you need interruptions on additional places - let me know. There will be a 6.2.5 or 6.3.0 anyway ;-)
Hi @phax,
Is there a way to cancel or abort a validation? We've created a wrapper application around ph-schematron that can receive validation requests via a REST API.
To run the actual validation steps we execute a Callable task by a CachedThreadPool ExecutorService. We then try to get the response and wait for max 30 sec before returning a server timeout exception. However the task doesn't seem to act upon a Interruption signal. Is this something you could add to the ph-schematron code, to check for
Thread.currentThread().isInterrupted()
at regular intervals? Or do you see any other solution that we could implement?We're currently using v5.6.0 and are preparing to move to v6.x
Relevant parts of our code:
The
future.cancel(true);
in the finally block sends the interrupt signal.I've found this article probably explaining why executed task is not stopped: https://codeahoy.com/java/Cancel-Tasks-In-Executors-Threads/
Kind regards, Michiel