Closed JimmyD closed 1 year ago
With further testing in our test environment, we found that upgrading the Saxon dependency to 10.6 reduced validation time considerably (~15 min => ~11 min). However, validation time still follows the same exponential curve.
It is not clear what you are comparing. Is this related to #57?
My apologies. Providing invoice line example below. I'm unsure if this is related to the issue in your response. What qualifies as a UBL extension? I'm a far cry from an expert on these matters. Appreciate any and all help on the matter.
<cac:InvoiceLine>
  <cbc:ID>1</cbc:ID>
  <cbc:Note>"ID":"00262194883", "Freight_Weight":"10.36", "Weight":"3.65", "Volume":"0.037", "Sending_Country":"SE", "Sending_Postcode":"70227", "Receiving_Country":"SE", "Receiving_Postcode":"81130", "Length":"0.6", "Width":"0.41", "Height":"0.15"</cbc:Note>
  <cbc:InvoicedQuantity unitCode="EA">1</cbc:InvoicedQuantity>
  <cbc:LineExtensionAmount currencyID="SEK">21.49</cbc:LineExtensionAmount>
  <cac:InvoicePeriod>
    <cbc:StartDate>2021-08-24</cbc:StartDate>
    <cbc:EndDate>2021-08-24</cbc:EndDate>
  </cac:InvoicePeriod>
  <cac:Item>
    <cbc:Name>COMPANY Parcel</cbc:Name>
    <cac:SellersItemIdentification>
      <cbc:ID>2007300</cbc:ID>
    </cac:SellersItemIdentification>
    <cac:ClassifiedTaxCategory>
      <cbc:ID>S</cbc:ID>
      <cbc:Percent>25</cbc:Percent>
      <cac:TaxScheme>
        <cbc:ID>VAT</cbc:ID>
      </cac:TaxScheme>
    </cac:ClassifiedTaxCategory>
  </cac:Item>
  <cac:Price>
    <cbc:PriceAmount currencyID="SEK">21.49</cbc:PriceAmount>
  </cac:Price>
</cac:InvoiceLine>
<cac:InvoiceLine>
  <cbc:ID>2</cbc:ID>
  <cbc:Note>"ID":"00262194883"</cbc:Note>
  <cbc:InvoicedQuantity unitCode="EA">1</cbc:InvoicedQuantity>
  <cbc:LineExtensionAmount currencyID="SEK">0</cbc:LineExtensionAmount>
  <cac:InvoicePeriod>
    <cbc:StartDate>2021-08-24</cbc:StartDate>
    <cbc:EndDate>2021-08-24</cbc:EndDate>
  </cac:InvoicePeriod>
  <cac:Item>
    <cbc:Name>Utökad hantering 14 dagar</cbc:Name>
    <cac:SellersItemIdentification>
      <cbc:ID>2008801</cbc:ID>
    </cac:SellersItemIdentification>
    <cac:ClassifiedTaxCategory>
      <cbc:ID>S</cbc:ID>
      <cbc:Percent>25</cbc:Percent>
      <cac:TaxScheme>
        <cbc:ID>VAT</cbc:ID>
      </cac:TaxScheme>
    </cac:ClassifiedTaxCategory>
  </cac:Item>
  <cac:Price>
    <cbc:PriceAmount currencyID="SEK">0</cbc:PriceAmount>
  </cac:Price>
</cac:InvoiceLine>
<cac:InvoiceLine>
  <cbc:ID>3</cbc:ID>
  <cbc:Note>"ID":"00262194883"</cbc:Note>
  <cbc:InvoicedQuantity unitCode="EA">1</cbc:InvoicedQuantity>
  <cbc:LineExtensionAmount currencyID="SEK">0</cbc:LineExtensionAmount>
  <cac:InvoicePeriod>
    <cbc:StartDate>2021-08-24</cbc:StartDate>
    <cbc:EndDate>2021-08-24</cbc:EndDate>
  </cac:InvoicePeriod>
  <cac:Item>
    <cbc:Name>Tidsbestämd leverans</cbc:Name>
    <cac:SellersItemIdentification>
      <cbc:ID>2003929</cbc:ID>
    </cac:SellersItemIdentification>
    <cac:ClassifiedTaxCategory>
      <cbc:ID>S</cbc:ID>
      <cbc:Percent>25</cbc:Percent>
      <cac:TaxScheme>
        <cbc:ID>VAT</cbc:ID>
      </cac:TaxScheme>
    </cac:ClassifiedTaxCategory>
  </cac:Item>
  <cac:Price>
    <cbc:PriceAmount currencyID="SEK">0</cbc:PriceAmount>
  </cac:Price>
</cac:InvoiceLine>
@JimmyD Are your tests using master or the latest release? Does it make a difference? I have previously run some undocumented performance tests for #57 without noticing any negative impact.
I'm using the latest release for the tests. Is there a new release planned anytime soon?
Your issue here:
We have an implementation of the VEFA Validator in our Java application (OpenJDK 8). Here's our issue: validation takes considerably longer for files of 20 MB and larger, and we have observed waiting times increasing exponentially with the number of invoice lines (PEPPOL business document). An 80 MB document took approx. 15 minutes to validate. In comparison, an 11 MB file took 7 seconds. The VEFA Validator depends on the Saxonica Saxon XSLT processor, where the transformation is done. However, running Saxonica's standalone processor with a profiler (with the appropriate XSL stylesheet) takes less than a minute to complete.
Has this been observed (is this a known limitation) by the good people at Anskaffelser VEFA/Difi? Is it possible to configure the Validator to handle files of this size better? Is VEFA Validator really the culprit or am I barking up the wrong tree?
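To separate the cost of the XSLT transformation from the rest of the validator, the transformation can be timed on its own from Java. The following is only a minimal sketch using the standard JAXP API (`javax.xml.transform`); the class name `TransformTimer` and its method are made up for illustration and are not part of the VEFA Validator. `TransformerFactory.newInstance()` picks up whichever XSLT processor is on the classpath, so Saxon would have to be present for stylesheets that need XSLT 2.0, which is assumed here to be the case for the PEPPOL rules.

```java
import java.io.File;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TransformTimer {

    /**
     * Compiles the stylesheet, runs it against the document and returns the
     * elapsed time of the transformation step only, in milliseconds.
     * Stylesheet compilation is deliberately excluded from the measurement.
     */
    public static long timeTransformMillis(StreamSource stylesheet,
                                           StreamSource document,
                                           StreamResult output) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Templates templates = factory.newTemplates(stylesheet);
            Transformer transformer = templates.newTransformer();

            long start = System.nanoTime();
            transformer.transform(document, output);
            return (System.nanoTime() - start) / 1_000_000;
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // args[0]: path to the XSL file, args[1]: path to the invoice XML,
        // args[2]: path of the report file to write
        long millis = timeTransformMillis(
                new StreamSource(new File(args[0])),
                new StreamSource(new File(args[1])),
                new StreamResult(new File(args[2])));
        System.out.println("Transformation took " + millis + " ms");
    }
}
```

If this number stays small while the full validation takes minutes, the time is probably spent outside the transformation itself.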
Tests
Specs:
CPU: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz
RAM: 16 GiB
OS: Ubuntu 21.04 64-bit
Running one instance of IntelliJ.
Time per invoice line in ms, calculated as (m×60000 + s×1000) ÷ l, where l is the number of invoice lines (memory usage: ~60%):
10,198 l => 12s => 1.176701314 ms
10,216 l => 14s => 1.370399374 ms
15,043 l => 38s => 2.52609187 ms
20,400 l => 1m 13s => 3.578431373 ms
28,450 l => 1m 53s => 4.217926186 ms
28,450 l => 2m 6s => 4.428822496 ms
39,000 l => 4m 52s => 7.487179487 ms
54,000 l => 8m 40s => 9.62962963 ms
65,000 l => 11m 38s => 10.738461538 ms
79,691 l => 14m 40s => 11.042652244 ms
79,691 l => 15m 24s => 11.594784857 ms
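The per-line figures above can be reproduced with a few lines of Java, and the same measurements give a rough idea of how validation time scales. This is only a sketch: the class `ValidationScaling` and its methods are hypothetical helpers, and the scaling estimate assumes time grows roughly as lines^k between two measurements. An exponent near 1 would mean linear growth, near 2 quadratic growth; exponential growth would show up as an exponent that keeps rising as larger files are compared.

```java
public class ValidationScaling {

    /** Average validation time per invoice line in milliseconds: (m*60000 + s*1000) / l. */
    public static double msPerLine(int minutes, int seconds, int lines) {
        return (minutes * 60_000.0 + seconds * 1_000.0) / lines;
    }

    /**
     * Estimates k in "time ~ lines^k" from two measurements,
     * using k = log(t2 / t1) / log(n2 / n1).
     */
    public static double scalingExponent(double lines1, double seconds1,
                                         double lines2, double seconds2) {
        return Math.log(seconds2 / seconds1) / Math.log(lines2 / lines1);
    }

    public static void main(String[] args) {
        // Smallest and largest measurement from the list above.
        System.out.printf("%.3f ms per line%n", msPerLine(0, 12, 10_198));
        System.out.printf("%.3f ms per line%n", msPerLine(14, 40, 79_691));
        System.out.printf("estimated exponent: %.2f%n",
                scalingExponent(10_198, 12, 79_691, 14 * 60 + 40));
    }
}
```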
Actions that have been taken - Result