kbss-cvut / termit

An advanced SKOS terminology manager linking concepts to their definitions in documents
GNU General Public License v3.0

Performance of vocabulary validation #287

Closed ledsoft closed 1 month ago

ledsoft commented 2 months ago

When validating a moderately sized vocabulary (ca 400 terms) with a flat structure, validation often times out on the proxy after 60 seconds. This is unacceptable.

ledsoft commented 2 months ago

Update: the validated vocabulary imports another vocabulary containing ca 660 terms, so the total number of terms validated is around 1000.

ledsoft commented 2 months ago

The TTL export of the relevant vocabularies is approximately 1.4 MB. This is what the Validator works with: it exports the vocabulary contexts into a byte stream and imports that stream into a Jena in-memory model.

ledsoft commented 2 months ago

Did a quick comparison with termit-dev, which does not run in Docker and is thus able to utilize all the CPUs on the kbss server. It validated a vocabulary with 1800 terms in 22 seconds (note that the GraphDB instance it uses is on a different server). Will need to test it locally in Docker so that CPU availability can be tuned for the test.
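
For the local Docker test, a small probe along these lines might help confirm how many CPUs the JVM actually sees inside the container and how long a CPU-bound task takes under that limit. This is only a sketch: the class name `CpuProbe` and the placeholder workload are hypothetical and not part of TermIt, and the real validation call would have to be substituted for the workload.

```java
import java.util.concurrent.TimeUnit;
import java.util.stream.LongStream;

// Hypothetical helper for the experiment, not part of TermIt.
class CpuProbe {

    /** Number of processors the JVM reports as available to it. */
    static int visibleCpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Placeholder CPU-bound workload (assumption); replace with the validation under test.
        Runnable workload = () ->
                LongStream.range(0, 50_000_000L).parallel().map(i -> i * i % 7).sum();

        System.out.printf("CPUs visible to JVM: %d%n", visibleCpus());
        System.out.printf("Workload took %d ms%n", timeMillis(workload));
    }
}
```

Running the same class in containers started with different values of Docker's `--cpus` option should show whether the elapsed time scales with the CPU allowance. The reported CPU count follows the container limit only on container-aware JVMs, so it is worth checking the first line of output before trusting the timings.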