owlcs / owlapi

OWL API main repository

Non-deterministic sorting of triple objects when serializing to TTL #1061

netfl0 closed 2 years ago

netfl0 commented 2 years ago

When a subject has multiple objects for the same predicate, owlapi appears to sort (or in some cases combine) the objects non-deterministically. This introduces spurious changes to a TTL file, which makes diffs hard to read.

I noticed this when saving one small change to our ontology with Protege. In that diff, the only lines that should have changed are 19452-19453.

I assume the cause could be either owlapi itself or the way it is invoked by Protege (desktop). If there are any workarounds, we'd be grateful to hear about them.

Thank you for these excellent projects.

hack-sentinel commented 2 years ago

We've added ttlser to our workflow for the D3FEND project and now use that tool to get stable serialization for diff comparisons.

ignazio1977 commented 2 years ago

Do you know how the file was created before being edited with Protege? I.e., with the same Protege version, another Protege version, or another tool?

We've done a lot of work to make the order of outputted items consistent, but Protege is a bit behind on OWLAPI versions and that work spanned a number of versions, so there might be issues due to some bug fixes being included and others being in the future, from Protege's view. Also, a file that wasn't ordered according to the same criteria (for example, one created with other tools) would be changed a lot in the first manipulation through Protege. Unfortunately that last problem cannot be avoided - it would require rewriting all parsers to be able to also keep track of where axioms appeared in the original input.
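The effect of a consistent output order can be illustrated with a small sketch. It is not OWLAPI code (OWLAPI is a Java library, and its renderer is considerably more involved); it only assumes that triples are plain string tuples and that sorting subjects, predicates and objects lexicographically is an acceptable ordering criterion. The name `serialize_sorted` is hypothetical.

```python
from collections import defaultdict


def serialize_sorted(triples):
    """Group triples by subject and predicate, then emit Turtle-like text in sorted order."""
    grouped = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        grouped[subject][predicate].add(obj)

    blocks = []
    for subject in sorted(grouped):
        predicates = grouped[subject]
        parts = []
        for predicate in sorted(predicates):
            # Sorting the objects is what keeps repeated saves byte-identical.
            objects = " ,\n        ".join(sorted(predicates[predicate]))
            parts.append(f"    {predicate} {objects}")
        blocks.append(f"{subject}\n" + " ;\n".join(parts) + " .")
    return "\n\n".join(blocks) + "\n"
```

Because every level is sorted, the output depends only on the set of triples and not on the order in which they were read. It also shows why a file written by a tool with different ordering criteria changes heavily on its first save: the original positions are not recorded anywhere.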

netfl0 commented 2 years ago

Yes, it was created and saved with the latest version of Protege, 5.5.0.

ignazio1977 commented 2 years ago

I checked the d3fend file and I could replicate the issue with OWLAPI 4.5.9, which is the one included in Protege. With OWLAPI 4.5.20 the random output stops - that version includes a bug fix for the ordering of literals, which backports the sorting method implemented in OWLAPI 5. However, while testing that, I found that the bug fix for #640 had been missed and was never applied in version 4.
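To illustrate why literals need an explicit, total ordering, here is a minimal sketch. It does not reproduce the comparator used in OWLAPI 5; the `Literal` type, its fields and the key order (lexical form, then datatype, then language tag) are assumptions chosen for the example, and the default datatype is a simplification.

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """Hypothetical stand-in for an RDF literal."""
    lexical: str
    datatype: str = "xsd:string"
    lang: Optional[str] = None


def literal_key(literal):
    """Total ordering key: ties on the lexical form are broken by datatype, then language."""
    return (literal.lexical, literal.datatype, literal.lang or "")


def sort_literals(literals):
    """Return the literals in an order that does not depend on input order."""
    return sorted(literals, key=literal_key)
```

If the key compared only the lexical form, two literals such as `"a"` and `"a"@en` would tie, and their relative order would then depend on the order in which they happened to be iterated, which is the kind of instability described in this issue.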

The next release will fix both problems; you'll be able to update by dropping the OWLAPI osgidistribution artifact into your Protege bundles folder, or you could build Protege locally and update the dependency.

Unfortunately Protege is lagging behind the latest OWLAPI builds a bit. I don't know their plans for an update.

ignazio1977 commented 2 years ago

4.5.22 released, FYI

netfl0 commented 2 years ago

Thank you for your excellent responses.