stardog-union / pellet

Pellet is an OWL 2 reasoner in Java; open source (AGPL) and commercially licensed, commercial support available.
http://clarkparsia.com/pellet

Numeric Java type not always matching XSD type when creating a Literal #11

Closed tobias-hammerschmidt closed 9 years ago

tobias-hammerschmidt commented 9 years ago

When a Literal is instantiated using the constructor org.mindswap.pellet.Literal.Literal(ATermAppl, ATermAppl, ABox, DependencySet), it will initialize its value by calling com.clarkparsia.pellet.datatypes.DatatypeReasoner.getValue(ATermAppl) on the datatype reasoner. The reasoner will just delegate the call to com.clarkparsia.pellet.datatypes.Datatype.getValue(ATermAppl). The numeric datatypes (for instance com.clarkparsia.pellet.datatypes.XSDDecimal) will parse the literal value and pass the result to com.clarkparsia.pellet.datatypes.OWLRealUtils.getCanonicalObject(Number) for simplification. However, the returned value might not have a Java type matching the XSD type (at least as defined in http://en.wikipedia.org/wiki/Java_Architecture_for_XML_Binding). For instance, it's perfectly possible to create a literal with the XSD type decimal and a value of 35000, where the mentioned getCanonicalObject method will return an Integer whereas a BigDecimal would be expected.

@evren is this the intended behavior? Looking at https://github.com/clarkparsia/pellet/blob/master/core/src/main/java/com/clarkparsia/pellet/datatypes/OWLRealUtils.java#L266 I get the impression that some refactoring is/was planned here?

evren commented 9 years ago

This is the expected behavior. From a reasoning perspective, 1^^xsd:decimal, 1^^xsd:integer, 1^^xsd:int, etc. are all equivalent. In order to minimize memory usage, Pellet always picks the smallest Number instance that can represent the value correctly, which is exactly what is happening here.
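To make the "smallest Number instance" idea concrete, here is a minimal sketch of such a canonicalization step. It is not Pellet's OWLRealUtils.getCanonicalObject; the class and method names (CanonicalNumberSketch, smallestExact) are hypothetical, and the set of candidate types (Byte, Short, Integer, Long, BigInteger, BigDecimal) is an assumption rather than something stated in this thread.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical illustration only; not the code in OWLRealUtils.
class CanonicalNumberSketch {

    // Returns the most compact Number (among the assumed candidate types)
    // that represents the given value exactly.
    static Number smallestExact(BigDecimal value) {
        if (value.signum() == 0) {
            return Byte.valueOf((byte) 0);
        }
        BigDecimal stripped = value.stripTrailingZeros();
        if (stripped.scale() > 0) {
            // A fractional part remains, so no integer type can hold it.
            return stripped;
        }
        BigInteger integral = stripped.toBigIntegerExact();
        // bitLength() excludes the sign bit, so a value fits in an
        // n-bit signed type when bitLength() <= n - 1.
        int bits = integral.bitLength();
        if (bits <= 7) {
            return Byte.valueOf(integral.byteValue());
        }
        if (bits <= 15) {
            return Short.valueOf(integral.shortValue());
        }
        if (bits <= 31) {
            return Integer.valueOf(integral.intValue());
        }
        if (bits <= 63) {
            return Long.valueOf(integral.longValue());
        }
        return integral;
    }

    public static void main(String[] args) {
        // The value from the issue: too large for a Short, fits in an Integer.
        Number n = smallestExact(new BigDecimal("35000"));
        System.out.println(n.getClass().getSimpleName() + " " + n);
    }
}
```

Under these assumptions, a decimal literal with the value 35000 ends up as an Integer, matching the behavior described in the issue. Calling code that needs a specific Java type should therefore work against Number and convert as needed, rather than cast based on the XSD datatype.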

The comment you refer to is just about improving this process because, as you can see in the code, in some cases we first create a BigInteger instance before realizing a more compact representation is adequate. Since BigInteger/BigDecimal instantiation is costly compared to other numeric types, we'd like to avoid them as much as possible.
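One way such an improvement could look is to try the cheap primitive parse first and only fall back to BigInteger when the value does not fit. The sketch below is an assumption about a possible approach, not the refactoring planned for Pellet; IntegerParseSketch and parseInteger are hypothetical names, and the method only handles plain integer lexical forms with an optional leading sign.

```java
import java.math.BigInteger;

// Hypothetical illustration only; not part of Pellet.
class IntegerParseSketch {

    // Parses an integer lexical form, allocating a BigInteger only when
    // the value is outside the range of long.
    static Number parseInteger(String lexical) {
        try {
            long v = Long.parseLong(lexical);
            if (v >= Integer.MIN_VALUE && v <= Integer.MAX_VALUE) {
                return Integer.valueOf((int) v);
            }
            return Long.valueOf(v);
        } catch (NumberFormatException e) {
            // Out of long range (or malformed, in which case the
            // BigInteger constructor throws NumberFormatException too).
            return new BigInteger(lexical);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseInteger("35000").getClass().getSimpleName());
        System.out.println(parseInteger("123456789012345678901234567890").getClass().getSimpleName());
    }
}
```

The trade-off in this sketch is that very large values pay for a thrown exception before the BigInteger is created, so it only helps if small values are the common case.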