claeis / ili2db

interlis import/export to relational databases
30 stars 30 forks source link

Wrong interpretation of bigint #464

Closed TI-CCGEO closed 2 years ago

TI-CCGEO commented 2 years ago

When loading an xtf into the schema with a bigint attribute #348, the data is not loaded correctly presenting some negative numbers despite the fact that id_incendio [0..99999999] is not possible in the template. Also tested ili2gpkg with identical problem. I enclose model, xtf and command ili2pg4-7.0 used

Example.zip

beistehen commented 2 years ago

I did some tests using ili2gpkg to generate a schema and add data to it.

a) Using an INTERLIS attribute definition of 0 .. 9999999999 (10 figures) I get the following results in gpkg (apostrophes inserted for better readability): attrtibute value in xtf attribute value in gpkg result
2'147'483'646 2'147'483'646 ok
2'147'483'647 2'147'483'647 ok
2'147'483'648 -2'147'483'648 overflow
2'147'483'649 -2'147'483'647 overflow

b) Fact: a Java int data type can handle values vom -2'147'483'648 to 2'147'483'647

c) The column data type created in gpkg is defined as INTEGER. A value like 3'000'000'000 can be inserted (using SQL INSERT statment) without problems, as SQLite handles integer data types dynamically.

d) The INTERLIS Reference Manual does not address the issue of numeric data types as INTERLIS is kind of 'typeless'.

Conclusion: In ili2gpkg, while processing the (string) value read from xtf, there must be a conversion issue where a numeric value (which is initially stored in a double data type) is converted to an int data type before it is inserted to a database like gpkg or pg. Next step will be to trace down the right spot where this conversion happens in the code.

beistehen commented 2 years ago

Looks like this code is to blame: https://github.com/claeis/ili2db/blob/b2b1ac36fd5b03d78bad01992923647d429e928c/src/ch/ehi/ili2db/fromxtf/FromXtfRecordConverter.java#L1305-L1332

As I understand, the decision to convert a value to double is only based on the accuracy of the minimum value of the numeric type. So for the numeric type 0 .. 9999999999 the minimum value is 0 and the accuracy is 0. In line 1311 the expression's result is false, which leads to a conversion to int in line 1324 and to the observed overflow (negative number).

How to fix it I'm not clear about that yet. Somehow the figures of the minimum and maximum value of the data type have to be counted or an additional test for negative numbers after conversion to an int data type has to be implemented. Real programers to the rescue @claeis @Philippluca ! 😄

Workaround until this bug is fixed: In your INTERLIS model instead of using id_incendio : MANDATORY 0 .. 9999999999999;

make the accuracy > 0 by using the following notation: id_incendio : MANDATORY 0.0 .. 9999999999999.9;

TI-CCGEO commented 2 years ago

As a workaround, for now we've fixed it using a text attribute.

id_incendio : MANDATORY TEXT*20;