pgpointcloud / pointcloud

A PostgreSQL extension for storing point cloud (LIDAR) data.
https://pgpointcloud.github.io/pointcloud/

Pc_FilterEqual: WARNING: Value 3.17414e+09 truncated to 2.14748e+09 to fit in int32 #78

Closed strk closed 9 years ago

strk commented 9 years ago

I'm getting a truncation warning from PC_FilterEquals. That's unexpected: PC_FilterEquals should only extract values, not change them, and any truncation should happen on first input, not during extraction...

The offending patch contains 600 points. I'm trying to reduce the testcase.

strk commented 9 years ago

Here, I can do it with a single-point patch:

 {"pcid":1,"pts":[[1,4,1,4,11,469792,1.6475e+06,4.86241e+06,264.92]]}

Schema is:

<?xml version="1.0" encoding="UTF-8"?>
<pc:PointCloudSchema xmlns:pc="http://pointcloud.org/schemas/PC/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <pc:dimension>
  <pc:position>1</pc:position>
  <pc:size>1</pc:size>
  <pc:description>Pulse return number for a given output pulse. A given output laser pulse can have many returns, and they must be marked in order,
 starting with 1</pc:description>
  <pc:name>ReturnNumber</pc:name>
  <pc:interpretation>uint8_t</pc:interpretation>
  <pc:active>true</pc:active>
 </pc:dimension>
 <pc:dimension>
  <pc:position>2</pc:position>
  <pc:size>1</pc:size>
  <pc:description>Total number of returns for a given pulse.</pc:description>
  <pc:name>NumberOfReturns</pc:name>
  <pc:interpretation>uint8_t</pc:interpretation>
  <pc:active>true</pc:active>
 </pc:dimension>
 <pc:dimension>
  <pc:position>3</pc:position>
  <pc:size>1</pc:size>
  <pc:description>ASPRS classification.  0 for no classification.  See LAS specification for details</pc:description>
  <pc:name>Classification</pc:name>
  <pc:interpretation>uint8_t</pc:interpretation>
  <pc:active>true</pc:active>
 </pc:dimension>
 <pc:dimension>
  <pc:position>4</pc:position>
  <pc:size>1</pc:size>
  <pc:description>Angle degree at which the laser point was output from the system, including the roll of the aircraft.  The scan angle is based on being nadir, and -90 the left side of the aircraft in the direction of flight</pc:description>
  <pc:name>ScanAngleRank</pc:name>
  <pc:interpretation>int8_t</pc:interpretation>
  <pc:active>true</pc:active>
 </pc:dimension>
 <pc:dimension>
  <pc:position>5</pc:position>
  <pc:size>2</pc:size>
  <pc:description>File source ID from which the point originated.  Zero indicates that the point originated in the current file</pc:description>
  <pc:name>PointSourceId</pc:name>
  <pc:interpretation>uint16_t</pc:interpretation>
  <pc:active>true</pc:active>
 </pc:dimension>
 <pc:dimension>
  <pc:position>6</pc:position>
  <pc:size>8</pc:size>
  <pc:description>GPS time that the point was acquired</pc:description>
  <pc:name>GpsTime</pc:name>
  <pc:interpretation>double</pc:interpretation>
  <pc:active>true</pc:active>
 </pc:dimension>
 <pc:dimension>
  <pc:position>7</pc:position>
  <pc:size>4</pc:size>
  <pc:description>X coordinate</pc:description>
  <pc:scale>0.01</pc:scale>
  <pc:offset>1500000</pc:offset>
  <pc:name>X</pc:name>
  <pc:interpretation>int32_t</pc:interpretation>
  <pc:active>true</pc:active>
 </pc:dimension>
 <pc:dimension>
  <pc:position>8</pc:position>
  <pc:size>4</pc:size>
  <pc:description>Y coordinate</pc:description>
  <pc:scale>0.01</pc:scale>
  <pc:offset>4500000</pc:offset>
  <pc:name>Y</pc:name>
  <pc:interpretation>int32_t</pc:interpretation>
  <pc:active>true</pc:active>
 </pc:dimension>
 <pc:dimension>
  <pc:position>9</pc:position>
  <pc:size>4</pc:size>
  <pc:description>Z coordinate</pc:description>
  <pc:scale>0.01</pc:scale>
  <pc:offset>0</pc:offset>
  <pc:name>Z</pc:name>
  <pc:interpretation>int32_t</pc:interpretation>
  <pc:active>true</pc:active>
 </pc:dimension>
 <pc:metadata>
<Metadata name="compression" type="string">dimensional</Metadata></pc:metadata>
 <pc:orientation>point</pc:orientation>
</pc:PointCloudSchema>

Binary version:

01010000000200000001000000030400000078DA6304030400000078DA6301030400000078DA6304030400000078DA6301030800000078DAE36600000018031000000078DACBBEF1DEB3618D8C230014CC0405030C00000078DABB20F0900100053601C2030C00000078DA3BFE5F830900066F01F1030C00000078DAAB4967600000032900E4

strk commented 9 years ago

Fully self-contained test (still requires the schema):

rt=# select pc_numpoints(PC_FilterEquals('01010000000200000001000000030400000078DA6304030400000078DA6301030400000078DA6304030400000078DA6301030800000078DAE36600000018031000000078DACBBEF1DEB3618D8C230014CC0405030C00000078DABB20F0900100053601C2030C00000078DA3BFE5F830900066F01F1030C00000078DAAB4967600000032900E4','z','264.92'));
WARNING:  Value 3.17414e+09 truncated to 2.14748e+09 to fit in int32
WARNING:  Value 3.17414e+09 truncated to 2.14748e+09 to fit in int32
WARNING:  Value 3.17414e+09 truncated to 2.14748e+09 to fit in int32
 pc_numpoints 
--------------
            1
(1 row)

Interesting to note that we get 3 warnings for a single point.

strk commented 9 years ago

The warning is triggered during the stats-setting phase:

NOTICE:  setting stats for dimension 7
WARNING:  Value 3.17414e+09 truncated to 2.14748e+09 to fit in int32
WARNING:  Value 3.17414e+09 truncated to 2.14748e+09 to fit in int32
WARNING:  Value 3.17414e+09 truncated to 2.14748e+09 to fit in int32
NOTICE:  stats for dimension 7 set

strk commented 9 years ago

I confirm PC_MakePoint does not warn, while PC_FilterEquals does:

with inp as ( 
 select pc_patch(pc_makepoint(1,
'{1,4,1,4,11,469792.072204,1647499.04,4862413.51,264.92}'
)) pa ) 
select pc_numpoints(pa) from inp;

The 4862413.51 value, once offset and scaled, does indeed fall outside the valid range for an int32. What's not clear is why PC_MakePoint does not warn, and how the value can be interpreted as such a big number on reading (it should have been truncated before).

strk commented 9 years ago

On PC_MakePoint:

NOTICE:  setting double 4862413.51 to dim Y
NOTICE:  unscaled/unoffsetted value became 36241351

On PC_FilterEquals:

NOTICE:  setting double 36241351 to dim Y
NOTICE:  unscaled/unoffsetted value became 3174135100

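It looks as if reading Y for the stats fails to apply scale and offset: the stored value 36241351 is treated as a real-world coordinate and unscaled/unoffsetted a second time. Indeed, while the point values are correct after PC_FilterEquals, the stats are completely off!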
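The two NOTICE pairs above are consistent with the same conversion being applied twice. Below is a minimal sketch in C, assuming the stored integer is computed as (value - offset) / scale, with the Y dimension's scale of 0.01 and offset of 4500000 taken from the schema. The function names are hypothetical and not part of the pgpointcloud API.

```c
#include <stdint.h>

/* Hypothetical helper (not pgpointcloud API): convert a real-world value
 * to the integer stored in the patch, assuming (value - offset) / scale,
 * rounded to the nearest integer. */
int64_t demo_to_stored(double value, double scale, double offset)
{
    double raw = (value - offset) / scale;
    return (int64_t)(raw < 0 ? raw - 0.5 : raw + 0.5);
}

/* Hypothetical helper: clamp a value to the int32 range, which is the
 * effect the "truncated to 2.14748e+09" warning reports. */
int32_t demo_clamp_int32(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}
```

With scale 0.01 and offset 4500000, demo_to_stored(4862413.51, ...) gives 36241351, which fits comfortably in an int32. Feeding that result through the same conversion again gives 3174135100, which exceeds INT32_MAX (2147483647) and gets clamped, matching the two numbers in the warning.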

strk commented 9 years ago

Implementing #77 might help with testing this.

strk commented 9 years ago

The summary of the patch:

 {"pcid":1, "npts":1, "srid":3003, "compr":"dimensional","dims":
[
{"pos":0,"name":"ReturnNumber","size":1,"type":"uint8_t","compr":"zlib",
  "stats":{"min":1,"max":1,"avg":1}},
{"pos":1,"name":"NumberOfReturns","size":1,"type":"uint8_t","compr":"zlib",
  "stats":{"min":4,"max":4,"avg":4}},
{"pos":2,"name":"Classification","size":1,"type":"uint8_t","compr":"zlib",
  "stats":{"min":1,"max":1,"avg":1}},
{"pos":3,"name":"ScanAngleRank","size":1,"type":"int8_t","compr":"zlib",
  "stats":{"min":4,"max":4,"avg":4}},
{"pos":4,"name":"PointSourceId","size":2,"type":"uint16_t","compr":"zlib",
  "stats":{"min":11,"max":11,"avg":11}},
{"pos":5,"name":"GpsTime","size":8,"type":"double","compr":"zlib",
  "stats":{"min":469792,"max":469792,"avg":469792}},
{"pos":6,"name":"X","size":4,"type":"int32_t","compr":"zlib",
  "stats":{"min":1.6475e+06,"max":1.6475e+06,"avg":1.6475e+06}},
{"pos":7,"name":"Y","size":4,"type":"int32_t","compr":"zlib",
  "stats":{"min":4.86241e+06,"max":4.86241e+06,"avg":4.86241e+06}},
{"pos":8,"name":"Z","size":4,"type":"int32_t","compr":"zlib",
  "stats":{"min":264.92,"max":264.92,"avg":264.92}}
]}

strk commented 9 years ago

Dragons here, after further simplifying the query to:

with inp as ( 
 select pc_patch(pc_makepoint(1,'{0,0,0,0,0,0,1647499,4862413,0}')) pa 
) select pc_patchmax(pc_filterequals(pa,'z',0),'y') from inp;

And adding debugging lines:

NOTICE:  stats initialized for dimension 0: min:3.40282e+38, max:1.17549e-38, avg:0
NOTICE:  filtering bytes for dimension 0
NOTICE:   0 < 3.40282e+38
NOTICE:   NOT 0 > 1.17549e-38
NOTICE:  stats after double_from_ptr gave 0 (intrpr:2): min:0, max:1.17549e-38

Both 0 and 1.17549e-38 should be doubles.

strk commented 9 years ago

The code that produces the debugging lines above is:

    d = pc_double_from_ptr(buf, interp);
    if ( d < stats->min ) { pcinfo(" %g < %g", d, stats->min); stats->min = d; }
    else { pcinfo(" NOT %g < %g", d, stats->min); }
    if ( d > stats->max ) { pcinfo(" %g > %g", d, stats->max); stats->max = d; }
    else { pcinfo(" NOT %g > %g", d, stats->max); }
    pcinfo("stats after double_from_ptr gave %g (intrpr:%d): min:%g, max:%g", d, interp, stats->min, stats->max);

strk commented 9 years ago

I suspect a problem with the use of FLT_MAX/FLT_MIN rather than DBL_MAX/DBL_MIN.

strk commented 9 years ago

Got it: FLT_MIN should really be -FLT_MAX instead...

strk commented 9 years ago

So, next stop: an automated testcase, then the fix (FLT_MIN is the smallest positive normalized float, not the most negative one).
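The debug output earlier in the thread shows the running max starting at 1.17549e-38, which is FLT_MIN. Here is a minimal sketch of why that seed breaks the max update for zero or negative values. The struct and function names are hypothetical stand-ins, not the actual pgpointcloud stats code.

```c
#include <float.h>

/* Hypothetical stand-in for the per-dimension running stats. */
typedef struct {
    double min;
    double max;
} demo_stats;

/* Same comparison logic as the snippet quoted in the thread. */
void demo_stats_update(demo_stats *s, double d)
{
    if (d < s->min) s->min = d;
    if (d > s->max) s->max = d;
}

/* Seed min with FLT_MAX and max with the given value, feed one sample
 * through the update, and report the resulting max. */
double demo_max_after(double initial_max, double d)
{
    demo_stats s = { FLT_MAX, initial_max };
    demo_stats_update(&s, d);
    return s.max;
}
```

Seeding the max with FLT_MIN (a tiny positive number) means a sample of 0 never replaces it, so demo_max_after(FLT_MIN, 0.0) stays at FLT_MIN. Seeding with -FLT_MAX lets any float-range sample, including 0 and negatives, take over as the max.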

strk commented 9 years ago

... note: the FLT_MIN bug is not the cause of this issue but a different bug, which I'll file separately.

strk commented 9 years ago

So, back to this issue. The problem is with dimensionally compressed patches whose dimensions are scaled/offset. Uncompressing the patch before the PC_Filter* call fixes the issue. Test:

strk=# SELECT '#78' issue,
  PC_PatchMin(p,'x') x_min, PC_PatchMax(p,'x') x_max,
  PC_PatchMin(p,'y') y_min, PC_PatchMax(p,'y') y_max,
  PC_PatchMin(p,'z') z_min, PC_PatchMax(p,'z') z_max
FROM ( SELECT
  PC_FilterEquals(
    PC_Patch( PC_MakePoint(1,ARRAY[-1,0,1,2]) ),
    'z',1) p
) foo;
 issue | x_min | x_max | y_min | y_max | z_min | z_max 
-------+-------+-------+-------+-------+-------+-------
 #78   |  -100 |  -100 |     0 |     0 |   100 |   100
(1 row)