hobuinc / silvimetric

Metrics using Classification always output zero #102

Open rrowlands opened 1 month ago

rrowlands commented 1 month ago

I know this sounds weird, but even when the source dataset has values for Classification, every metric computed on it outputs zero for all points. This holds for all metrics, including built-in ones such as max and mean.

I managed to work around it with the following PDAL pipeline:

{
  "pipeline": [
    {
      "type": "readers.copc",
      "filename": "pointcloud.copc.laz"
    },
    {
      "type": "filters.ferry",
      "dimensions": "Classification=>UserData"
    },
    {
      "type": "writers.copc",
      "filename": "pointcloud-ferry.copc.laz"
    }
  ]
}

This simply copies the Classification attribute over to UserData. With this in place, running silvimetric against the UserData attribute produces the expected behaviour.

I observed this on v1.1.1.

rrowlands commented 1 month ago

Here's a minimal reproducer:

#!/bin/bash

set -e

rm -rf database.tdb
rm -rf out
mkdir out

wget https://tftest222.s3.us-west-2.amazonaws.com/pointcloud.copc.laz -O pointcloud.copc.laz

bounds=$(pdal info pointcloud.copc.laz --readers.copc.resolution=1 | jq -c '.stats.bbox.native.bbox')

crs=$(pdal info --metadata pointcloud.copc.laz --readers.copc.resolution=10 | jq -c '.metadata.srs.json.components[0].id.code')

# CRS fallback
if [ -z "$crs" ] || [ "$crs" = "null" ]; then
    crs=$(pdal info --metadata pointcloud.copc.laz --readers.copc.resolution=10 | jq -c '.metadata.srs.json.id.code')
fi

silvimetric --database database.tdb initialize --bounds "$bounds" --crs "$crs" -a "Z" -a "Classification"

silvimetric -d database.tdb --threads 2 --workers 1 shatter --date 2008-12-01 pointcloud.copc.laz

silvimetric -d database.tdb extract -o out

You'll notice that pdal info, when run on this pointcloud, reports that the Classification attribute ranges from 0 to 6, with an average of 2. The 'max' metric output file for Classification, however, contains zero for every point.

kylemann16 commented 1 month ago

I believe this was fixed in either 1.1.2 or 1.1.3. The problem was a mismatch in the data size between PDAL and TileDB, so TileDB was using the incorrect stride when traversing byte arrays and ended up with a bunch of junk. If I remember correctly, you'll find the same bug for anything sized uint8 or int8.
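
To illustrate what a stride mismatch does in general, here is a minimal sketch using only `printf` and `od`. It is not SilviMetric or TileDB code and does not reproduce the exact all-zero output reported above. The byte values and the scratch file are made up for illustration. It only shows that reading one-byte values with a two-byte stride yields fewer values, none of which are the original ones.

```shell
#!/bin/bash
set -e

# Hypothetical scratch file holding six classification codes, one byte each.
tmp=$(mktemp)
printf '\002\002\006\001\000\002' > "$tmp"

# Correct stride: interpret the buffer as unsigned 1-byte integers.
as_uint8=$(od -An -v -tu1 "$tmp" | xargs)

# Wrong stride: interpret the same buffer as unsigned 2-byte integers,
# as a reader that assumes a wider type would.
as_uint16=$(od -An -v -tu2 "$tmp" | xargs)

echo "1-byte stride: $as_uint8"
echo "2-byte stride: $as_uint16"

rm -f "$tmp"
```

With the one-byte stride the six original codes come back unchanged. With the two-byte stride only three values come out, each a combination of two neighbouring bytes whose exact value depends on the machine's byte order.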

I'm going to double check this, but I remember fixing something similar to this a while ago.

rrowlands commented 1 month ago

Ahh interesting. So then it sounds like the only reason my ferry pipeline hack works is because UserData is a character attribute.