PDAL / java

Java extension and bindings for PDAL
https://pdal.io/java.html

Performance issues with ReaderLas #69

Closed MBunel closed 1 year ago

MBunel commented 1 year ago

Hi,

I'm trying to use PDAL's Java bindings to read LAS and LAZ point clouds in Scala (as shown in the documentation), and I'm running into performance problems. When I call the execute method (see below), my code takes a long time to run, as if execute loaded all the points into memory.

The main problem is that the header content cannot be accessed before calling execute, but I'd like to use the header to filter the files I actually want to read.

I haven't found a solution to this problem in the Java bindings. They don't seem to expose a lightweight way to read a LAS file, for example through a file pointer, as a library like las_rs does. However, I haven't looked at the C++ API, so I don't know whether this is a limitation of the Java bindings or a PDAL design choice.

So I have two questions:

  1. Can I read a LAS file in Scala more efficiently?
  2. Can I access the header without running a pipeline?

Thanks


This is the current version of my code.

class LASPdalReader(path: String) extends PartitionReader[InternalRow] {

  private val expression = ReadLas(path)
  private val pipeline = expression.toPipeline
  pipeline.initialize()
  // This step is very slow
  pipeline.execute()

  private val pvs: PointViewIterator = pipeline.getPointViews()
  private val pv = pvs.next()

  private val points_count = pv.length()
  private var counter = 0

  override def next(): Boolean = this.counter < this.points_count

  override def get(): InternalRow = {
    val row = InternalRow(
      pv.getX(this.counter).toFloat,
      pv.getY(this.counter).toFloat,
      pv.getZ(this.counter).toFloat,
      pv.getShort(this.counter, "Classification")
    )
    this.counter += 1
    row
  }

  override def close(): Unit = pvs.close()
}

hobu commented 1 year ago

Do the Java bindings have the preview() method for pdal::Stage? This is what you want...

pomadchin commented 1 year ago

Nope, it's not exposed.

I guess we need to implement something similar to Python's getQuickInfo.

pomadchin commented 1 year ago

There's also https://github.com/geotrellis/geotrellis-pointcloud, which is slightly outdated but may be of help (since I see some Spark code)! Sadly, it does not implement the DataSourceV2 API yet; the project needs some attention and time.

pomadchin commented 1 year ago

Hey @MBunel, see https://github.com/PDAL/java/pull/70, which exposes quickInfo.

Also, the metadata would most likely be enough for the needs described in the question!

MBunel commented 1 year ago

Many thanks for this implementation.
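
For the header-filtering use case raised in this thread, here is a minimal sketch that does not go through the PDAL bindings at all: it reads the fixed-size public header block straight from the file with java.nio and never touches the point records. It is not the quickInfo implementation from the pull request above. The LasHeaderSummary type and its methods are hypothetical names introduced for illustration, and the byte offsets assume the LAS 1.2 header layout from the ASPRS specification, which later LAS versions extend rather than rearrange.

```scala
import java.nio.{ByteBuffer, ByteOrder}
import java.nio.channels.FileChannel
import java.nio.file.{Path, StandardOpenOption}

// Hypothetical container for the header fields useful for pre-filtering files.
final case class LasHeaderSummary(
  versionMajor: Int,
  versionMinor: Int,
  legacyPointCount: Long,
  minX: Double, maxX: Double,
  minY: Double, maxY: Double,
  minZ: Double, maxZ: Double
)

object LasHeaderSummary {
  // Size of the LAS 1.2 public header block (assumed layout).
  private val HeaderBytes = 227

  // Decode the header fields from a buffer that starts at byte 0 of the file.
  def parse(buf: ByteBuffer): LasHeaderSummary = {
    // duplicate() resets the byte order, so set little-endian explicitly.
    val b = buf.duplicate().order(ByteOrder.LITTLE_ENDIAN)
    require(b.limit() >= HeaderBytes, "buffer shorter than a LAS header")
    val signature = new String(Array(b.get(0), b.get(1), b.get(2), b.get(3)), "US-ASCII")
    require(signature == "LASF", s"not a LAS file (signature: $signature)")
    LasHeaderSummary(
      versionMajor = b.get(24) & 0xff,
      versionMinor = b.get(25) & 0xff,
      // Unsigned 32-bit count; LAS 1.4 also stores a 64-bit count later in
      // the header, and this legacy field may be zero in such files.
      legacyPointCount = b.getInt(107) & 0xffffffffL,
      maxX = b.getDouble(179), minX = b.getDouble(187),
      maxY = b.getDouble(195), minY = b.getDouble(203),
      maxZ = b.getDouble(211), minZ = b.getDouble(219)
    )
  }

  // Read only the first HeaderBytes bytes of the file, then decode them.
  def read(path: Path): LasHeaderSummary = {
    val channel = FileChannel.open(path, StandardOpenOption.READ)
    try {
      val buf = ByteBuffer.allocate(HeaderBytes)
      // Loop because a single read call may return fewer bytes than requested.
      while (buf.hasRemaining && channel.read(buf) >= 0) {}
      buf.flip()
      parse(buf)
    } finally channel.close()
  }
}
```

A partition planner could call LasHeaderSummary.read on each candidate file, keep only those whose bounding box overlaps the area of interest, and build a PDAL pipeline for the survivors alone. Whether this is preferable to the quickInfo call depends on how much of PDAL's reader logic (coordinate systems, VLRs) the filter needs.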