PDAL / java

Java extension and bindings for PDAL
https://pdal.io/java.html

EPT Pipeline fails at execute step after validation #84

Open freddyAtPacify opened 8 months ago

freddyAtPacify commented 8 months ago

The following pipeline fails in native code during the call to execute(), but the same pipeline works when run with the pdal command-line tool.

Pipeline:

    {
        "pipeline": [
            {
                "type": "readers.stac",
                "filename": "https://usgs-lidar-stac.s3-us-west-2.amazonaws.com/ept/item_collection.json",
                "reader_args": [
                    {
                        "type": "readers.ept",
                        "bounds": "([-71.1356958302804, -71.1354910531836], [42.4274301810906, 42.4277093379612], [0, 30])/EPSG:4326",
                        "threads": "3"
                    }
                ],
                "item_ids": ["MA_CentralEastern_1_2021", "MA_CentralEastern_2_2021"],
                "properties": {
                    "pc:type": ["lidar", "sonar"],
                    "pc:encoding": "ept"
                },
                "asset_names": ["ept.json"]
            },
            {
                "type": "filters.range",
                "limits": "Classification[0:15]"
            },
            {
                "type": "writers.text",
                "format": "csv",
                "order": "X,Y,Z",
                "keep_unspecified": "false",
                "filename": "output.csv"
            }
        ]
    }

Running it from Java produces the following failure in native code:

Problematic frame:

C  [libpdalcpp.16.3.0.dylib+0x2dafde]  pdal::connector::Connector::get(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) const+0x1e

Here is the call stack:

C  [libpdalcpp.16.3.0.dylib+0x2dafde]  pdal::connector::Connector::get(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) const+0x1e
C  [libpdalcpp.16.3.0.dylib+0x2db165]  pdal::connector::Connector::getJson(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) const+0x2f
C  [libpdalcpp.16.3.0.dylib+0x161e76]  pdal::EptReader::overlaps()+0x154
C  [libpdalcpp.16.3.0.dylib+0x162777]  pdal::EptReader::ready(pdal::BasePointTable&)+0x1d9
C  [libpdalcpp.16.3.0.dylib+0x291933]  pdal::Stage::execute(pdal::BasePointTable&, std::__1::set<std::__1::shared_ptr<pdal::PointView>, pdal::PointViewLess, std::__1::allocator<std::__1::shared_ptr<pdal::PointView> > >&)+0x1c1
C  [libpdalcpp.16.3.0.dylib+0x290dae]  pdal::Stage::execute(pdal::BasePointTable&)+0x2ac
C  [libpdalcpp.16.3.0.dylib+0x26fde4]  pdal::PipelineManager::execute(pdal::ExecMode)+0x9c
C  [libpdalcpp.16.3.0.dylib+0x26ff31]  pdal::PipelineManager::execute()+0xb
C  [libpdaljni.2.6.dylib+0x4e22]  libpdaljava::PipelineExecutor::execute()+0x12
C  [libpdaljni.2.6.dylib+0xafa0]  Java_io_pdal_Pipeline_execute+0x40
j  io.pdal.Pipeline.execute()V+0

I am using the following libraries:

pdal_2.13-2.6.1 pdal-native-2.6.1 pdal-scala_2.13-2.6.1

The Java code initializes the pipeline and ensures that pipeline.validate() returns true.

The same pipeline works when called from the command line as follows, where inputFile.json contains the pipeline above:

pdal pipeline --input inputFile.json

pomadchin commented 8 months ago

Hi @freddyAtPacify 🤔 will definitely take a look; wondering what's wrong with it.

hobu commented 8 months ago

I assume something is mangling the JSON in reader_args that's getting passed down to readers.ept. You might need to escape that JSON.
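
One cheap way to check for mangling before the string reaches the Pipeline is a structural sanity check on the caller side. The sketch below is only an illustration: `JsonSanity` and `isBalanced` are hypothetical names, not part of pdal-java, and the check is not a JSON validator. It only verifies that braces and brackets outside string literals pair up, which matters here because the bounds and limits values contain brackets inside strings.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper, not part of pdal-java. */
final class JsonSanity {
    /** Returns true if every { and [ outside a string literal has a matching close. */
    static boolean isBalanced(String json) {
        Deque<Character> stack = new ArrayDeque<>();
        boolean inString = false;
        boolean escaped = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                // Inside a string literal: only track escapes and the closing quote.
                if (escaped) escaped = false;
                else if (c == '\\') escaped = true;
                else if (c == '"') inString = false;
                continue;
            }
            switch (c) {
                case '"': inString = true; break;
                case '{': case '[': stack.push(c); break;
                case '}':
                    if (stack.isEmpty() || stack.pop() != '{') return false;
                    break;
                case ']':
                    if (stack.isEmpty() || stack.pop() != '[') return false;
                    break;
                default: break;
            }
        }
        // An unterminated string or an unclosed bracket also counts as unbalanced.
        return !inString && stack.isEmpty();
    }
}
```

Calling `JsonSanity.isBalanced(json)` right before constructing the Pipeline, and logging the exact string when it returns false, would show whether the JSON was altered on the way in. It cannot detect every kind of mangling.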

pomadchin commented 8 months ago

@freddyAtPacify I think @hobu is 💯 correct! Thanks Howard! 💸

Check the JSON string that is passed down into the Pipeline.

I tried the following code and it seems to cause no issues:

import io.pdal.Pipeline

object Main {
  val json =
    """
      |{
      |  "pipeline": [
      |    {
      |      "type": "readers.stac",
      |      "filename": "https://usgs-lidar-stac.s3-us-west-2.amazonaws.com/ept/item_collection.json",
      |      "reader_args": [
      |        {
      |          "type": "readers.ept",
      |          "bounds": "([-71.1356958302804, -71.1354910531836], [42.4274301810906, 42.4277093379612], [0, 30])/EPSG:4326",
      |          "threads": "3"
      |        }
      |      ],
      |      "item_ids": [
      |        "MA_CentralEastern_1_2021",
      |        "MA_CentralEastern_2_2021"
      |      ],
      |      "properties": {
      |        "pc:type": [
      |          "lidar",
      |          "sonar"
      |        ],
      |        "pc:encoding": "ept"
      |      },
      |      "asset_names": [
      |        "ept.json"
      |      ]
      |    },
      |    {
      |      "type": "filters.range",
      |      "limits": "Classification[0:15]"
      |    },
      |    {
      |      "type": "writers.text",
      |      "format": "csv",
      |      "order": "X,Y,Z",
      |      "keep_unspecified": "false",
      |      "filename": "output.csv"
      |    }
      |  ]
      |}
    """.stripMargin

  def main(args: Array[String]) = {
    val pipeline = Pipeline(json)
    pipeline.execute()
    println(s"pipeline.getPointViews().next().length(): ${pipeline.getPointViews().next().length()}")
    println(s"pipeline.getMetadata(): ${pipeline.getMetadata()}")
    pipeline.close()
  }
}

outputs:

[info] running io.pdal.Main
pipeline.getPointViews().next().length(): 11984
pipeline.getMetadata(): {
  "metadata":
  {
    "filters.range":
    {
    },
    "readers.stac":
    {
    },
    "writers.text":
    {
      "filename":
      [
        "output.csv"
      ]
    }
  }
}

+ the output.csv file (not attaching it here)

pomadchin commented 8 months ago

The same code written in Java:

import io.pdal.*;

class MainJava {
  static String json = """
            {
              "pipeline": [
                {
                  "type": "readers.stac",
                  "filename": "https://usgs-lidar-stac.s3-us-west-2.amazonaws.com/ept/item_collection.json",
                  "reader_args": [
                    {
                      "type": "readers.ept",
                      "bounds": "([-71.1356958302804, -71.1354910531836], [42.4274301810906, 42.4277093379612], [0, 30])/EPSG:4326",
                      "threads": "3"
                    }
                  ],
                  "item_ids": [
                    "MA_CentralEastern_1_2021",
                    "MA_CentralEastern_2_2021"
                  ],
                  "properties": {
                    "pc:type": [
                      "lidar",
                      "sonar"
                    ],
                    "pc:encoding": "ept"
                  },
                  "asset_names": [
                    "ept.json"
                  ]
                },
                {
                  "type": "filters.range",
                  "limits": "Classification[0:15]"
                },
                {
                  "type": "writers.text",
                  "format": "csv",
                  "order": "X,Y,Z",
                  "keep_unspecified": "false",
                  "filename": "output.csv"
                }
              ]
            }
  """;

  public static void main(String[] args) {
    var pipeline = new Pipeline(json, LogLevel.Error());
    pipeline.initialize();
    pipeline.execute();
    System.out.println("pipeline.getPointViews().next().length():" + pipeline.getPointViews().next().length());
    System.out.println("pipeline.getMetadata():" + pipeline.getMetadata());
    pipeline.close();
  }
}

hobu commented 8 months ago

We should harden the readers.stac driver to prevent the segfault. @kylemann16

freddyAtPacify commented 8 months ago

@pomadchin Hi Grigory, thanks for following up. I think if you try to validate the pipeline before executing it, you will run into this error. So please try this Java code:

Pipeline pipeline = new Pipeline(json, LogLevel.Error());
pipeline.initialize();
boolean valid = pipeline.validate();
System.out.println( "Valid: " + valid);
pipeline.execute();
System.out.println("pipeline.getPointViews().next().length():" + pipeline.getPointViews().next().length());
System.out.println("pipeline.getMetadata():" + pipeline.getMetadata());
pipeline.close();

pomadchin commented 8 months ago

Hmm, indeed it segfaults in this case: calling validate and then execute causes the segfault. 🤔 I guess the workaround for now is to not call validate.

pomadchin commented 8 months ago

@freddyAtPacify this is most likely not a bindings issue; I tried it with a different reader (the simple las/laz reader from the examples folder) and it works fine.

I suspect this behavior is specific to this pipeline and that, for some reason, repeated Stage executions are not idempotent.

The code to reproduce is:

Stage *s = m_manager.getStage();
if (s) {
  s->prepare(m_manager.pointTable());
  s->execute(m_manager.pointTable());
  s->execute(m_manager.pointTable()); // adding an extra execution causes a segfault
}
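
Until the root cause is found, the double execution shown above could also be avoided on the caller side with a small guard. This is a minimal sketch under assumptions: `ExecuteOnce` is a hypothetical helper, not part of pdal-java, and the `Runnable` stands in for a call such as `pipeline.execute()`.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper, not part of pdal-java: runs a task at most once. */
final class ExecuteOnce {
    private final Runnable task;
    private final AtomicBoolean done = new AtomicBoolean(false);

    ExecuteOnce(Runnable task) {
        this.task = task;
    }

    /** Runs the task on the first call only; later calls return false without running it. */
    boolean run() {
        if (done.compareAndSet(false, true)) {
            task.run();
            return true;
        }
        return false;
    }
}
```

Wrapping the call as `new ExecuteOnce(pipeline::execute)` and invoking `run()` would make an accidental second execution a no-op. It does not help if another native call executes the stage internally.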

The segfault originates somewhere in EptReader:

C  [libpdalcpp.16.2.0.dylib+0x33028c]  pdal::connector::Connector::get(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) const+0x24
C  [libpdalcpp.16.2.0.dylib+0x3304f4]  pdal::connector::Connector::getJson(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) const+0x34
C  [libpdalcpp.16.2.0.dylib+0x1a6eec]  pdal::EptReader::overlaps()+0x124
C  [libpdalcpp.16.2.0.dylib+0x1a780c]  pdal::EptReader::ready(pdal::BasePointTable&)+0x234
C  [libpdalcpp.16.2.0.dylib+0x2e1ff0]  pdal::Stage::execute(pdal::BasePointTable&, std::__1::set<std::__1::shared_ptr<pdal::PointView>, pdal::PointViewLess, std::__1::allocator<std::__1::shared_ptr<pdal::PointView>>>&)+0x240

I was able to find similar issues in the PDAL repo, all of them also around the EPT reader:

I think this issue requires further investigation if you need a proper fix. Otherwise, just removing the validate call should be fine. You may also try reproducing the issue in C++ to be sure and to have a starting point for resolving it.

The .validate function has been deprecated and dropped from the Python bindings (https://github.com/PDAL/PDAL/issues/3605); we may follow the same pattern.

pomadchin commented 8 months ago

It is not really a solution, but I created an issue to drop the .validate function (https://github.com/PDAL/java/issues/85); it will be gone in future releases.

pomadchin commented 8 months ago

.validate() is no longer available starting with pdal-java 2.6.2.