mspass-team / mspass

Massive Parallel Analysis System for Seismologists
https://mspass.org
BSD 3-Clause "New" or "Revised" License
29 stars 12 forks source link

pybind11 and new modules #96

Open pavlis opened 3 years ago

pavlis commented 3 years ago

In testing the deconvolution code I encountered a bug created, I am pretty sure, by the step taken a few weeks ago to break ccore into multiple modules. I had a mysterious bug it took some time to track down created by an undetected (in compilation that is) problem with the mapping of std:: vector containers. I haven't checked this in yet so don't do this yourself as the next branch I push for fixing decon stuff will have the fix. What was needed is these two include statements in deconvolution_py.cc:

#include <pybind11/stl.h>
#include <pybind11/stl_bind.h>

this line prior to the namespace encapsulation:

PYBIND11_MAKE_OPAQUE(std::vector<double>);

and this line at the top of the module definition:

  py::bind_vector<std::vector<double>>(m, "DoubleVector");

That was necessary in the deconvolution_py.cc binding code because the getresult method returns an std::vector.

I noticed in seismic.py you also included these two lines to support ensembles:

  py::bind_vector<std::vector<TimeSeries>>(m, "TimeSeriesVector");
  py::bind_vector<std::vector<Seismogram>>(m, "SeismogramVector");

We must remember to add this set of items for any module that needs to handle any STL C++ containers for which pybind11 had wrappers (for this example it is std::vector, but similar constructs can apparently be used to bind other STL containers. Notably I am not sure if we don't have lurking bugs for items returning and STL list container, but the one's I've used seem to work fine. I suspect it is because a list is generic in python and the stl.h and stl_bind,.h code handles that without the problems that bit me more a vector container.

The problem seems most acute for C++ functions that return a C++ STL vector. It was interesting that I could pass things like TimeSeries objects that contained an std::vector without problems. The reason, I'm about 99% sure, is that the C code is just taking an opaque pointer form the python interpreter and once cast to a TimeSeries all the issues are inside the C++ code. When python has to handle a return it is a very different matter.

Anyway this is a VERY IMPORTANT lesson to keep in mind as we expand the pybind11 code base in setting up or modifying any module defined by pybind11.

wangyinz commented 3 years ago

Good catch! I don't think I was aware of such an issue when breaking up the ccore module. We probably should keep a note of it somewhere. Maybe at least in a comment section near these source code.