GeoStat-Framework / GSTools-Core

A Rust implementation of the core algorithms of GSTools.
GNU Lesser General Public License v3.0

Vision #9

Open MuellerSeb opened 2 years ago

MuellerSeb commented 2 years ago

I really like the pace of this project, and I am really thankful that @LSchueler started this and that @adamreichold jumped in and has already put so much work into speeding this up. I haven't had the time to take a deeper look yet, but I have already dug through some Rust tutorials to catch up some day :wink:

At some point I'd like to discuss what the aim and the vision of this package are. I could imagine this repository becoming the common part of PyKrige and GSTools for the upcoming version 2 of both packages, to increase interoperability. We already created this project to track this: https://github.com/orgs/GeoStat-Framework/projects/1

We already have some routines implemented here that can be used by both projects (kriging summation and variogram estimation).

What we would need is:

One problem I see at the moment is the limited set of available "special" functions in Rust that are needed for some covariance models. There are some libraries already available, but with a limited set of functions (we need an overview of what is needed):

This package could then also be a geostatistical package for Rust (ATM I only see friedrich), with Python bindings for GSTools and PyKrige as described above. This would be awesome! :tada:

@LSchueler @adamreichold what do you think?

Cheers, Sebastian

adamreichold commented 2 years ago

netcdf interface

Are data exchange formats really something the core code needs to know about? What problems are there with passing NumPy arrays around without indicating whether they were loaded from or will be stored into NetCDF files?

One problem I see at the moment is the limited set of available "special" functions in Rust

There are bindings for the GNU Scientific Library, which includes quite a few of those (https://www.gnu.org/software/gsl/doc/html/specfunc.html). The worst case here would probably be that additional bindings need to be written.

LSchueler commented 2 years ago

Thanks for compiling this. I think this is a really cool project. Regarding the special functions, the worst case scenario (which wouldn't be too bad) is to use C or C++ libraries for the more exotic ones, which could be replaced one by one when Rust implementations become available.
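To make the "replace one by one" idea concrete, here is a minimal sketch of how a special function could sit behind a small trait, so that a C/C++-backed implementation and a pure-Rust one are interchangeable for the calling code. Everything here is an assumption for illustration: the names `GammaBackend`, `LanczosGamma` and `beta` are hypothetical and not part of GSTools-Core, and the gamma function is only a stand-in for whichever functions turn out to be needed. The pure-Rust backend uses the well-known Lanczos approximation (g = 7, nine coefficients).

```rust
use std::f64::consts::PI;

/// Backend-agnostic interface (hypothetical). A type wrapping a C or C++
/// library could implement this trait just as well as a pure-Rust one.
pub trait GammaBackend {
    fn gamma(&self, x: f64) -> f64;
}

/// Pure-Rust backend based on the Lanczos approximation.
pub struct LanczosGamma;

const LANCZOS_G: f64 = 7.0;
const LANCZOS_COEFFS: [f64; 9] = [
    0.99999999999980993,
    676.5203681218851,
    -1259.1392167224028,
    771.32342877765313,
    -176.61502916214059,
    12.507343278686905,
    -0.13857109526572012,
    9.9843695780195716e-6,
    1.5056327351493116e-7,
];

impl GammaBackend for LanczosGamma {
    fn gamma(&self, x: f64) -> f64 {
        if x < 0.5 {
            // Reflection formula: Gamma(x) * Gamma(1 - x) = pi / sin(pi * x).
            PI / ((PI * x).sin() * self.gamma(1.0 - x))
        } else {
            let z = x - 1.0;
            let mut sum = LANCZOS_COEFFS[0];
            for (i, c) in LANCZOS_COEFFS.iter().enumerate().skip(1) {
                sum += c / (z + i as f64);
            }
            let t = z + LANCZOS_G + 0.5;
            (2.0 * PI).sqrt() * t.powf(z + 0.5) * (-t).exp() * sum
        }
    }
}

/// Example caller that only depends on the trait, not on a specific backend:
/// the beta function B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b).
pub fn beta<G: GammaBackend>(backend: &G, a: f64, b: f64) -> f64 {
    backend.gamma(a) * backend.gamma(b) / backend.gamma(a + b)
}
```

Callers such as `beta(&LanczosGamma, 2.0, 3.0)` would not need to change if the backend were later swapped for another type implementing the same trait.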

@adamreichold This would be much more than just the core computations, but rather a complete geostatistical Rust library. There are some applications where huge data sets (TB or at least 100s of GB) need to be processed. In these cases it is necessary to flush the data to disk every once in a while. Therefore the NetCDF interface.

LSchueler commented 2 years ago

As a first step, I would suggest including the calls to GSTools-Core in GSTools and replacing the Cython code. This would include cleaning up and reworking the deployment of GSTools. I think the experience we gather in that process will help later on. And I think we are nearly there. What do you think?

adamreichold commented 2 years ago

I think the experience we gather in that process will help later on. And I think we are nearly there.

I think an incremental approach is almost always preferable, especially since it avoids having to chase a constantly moving feature set and instead lets us port/optimize whatever benefits from the effort required to do so.

In these cases it is necessary to flush the data to disk every once in a while. Therefore the NetCDF interface.

So this is about incremental processing which cannot be expressed as "read a chunk of the data into an array; process that array; write the results out; repeat"?
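For concreteness, the "read a chunk; process; write; repeat" loop could look roughly like the following sketch. It is only an illustration under stated assumptions: the data is taken to be a flat stream of little-endian `f64` values rather than NetCDF, and `process_in_chunks` with its parameters is a hypothetical name, not an existing GSTools-Core function.

```rust
use std::io::{self, Read, Write};

/// Number of bytes in one little-endian f64 sample (assumed data layout).
const SAMPLE_BYTES: usize = 8;

/// Streams f64 samples from `input` to `output` in chunks of at most
/// `chunk_len` samples, applying `process` to each chunk in place.
/// Returns the total number of samples handled.
pub fn process_in_chunks<R, W, F>(
    mut input: R,
    mut output: W,
    chunk_len: usize,
    mut process: F,
) -> io::Result<usize>
where
    R: Read,
    W: Write,
    F: FnMut(&mut [f64]),
{
    assert!(chunk_len > 0, "chunk_len must be positive");
    let mut bytes = vec![0u8; chunk_len * SAMPLE_BYTES];
    let mut chunk: Vec<f64> = Vec::with_capacity(chunk_len);
    let mut total = 0;

    loop {
        // 1. Read: fill the byte buffer as far as the input allows.
        let mut filled = 0;
        while filled < bytes.len() {
            match input.read(&mut bytes[filled..]) {
                Ok(0) => break, // end of input
                Ok(n) => filled += n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            }
        }
        if filled == 0 {
            break; // nothing left to process
        }
        if filled % SAMPLE_BYTES != 0 {
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "input ends inside a sample",
            ));
        }

        // Decode the raw bytes into the reusable sample buffer.
        chunk.clear();
        chunk.extend(bytes[..filled].chunks_exact(SAMPLE_BYTES).map(|b| {
            let mut raw = [0u8; SAMPLE_BYTES];
            raw.copy_from_slice(b);
            f64::from_le_bytes(raw)
        }));

        // 2. Process: only this one chunk is held in memory.
        process(&mut chunk);

        // 3. Write: emit the results before reading the next chunk.
        for value in &chunk {
            output.write_all(&value.to_le_bytes())?;
        }
        total += chunk.len();
    }

    output.flush()?;
    Ok(total)
}
```

Memory use is bounded by `chunk_len` regardless of the total data size. Whether the actual workloads fit this pattern is exactly the open question here.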

adamreichold commented 2 years ago

Is there a branch/pull request somewhere which prototypes integration of this crate/package into one of the target Python packages?

LSchueler commented 2 years ago

I'm currently working on that. I'll post a link here as soon as it's pushed.

LSchueler commented 2 years ago

So this is about incremental processing which cannot be expressed as "read a chunk of the data into an array; process that array; write the results out; repeat"?

I'm not familiar with this specific application. @MuellerSeb, can you tell us something about the problems you faced there?

LSchueler commented 2 years ago

The GSTools-Core - GSTools integration is being prepared in this branch. It's getting exciting! This is the relevant PR.

MuellerSeb commented 2 years ago

Required special functions

Additionally needed for the spectral densities or similar

So it seems that all required special functions are available.

MuellerSeb commented 2 years ago

2F1 is also provided here: https://docs.rs/mathru/latest/mathru/special/hypergeometric/fn.f21.html