pdidev / pdi

The PDI Data Interface
https://pdi.dev
BSD 3-Clause "New" or "Revised" License

PDI in ParFlow #169

Closed jbigot closed 2 months ago

jbigot commented 4 years ago

In GitLab by @mlobet on Mar 16, 2020, 11:21

ParFlow Data Structures

ParFlow PDI repository : https://gitlab.maisondelasimulation.fr/eocoe-ii/parflow-pdi

ParFlow GitHub : https://github.com/parflow/parflow

Vector and derived Structure

Physical data is stored in the structure Vector. This structure is described in the file /pfsimulator/parflow_lib/vector.h.

A simplified version looks like this:

typedef struct _Vector {
  Subvector    **subvectors;    /* Array of pointers to subvectors */

  int data_size;                /* Number of elements in vector.
                                 * All subvectors. includes ghost points */

  Grid          *grid;          /* Grid that this vector is on */

  SubgridArray  *data_space;    /* Description of Vector data */

  int size;                     /* Total number of coefficients */

  /* Information on how to update boundary */
  CommPkg *comm_pkg[NumUpdateModes];

  enum vector_type type;

} Vector;

The Vector holds an array of pointers to objects of another structure, Subvector. These sub-vectors contain the physical data.

In addition, the Vector points to a sub-structure called Grid.

The Subvector structure is defined in the same file.

typedef struct {
  double  *data;              /* Pointer to subvector data */

  int allocated;                  /* Was this data allocated? */

  Subgrid *data_space;

  int data_size;                /* Number of elements in vector,
                                 * includes ghost points */
} Subvector;

This structure contains the double array data, which holds the physical data. This array is what is written during diagnostics.

The Grid structure contains topology information. It is defined in grid.h. It is used to output some local metadata.

typedef Subregion Subgrid;
typedef SubregionArray SubgridArray;

typedef struct {
  SubgridArray  *subgrids;      /* Array of subgrids in this process */

  SubgridArray  *all_subgrids;  /* Array of all subgrids in the grid */

  int size;                     /* Total number of grid points */

  ComputePkg   **compute_pkgs;

  Subgrid        *background;    /* The background reference grid for
                                  * this grid.   Includes the entire
                                  * space of points that all subgrids
                                  * lie in.  Basically the bounding
                                  * box for the subgrids */
} Grid;

Subgrid is a typedef of Subregion, so the two names denote the same structure. Subregion is described in region.h.

typedef struct {
  Subregion  **subregions;    /* Array of pointers to subregions */
  int size;                   /* Size of subregion array */
} SubregionArray;

typedef struct {
  int ix, iy, iz;        /* Bottom-lower-left corner in index-space */
  int nx, ny, nz;        /* Size */
  int sx, sy, sz;        /* Striding factors */
  int rx, ry, rz;        /* Refinement over the background grid */
  int level;             /* Refinement level = rx + ry + rz */

  int process;           /* Process containing this subgrid */
} Subregion;
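To make the relationship between these structures concrete, here is a minimal sketch of how one might walk a Vector and read only the points of each subgrid, skipping ghost points. It uses reduced copies of the structures above (only the fields needed here) and two hypothetical helpers, subvector_index and vector_sum_owned, which are not ParFlow functions. It assumes that data is stored with i varying fastest, then j, then k, and that subvectors[s] holds the data of the s-th local subgrid; neither point is stated above, so check both against vector.h before relying on them.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced copies of the structures above: only the fields used here. */
typedef struct {
  int ix, iy, iz;           /* Lower corner in index space */
  int nx, ny, nz;           /* Size */
} Subregion;
typedef Subregion Subgrid;

typedef struct {
  Subregion **subregions;
  int size;
} SubregionArray;
typedef SubregionArray SubgridArray;

typedef struct {
  SubgridArray *subgrids;   /* Subgrids owned by this process */
} Grid;

typedef struct {
  double  *data;
  Subgrid *data_space;      /* Extent of the storage, ghost points included */
  int      data_size;
} Subvector;

typedef struct {
  Subvector **subvectors;
  Grid       *grid;
} Vector;

/* Hypothetical helper: offset of point (i, j, k) inside sv->data.
 * ASSUMPTION: i varies fastest, then j, then k. */
size_t subvector_index(const Subvector *sv, int i, int j, int k)
{
  const Subgrid *ds = sv->data_space;
  assert(i >= ds->ix && i < ds->ix + ds->nx);
  assert(j >= ds->iy && j < ds->iy + ds->ny);
  assert(k >= ds->iz && k < ds->iz + ds->nz);
  return (size_t)(i - ds->ix)
         + (size_t)ds->nx * ((size_t)(j - ds->iy)
         + (size_t)ds->ny * (size_t)(k - ds->iz));
}

/* Hypothetical helper: sum over the points of each subgrid, skipping ghosts.
 * ASSUMPTION: subvectors[s] stores the data of subgrids->subregions[s]. */
double vector_sum_owned(const Vector *v)
{
  const SubgridArray *sgs = v->grid->subgrids;
  double sum = 0.0;
  for (int s = 0; s < sgs->size; s++) {
    const Subgrid   *sg = sgs->subregions[s];
    const Subvector *sv = v->subvectors[s];
    for (int k = sg->iz; k < sg->iz + sg->nz; k++)
      for (int j = sg->iy; j < sg->iy + sg->ny; j++)
        for (int i = sg->ix; i < sg->ix + sg->nx; i++)
          sum += sv->data[subvector_index(sv, i, j, k)];
  }
  return sum;
}
```

The point of the sketch is that the subgrid gives the range of points to visit, while the subvector's data_space gives the origin and strides needed to locate each point in memory. The two differ whenever ghost points are present.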

Outputs

Outputs are performed in the files write_parflow_*.c, where * stands for the type of output (silo, netcdf, binary...). For the moment, we focus on ParFlow's own binary output. The idea is to reproduce it with PDI.

Binary output

We focus here on the file write_parflow_binary.c. Outputs are described in the documentation. The generic functions that can dump any type of physical data are WritePFBinary and WritePFBinary_Subvector. Files with the custom extension .pfb (for ParFlow Binary) have the following structure:

<double : X> <double : Y> <double : Z>
<integer : NX> <integer : NY> <integer : NZ>
<double : DX> <double : DY> <double : DZ>

<integer : num_subgrids>

FOR subgrid = 0 TO <num_subgrids> - 1
BEGIN
  <integer : ix> <integer : iy> <integer : iz>
  <integer : nx> <integer : ny> <integer : nz>
  <integer : rx> <integer : ry> <integer : rz>
  FOR k = iz TO iz + <nz> - 1
  BEGIN
    FOR j = iy TO iy + <ny> - 1
    BEGIN
      FOR i = ix TO ix + <nx> - 1
      BEGIN
        <double : data_ijk>
      END
    END
  END
END

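The components of the Vector structure appear in this layout: the extents of each subgrid (ix, iy, iz, nx, ny, nz, rx, ry, rz) followed by the corresponding subvector data.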

With MPI, each rank will create a file containing its own data.
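The .pfb layout above can be sketched as a small writer and reader. This is an illustration only, not the code of WritePFBinary: the type and function names (PfbHeader, PfbSubgrid, pfb_write_header, and so on) are hypothetical, and the byte order and integer width used by ParFlow are not stated above, so the sketch uses the host's native int and double representation and should not be expected to interoperate with real ParFlow files. It also assumes the data passed in is already packed without ghost points, with i varying fastest, matching the loop nest in the format description.

```c
#include <assert.h>
#include <stdio.h>

/* File header, in file order (hypothetical type). */
typedef struct {
  double origin[3];     /* X, Y, Z */
  int    n[3];          /* NX, NY, NZ */
  double d[3];          /* DX, DY, DZ */
  int    num_subgrids;
} PfbHeader;

/* Per-subgrid header, in file order (hypothetical type). */
typedef struct {
  int corner[3];        /* ix, iy, iz */
  int n[3];             /* nx, ny, nz */
  int r[3];             /* rx, ry, rz */
} PfbSubgrid;

/* Number of values stored for one subgrid. */
static size_t pfb_count(const PfbSubgrid *sg)
{
  return (size_t)sg->n[0] * (size_t)sg->n[1] * (size_t)sg->n[2];
}

/* Each function returns 0 on success and -1 on an I/O error. */
int pfb_write_header(FILE *f, const PfbHeader *h)
{
  assert(f != NULL && h != NULL);
  if (fwrite(h->origin, sizeof(double), 3, f) != 3) return -1;
  if (fwrite(h->n, sizeof(int), 3, f) != 3) return -1;
  if (fwrite(h->d, sizeof(double), 3, f) != 3) return -1;
  if (fwrite(&h->num_subgrids, sizeof(int), 1, f) != 1) return -1;
  return 0;
}

/* ASSUMPTION: data is packed without ghosts, i fastest, then j, then k. */
int pfb_write_subgrid(FILE *f, const PfbSubgrid *sg, const double *data)
{
  size_t count = pfb_count(sg);
  if (fwrite(sg->corner, sizeof(int), 3, f) != 3) return -1;
  if (fwrite(sg->n, sizeof(int), 3, f) != 3) return -1;
  if (fwrite(sg->r, sizeof(int), 3, f) != 3) return -1;
  if (fwrite(data, sizeof(double), count, f) != count) return -1;
  return 0;
}

int pfb_read_header(FILE *f, PfbHeader *h)
{
  assert(f != NULL && h != NULL);
  if (fread(h->origin, sizeof(double), 3, f) != 3) return -1;
  if (fread(h->n, sizeof(int), 3, f) != 3) return -1;
  if (fread(h->d, sizeof(double), 3, f) != 3) return -1;
  if (fread(&h->num_subgrids, sizeof(int), 1, f) != 1) return -1;
  return 0;
}

/* Reads one subgrid header and its data into a buffer of `capacity` doubles.
 * Also returns -1 if the buffer is too small for the subgrid. */
int pfb_read_subgrid(FILE *f, PfbSubgrid *sg, double *data, size_t capacity)
{
  if (fread(sg->corner, sizeof(int), 3, f) != 3) return -1;
  if (fread(sg->n, sizeof(int), 3, f) != 3) return -1;
  if (fread(sg->r, sizeof(int), 3, f) != 3) return -1;
  size_t count = pfb_count(sg);
  if (count > capacity) return -1;
  if (fread(data, sizeof(double), count, f) != count) return -1;
  return 0;
}
```

A caller would write the header once, then call pfb_write_subgrid once per local subgrid. Keeping the header and subgrid records as separate functions mirrors the FOR loop in the format description.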

jbigot commented 4 years ago

In GitLab by @ksiero on Mar 19, 2020, 09:52

I have created a YAML file that tries to describe the Vector structure. For now PDI does not support YAML anchors, so the type definitions are repeated. When you see & in the code, treat it as a reference to the corresponding record type (the names are given in # comments).

I think this would be a great test to add to PDI to check that operations on types work correctly.

@mlobet Could you check whether this YAML is correct? parflow.yml

jbigot commented 4 years ago

closed