openPMD / openPMD-standard

:notebook: Open Standard for Particle-Mesh Data
http://www.openPMD.org
Creative Commons Attribution 4.0 International

Mesh: Remove Data Order #194

Closed ax3l closed 6 years ago

ax3l commented 6 years ago

The data order for indices arises naturally from its flattened memory layout. This also holds true for logical access to the data.

This affects the base standard attributes:

and the ED-PIC extension attributes:

Implements issues: #125 and #129

Description

dataOrder is superfluous.

By defining that 1D-flattened, per-dimension attributes such as axisLabels are ordered from the slowest to the fastest varying index, one can omit dataOrder. "Slowest to fastest varying index" here refers only to the logical flattened memory layout that one assumes for this data format.

This is possible since one does not care how data was written, only how to label/name indices when reading data. For example, to read data efficiently without seeking, a reader increments the fastest varying index first and the slowest varying index last, regardless of how the writer indexed the data.

Let's have a C-like to Fortran-like example

A C-matrix A[i,j,k] is written with the i index standing for "x", j for "y" and k for "z". In C, i varies slowest and k varies fastest, so the associated axisLabels follows the index order i-j-k (slowest is first), which is ("x", "y", "z").

A Fortran reader reads this matrix as M(a,b,c) with the index order a-b-c (a varies fastest, c varies slowest). The axisLabels attribute reads ("x", "y", "z"), ordered slowest to fastest. So one will assign the slowest varying index c the label "x", b the label "y" and the fastest varying index a the label "z". Index order and index label/name conserved - hurray! :sparkles:
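The example above can be checked with a few lines of plain Python. This is only a sketch: the extents `nx`, `ny`, `nz` and the helper names `c_offset` and `fortran_offset` are made up for illustration, and the Fortran indices are treated as 0-based to keep the arithmetic short.

```python
# Arbitrary example extents along "x", "y" and "z" (assumed values).
nx, ny, nz = 2, 3, 4

# Stored in the file: labels ordered from slowest to fastest varying index.
axis_labels = ("x", "y", "z")


def c_offset(i, j, k):
    """Flat offset of C-style A[i][j][k] with shape (nx, ny, nz); k varies fastest."""
    return (i * ny + j) * nz + k


def fortran_offset(a, b, c):
    """Flat offset of Fortran-style M(a, b, c) with shape (nz, ny, nx); a varies fastest."""
    return a + nz * (b + ny * c)


# The same element of the flat buffer is A[i][j][k] in C and M(k, j, i) in Fortran.
same_layout = all(
    c_offset(i, j, k) == fortran_offset(k, j, i)
    for i in range(nx)
    for j in range(ny)
    for k in range(nz)
)

# A Fortran reader lists its indices fastest first (a, b, c),
# so it walks the stored labels in reverse.
fortran_index_labels = dict(zip(("a", "b", "c"), reversed(axis_labels)))
```

Here `same_layout` confirms that no data is reordered, and `fortran_index_labels` gives the label each Fortran index receives.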

Affected Components

Logic Changes

Before, one had to write the access-index order of the writing code. Now one writes the labels in the order "slowest-to-fastest varying index".

This does not change C-style, but Fortran-style access.

Writer Changes

How does this change affect data writers?

Yes, Fortran-style writers need to invert their order now since the fastest varying index is the last. C-style writers (C/C++/Python/Java/...) can just omit the dataOrder attribute now.

Unaffected besides the now removed attribute:

Does this pull request change the interpretation of existing data writers?

Fortran/Matlab/R/Julia codes: ...

Reader Changes

How does this change affect data readers?

Yes, all readers can now simply label/name the indices of a flattened dimension access by reading per-dimension attributes (e.g. the index labels) in the order "slowest-to-fastest varying index".
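A reader-side lookup could be sketched as follows. The function name `axis_position` and its `fortran_style` flag are hypothetical, not part of any openPMD API; positions are 0-based in both conventions for simplicity.

```python
def axis_position(label, axis_labels, fortran_style=False):
    """Return where `label` sits in the reader's own index tuple (0-based).

    `axis_labels` is ordered slowest -> fastest, as stored in the file.
    A C-style reader lists indices slowest first, so the stored position
    is used directly. A Fortran-style reader lists indices fastest first,
    so the position is mirrored.
    """
    pos = list(axis_labels).index(label)
    if fortran_style:
        return len(axis_labels) - 1 - pos
    return pos


labels = ("x", "y", "z")
c_pos_of_x = axis_position("x", labels)                      # A[x][y][z]
f_pos_of_x = axis_position("x", labels, fortran_style=True)  # M(z, y, x)
```

The point of the design is that the file carries one fixed ordering and each reader derives its own index positions from it, so no dataOrder flag is needed.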

What would a reader need to change? Link implementation examples!

Data Converter

Mesh records that declare dataOrder='F' need to invert the attributes mentioned above.

The unused dataOrder attribute should be removed as well.
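The conversion rule above could look roughly like this for a mesh record whose attributes have already been loaded into a plain dictionary. This is an illustrative sketch, not the actual Updater: `convert_mesh_attributes` is a hypothetical name, attribute values are assumed to be plain Python strings and lists, and the set of per-axis attributes is left to the caller (only axisLabels is used here).

```python
def convert_mesh_attributes(attrs, per_axis_attrs=("axisLabels",)):
    """Return a copy of `attrs` without dataOrder.

    If the record declared dataOrder 'F', every attribute named in
    `per_axis_attrs` is reversed so that it is ordered slowest -> fastest.
    Records with any other dataOrder only lose the attribute.
    """
    converted = dict(attrs)
    data_order = converted.pop("dataOrder", None)  # drop the unused attribute
    if data_order == "F":
        for name in per_axis_attrs:
            if name in converted:
                converted[name] = list(reversed(converted[name]))
    return converted


old_fortran = {"dataOrder": "F", "axisLabels": ["z", "y", "x"]}
old_c = {"dataOrder": "C", "axisLabels": ["x", "y", "z"]}

new_fortran = convert_mesh_attributes(old_fortran)
new_c = convert_mesh_attributes(old_c)
```

Both converted records end up with the same slowest-to-fastest labels, and the input dictionaries are left untouched.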

mccoys commented 6 years ago

If I understand correctly, F-order data will not actually be modified; only the following attributes must have the order of their elements reversed when writing from Fortran?

ax3l commented 6 years ago

@mccoys (original answer, superseded by the update below) from a C/C++/Python writer perspective (Smilei), yes, just invert the order of those attributes when writing openPMD 2.0.

update: we changed the order from originally proposed fastest-to-slowest to slowest-to-fastest.

From a Fortran writer perspective: yes, just invert the order of those attributes when writing openPMD 2.0. From a C/C++/Python writer perspective, everything stays the same.

ax3l commented 6 years ago

There is also an alternative solution to this change, ordering from slowest-to-fastest varying index.

With that the existing Fortran-written files would need to change their index and we would generally have everything in C-like order. Due to the dominance of Py/C++ in our code bases, one could decide for that solution and have more consistency with e.g. numpy attributes >:-)

ax3l commented 6 years ago

@RemiLehe is it ok for you as well if I change the order to slowest to fastest varying index? :)

VC: ok with all.

ax3l commented 6 years ago

all right, I updated the PR and PR description! :)

ax3l commented 6 years ago

@RemiLehe this PR should be ready to merge. Validator and Updater are up-to-date now :)