szaghi opened 7 years ago
I'd probably place it in /etc/wenoof/ or within a share directory. When applications are installed from a distro's package manager, such files end up in /usr/share/PACKAGE-NAME/. It would not be appropriate to place them in a directory called include, as that is for header files for libraries.
On 15/01/17 07:06, Stefano Zaghi wrote:
Currently, all tableaux of coefficients are hard-coded in the sources. This has at least 2 cons:
- really error-prone, with very bad visualization due to the 132-character line limit;
- not flexible: adding/modifying a tableau requires touching the sources.
I think it is much better to read the tableaux from a separate file at run-time. I would like to encode them in JSON by means of json-fortran (https://github.com/jacobwilliams/json-fortran). However, the big question is: where to place the default tableau files?
@zbeekman (@rouson @cmacmackin @jacobwilliams and all having system knowledge) Maybe I have already asked your opinion about this, but I do not remember your answer.
Do you know if there is some standard (or almost standard) place where unix-like libraries search for their auxiliary include files read at run-time?
In the case a user does not want to perform a full installation, but wants to use the sources (as often happens in our Fortran ecosystem), where should we search for such files? Maybe we need an include directory in the project root...
I agree with @cmacmackin about where one would place such tables. However, for performance reasons, I would recommend that you explore using the tables to generate, at compile or configure/cmake time, source files that have the coefficients hard-coded as parameters (compile-time constants), which you then either #include or put in a module of compile-time constant coefficients, unless you can verify that reading them in at run-time does not have adverse performance impacts.
(I know that I am violating the "premature optimization is the root of all evil" principle; however, it is VERY likely that your flux-stencil coefficients and/or your smoothness coefficients are going to be in the innermost kernel, and fairly computationally expensive... so I would recommend doing some experiments to check whether reading these coefficients from a file, rather than making them compile-time constants, has an adverse performance impact.)
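For example, a rough, untested sketch of such an experiment (with made-up coefficients and a toy three-point kernel, just to contrast the two variants) could be:

program coeff_timing
  use iso_fortran_env, only: real64, int64
  implicit none
  integer, parameter :: n = 10000000
  ! made-up coefficients, once as compile-time constants...
  real(real64), parameter :: c_const(3) = [0.25_real64, 0.5_real64, 0.25_real64]
  ! ...and once as run-time data (as if read from a JSON file)
  real(real64) :: c_var(3)
  real(real64), allocatable :: x(:), y(:)
  integer(int64) :: t0, t1, rate
  integer :: i
  allocate(x(n), y(n))
  y = 0
  c_var = c_const
  call random_number(x)
  call system_clock(t0, rate)
  do i = 2, n - 1
    y(i) = c_const(1)*x(i-1) + c_const(2)*x(i) + c_const(3)*x(i+1)
  end do
  call system_clock(t1)
  print *, 'parameter coefficients:', real(t1 - t0, real64)/rate, 's'
  call system_clock(t0)
  do i = 2, n - 1
    y(i) = c_var(1)*x(i-1) + c_var(2)*x(i) + c_var(3)*x(i+1)
  end do
  call system_clock(t1)
  print *, 'run-time coefficients: ', real(t1 - t0, real64)/rate, 's'
  print *, sum(y) ! prevent the compiler from eliding the loops
end program coeff_timing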
@cmacmackin
Chris, thank you very much: the right directories are what I was searching for.
@zbeekman
Zaak, I do not understand the performance issue: the coefficients should be loaded only once, during the integrator creation, not during the actual usage of the integrators. Moreover, I do not understand how to generate the tables without hard-coding them in some way, whether in Fortran, configure/make, Python, etc. To me, having JSON tables is very handy. Can you elaborate a bit more?
Thank you very much guys!
Cheers
Zaak, I do not understand the performance issue: the coefficients should be loaded only once, during the integrator creation, not during the actual usage of the integrators.
Yes, it is read from disk once, but then it is placed in a variable which is subject to the tyranny of the memory hierarchy (moved around between RAM, L3, L2, L1 and registers). The CPU may have no guarantee that the value hasn't changed, so it may end up fetching it from further away than it needs to. Compile-time constants can be embedded in the instructions themselves, if I understand correctly---which I may not, I am a poor Fortran guy---which means that they may not take up registers that need to be used by other data, and may be fetched along with the instruction. As I said, my understanding here is pretty limited, but I do know that I've heard people who know more about the hardware layer than I do discussing the merits of compile-time constants.
Moreover, I do not understand how to generate the tables without hard-coding them in some way, whether in Fortran, configure/make, Python, etc. To me, having JSON tables is very handy. Can you elaborate a bit more?
Yes, my idea is simple: use generated source code. If you don't wish to write the coefficients in hard-coded tables (either because you have a formula that can generate them, or due to readability issues, etc.), then you have another program write the Fortran source code for you before compiling the main library/program. You could put the tables of coefficients into a JSON file and then have a Python, Fortran, or other program that reads the JSON file and writes a Fortran module that has the same tables, but as compile-time constants, like:
module coefficients
implicit none
real, parameter :: ISk(4,4) = reshape( [3.0/12.0, 4.5/12.0 ... ! don't remember what the dimensions should be or the coefficients and am too lazy to look it up right now
...
end module
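Such a generator could itself be a tiny Fortran program. Here is an untested sketch, assuming json-fortran's json_file API and a hypothetical coefficients.json that holds a flat real array named "ISk" (written flat here for brevity; reshape to the proper rank in real use):

program generate_coefficients
  use json_module
  use iso_fortran_env, only: real64
  implicit none
  type(json_file) :: json
  real(real64), allocatable :: isk(:)
  logical :: found
  integer :: u, i
  call json%initialize()
  call json%load_file(filename='coefficients.json') ! hypothetical input file
  call json%get('ISk', isk, found)
  if (.not. found) error stop 'ISk not found in coefficients.json'
  ! emit a module with the same table as a compile-time constant
  open(newunit=u, file='coefficients.f90', status='replace')
  write(u, '(a)') 'module coefficients'
  write(u, '(a)') '  use iso_fortran_env, only: real64'
  write(u, '(a)') '  implicit none'
  write(u, '(a,i0,a)') '  real(real64), parameter :: ISk(', size(isk), ') = [ &'
  do i = 1, size(isk)
    if (i < size(isk)) then
      write(u, '(4x,es23.16,a)') isk(i), '_real64, &'
    else
      write(u, '(4x,es23.16,a)') isk(i), '_real64 ]'
    end if
  end do
  write(u, '(a)') 'end module coefficients'
  close(u)
  call json%destroy()
end program generate_coefficients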
The timings of this implementation could be compared to the version of the code that directly reads the coefficients from the JSON file into memory, without the effort/complication of creating generated sources.
CMake has capabilities to handle generated sources. It would be a bit more complicated, perhaps, to roll your own, but you can do it with a makefile or another means.
I hope I have made myself clearer.
Dear @zbeekman, thank you for your idea: I agree with you that using parameters could be better for performance reasons...
Dear @zbeekman, thank you for your idea: I agree with you that using parameters could be better for performance reasons...
You won't know for sure until you can compare the techniques... but I just thought it was worth mentioning, since it is likely that the smoothness computation is an expensive, innermost kernel.
@zbeekman
Zaak, thank you for your insight.
Yes, it is read from disk once, but then it is placed in a variable which is subject to the tyranny of the memory hierarchy (moved around between RAM, L3, L2, L1 and registers). The CPU may have no guarantee that the value hasn't changed, so it may end up fetching it from further away than it needs to. Compile-time constants can be embedded in the instructions themselves, if I understand correctly---which I may not, I am a poor Fortran guy---which means that they may not take up registers that need to be used by other data, and may be fetched along with the instruction. As I said, my understanding here is pretty limited, but I do know that I've heard people who know more about the hardware layer than I do discussing the merits of compile-time constants.
Oh, sorry, I did not grasp that you were referring to parameters, my bad. Sure, parameters are always handled better (I hope) than other memory, but in this specific case I did not consider them for some practical issues (see below).
Yes, my idea is simple: use generated source code. If you don't wish to write the coefficients in hard-coded tables (either because you have a formula that can generate them, or due to readability issues, etc.), then you have another program write the Fortran source code for you before compiling the main library/program. You could put the tables of coefficients into a JSON file and then have a Python, Fortran, or other program that reads the JSON file and writes a Fortran module that has the same tables, but as compile-time constants, like:
module coefficients
implicit none
real, parameter :: ISk(4,4) = reshape( [3.0/12.0, 4.5/12.0 ... ! don't remember what the dimensions should be or the coefficients and am too lazy to look it up right now
...
end module
Ok, this is an option, but it has its own cons.
Currently, we have 8 different sets of polynomial coefficients and linear (optimal) coefficients, and 3 different WENO variants (JS, JS-Z, JS-M), resulting in 24 different integrators: from its birth, it was clear to me that, to preserve easy maintenance/improvement and to allow many different schemes, I needed a flexible OOP pattern. The strategy pattern is very attractive in this scenario, and allocatable variables are ubiquitous here. This was the main reason why I never considered parameters. I have full trust in your experience, thus if you think it is worth trying, I do.
Your coefficient modules should become something like:
module wenoof_coefficients
implicit none
private
public :: beta_S2, beta_S3, ..., beta_S8
public :: gamma_S2, gamma_S3, ..., gamma_S8
real(R_P), parameter, target :: beta_S2(...:...,...:...) = reshape([....], ...)
real(R_P), parameter, target :: beta_S3(...:...,...:...) = reshape([....], ...)
....
real(R_P), parameter, target :: gamma_S8(...:...,...:...) = reshape([....], ...)
endmodule wenoof_coefficients
I used the target attribute for the following reason: when a user instantiates an interpolator, (s)he must select the accuracy, namely the stencil number/dimension. Thus, when performing the interpolation, the possible logics are essentially 2:
- for each interpolate call, check the number S (by means of an if-elseif or select case construct) and then access the right beta_S# and gamma_S#;
- avoid the check by building the interpolator with the proper set of coefficients inside it.
Currently, we adopt the second approach: by a strategy pattern, the interpolator is constructed with the proper set of coefficients, which are stored into allocatable members of the interpolator (a simplified sketch follows).
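A simplified, self-contained sketch of this construction logic (names, shapes and values here are placeholders, not the real tables):

module interpolator_module
  implicit none
  private
  public :: interpolator
  integer, parameter :: R_P = selected_real_kind(15)
  ! illustrative 2x2 tables; the real ones have the proper shapes and values
  real(R_P), parameter :: beta_S2(2,2)  = reshape([1._R_P, 2._R_P, 3._R_P, 4._R_P], [2,2])
  real(R_P), parameter :: gamma_S2(2,2) = reshape([5._R_P, 6._R_P, 7._R_P, 8._R_P], [2,2])
  type :: interpolator
    real(R_P), allocatable :: beta(:,:)
    real(R_P), allocatable :: gamma(:,:)
  contains
    procedure :: create
  end type interpolator
contains
  subroutine create(self, S)
    class(interpolator), intent(inout) :: self
    integer, intent(in) :: S
    select case (S) ! the S-check happens once, at construction...
    case (2)
      self%beta  = beta_S2  ! ...and the selected tables are copied into the
      self%gamma = gamma_S2 ! allocatable members (allocation on assignment)
    ! case (3) up to case (8): analogous
    case default
      error stop 'unsupported stencil number'
    end select
  end subroutine create
end module interpolator_module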
Now, if we want to have parameter coefficients while avoiding the S-check for each interpolate call, we have a few options:
- provide a set of concrete interpolators with hard-coded references to the proper parameter-coefficients set;
- make the generic interpolator's coefficients a pointer to the proper parameter-coefficients set.
Namely:
! concrete approach
type :: interpolator_S2
contains
procedure :: interpolate ! here beta_S2 and gamma_S2 are directly accessed
endtype interpolator_S2
type :: interpolator_S3
contains
procedure :: interpolate ! here beta_S3 and gamma_S3 are directly accessed
endtype interpolator_S3
! and so on...
! pointer approach
type :: interpolator
real(R_P), pointer :: beta(:,:)
real(R_P), pointer :: gamma(:,:)
contains
procedure :: init ! here beta and gamma are associated with the correct beta_S#, gamma_S#
procedure :: interpolate ! here the local beta, gamma members are accessed
endtype interpolator
It is possible to associate a pointer with a parameter, right? If so, is the memory handling still good?
In the end, I am really in doubt about which approach is better and, above all, whether the performance will actually increase. As a matter of fact, while the coefficients are surely constants, the smoothness indicators are not and must be stored in dynamic memory: the tyranny of the memory hierarchy cannot be completely avoided.
My fear is mostly about code simplicity/conciseness/clearness: Damian (@rouson) taught me how important it is to be KISS, and handling coefficients by parameters looks very complex...
You won't know for sure until you can compare the techniques... but I just thought it was worth mentioning, since it is likely that the smoothness computation is an expensive, innermost kernel.
I'll try to verify the performance difference with 1 case if I find the time.
Zaak, thank you again: your help is priceless.
Cheers
I'm not the Fortran expert here, but if we use a JSON file to read the coefficients, is it possible to store them into parameters? In practice, to adopt a "mixed" approach between our requirement of an interpolator with the proper set of coefficients inside it and the use of parameters, which could be very useful in terms of performance... I don't know if this approach is viable...
@giacombum
I'm not the Fortran expert here, but if we use a JSON file to read the coefficients, is it possible to store them into parameters?
Nope, not if you read the file at run-time in the library: parameters are compile-time constants. If you want JSON-formatted coefficients, you must go with Zaak's suggestion: a pre-processing step that reads the JSON before you compile WenOOF.
Note
Some coefficients are well-defined integer fractions; it could be very useful to add FortranParser as a third-party library for parsing tableaux with such coefficient definitions. (A tiny hand-rolled alternative is sketched below.)
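For illustration, a minimal hand-rolled sketch (not FortranParser; parse_fraction is a made-up helper) that turns "p/q" strings into reals:

module fraction_parse
  use iso_fortran_env, only: real64
  implicit none
contains
  function parse_fraction(s) result(x)
    ! parse either a plain real literal or an integer fraction "p/q"
    character(*), intent(in) :: s
    real(real64) :: x
    integer :: slash, p, q
    slash = index(s, '/')
    if (slash > 0) then
      read(s(:slash-1), *) p ! numerator
      read(s(slash+1:), *) q ! denominator
      x = real(p, real64) / real(q, real64)
    else
      read(s, *) x
    end if
  end function parse_fraction
end module fraction_parse

! e.g. parse_fraction('13/12') gives 1.0833...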