Feature/f32 precision - Githubissues

FlorianDeconinck commented 1 year ago

Purpose

Main feature is enable 32-bit float, but this also covers hotfix & quality of life improvements to orchestration

32-bit float throughout (GT4Py MERGE PENDING)
hotfix for orchestration for GT4Py v1
using pace logger to punch through stdout in hybrid use
optimizing re-entry call to orchestration for hybrid use
QoL: BuildAndRun will truly build, then use the most efficient path to Run all subsequent runs
QoL: build on checks is optional for now due to tuple != tuple weird bug

Code changes:

PACE_FLOAT_PRECISION env var controls the precision of the float program wide. Defaults to 64 for backward compatibility.
Introduce FrozenCSDFG which froze argument manipulation when using orchestration. Caveat: if any reallocation occurs, the system will crash in non trivial ways
Build / Run function path in Orchestration have been clean up and used a FrozenCSDFG interface for calss

Requirements changes:

GTPy update (PENDING)

Infrastructure changes:

No CI infra available, tested on local box.

Checklist

Before submitting this PR, please make sure:

[x] You have followed the coding standards guidelines established at Code Review Checklist.
[x] Docstrings and type hints are added to new and updated routines, as appropriate
[x] All relevant documentation has been updated or added (e.g. README, CONTRIBUTING docs)
[x] For each public change and fix in pace-util, HISTORY has been updated

FlorianDeconinck commented 1 year ago

Waiting for https://github.com/GridTools/gt4py/pull/1139

FlorianDeconinck commented 1 year ago

The calculation should still be in 64-bit and then cast'ed down to 32-bit. I will run a comparison test to make sure of it

bensonr commented 1 year ago

@FlorianDeconinck - I only looked at the changes and didn't study the full logic, so I easily could've misinterpreted the changes in grid-gen to support the different precision options.

FlorianDeconinck commented 1 year ago

So good catch @bensonr, it indeed compute in 32-bit float which leads to significant differences, especially on area. I try to revert that side of the code to compute 64-bit for the time being, but it leads to a NaN somewhere down the line, so this is going to take me a few more cycles to squash.

Merge on hold until I fix this

bensonr commented 1 year ago

@FlorianDeconinck - important safety tip from my experience modifying the fortran code to use 32-bit reals, make sure PI remains a 64-bit quantity with all the digits...

FlorianDeconinck commented 1 year ago

@bensonr yeah I did make sure PI doesn't get downcasted. So I've push the revised code which does computation on 64-bit then cast it down to 32-bit (if the model precision is 32-bit of course).

Here's a sample of the errors you get at c180 between 64-bit and 32-bit, does that sound reasonable to you?

ak: Max diff 4.6875000407453626e-05
area: Max diff 127.98467206954956
area_64: Max diff 127.98467206954956
bk: Max diff 4.775009276869469e-10
cosa: Max diff 2.911541485683955e-08
dp_ref: Max diff 0.00022046875528758392
dx: Max diff 0.0019519865309121087
dxa: Max diff 0.0019521279245964251
dxc: Max diff 0.0019525757306837477
dy: Max diff 0.0019519860143191181
dya: Max diff 0.0019521230497048236
dyc: Max diff 0.0019525702809914947
edge_e: Max diff 8.611894442012158e-08
edge_n: Max diff 1.9236746795492365e-07
edge_s: Max diff 1.9236746817696826e-07
edge_w: Max diff 2.804534132150316e-07

bensonr commented 1 year ago

@FlorianDeconinck - when I compare the diffs with the magnitude of the quantities, these numbers all seem reasonable. I'm reviewing your latest commit now.

bensonr commented 1 year ago

Overall this looks good! The one potential issue is that it looks like the code now uses lower-precision values to calculate some of the grid variables, especially divg/del6, z11/12/21/22, a11/a2/21/22 (maybe also the trig terms and edge factors). My instinct is that that's an error source worth being careful about, but if @bensonr disagrees I'm happy to be wrong there.

No no, you are pointing out the many things I missed in my review. While I could blame it on my being a python newbie, tbh, I just plain bungled it.

FlorianDeconinck commented 1 year ago

Update errors


ak: Max diff 4.6875000407453626e-05
area: Max diff 127.98467206954956
area_64: Max diff 127.98467206954956
bk: Max diff 4.775009276869469e-10
cosa: Max diff 2.911541485683955e-08
dp_ref: Max diff 0.00022046875528758392
dx: Max diff 0.0019519865309121087
dxa: Max diff 0.0019521279245964251
dxc: Max diff 0.0019525757306837477
dy: Max diff 0.0019519860143191181
dya: Max diff 0.0019521230497048236
dyc: Max diff 0.0019525702809914947
edge_e: Max diff 2.8874341695406258e-08
edge_n: Max diff 2.956240208185079e-08
edge_s: Max diff 2.956240208185079e-08
edge_w: Max diff 2.8603534540927456e-08
fC: Max diff 1.1017950739050306e-11
fC_agrid: Max diff 1.1712794037617436e-11
lat: Max diff 2.9787201594189128e-08
lat_agrid: Max diff 2.980160229704154e-08
lon: Max diff 2.3567830353954378e-07
lon_agrid: Max diff 2.384155362022966e-07
p: Max diff 0.7610267820564331
p_ref: Max diff 0.0
ptop: Max diff 0.0
rsina: Max diff 5.958147619722354e-08

NOAA-GFDL / pace

Feature/f32 precision #1

Purpose

Code changes:

Requirements changes:

Infrastructure changes:

Checklist