ecmwf / atlas

A library for numerical weather prediction and climate modelling
https://sites.ecmwf.int/docs/atlas
Apache License 2.0
115 stars 42 forks source link

Fix cubed sphere interpolation inconsistency #151

Closed odlomax closed 1 year ago

odlomax commented 1 year ago

Hi @mayeuldestouches, @wdeconinck

Here's a follow up to https://github.com/ecmwf/atlas/pull/146. @Mayeuldestouches identified that cubed sphere interpolation operations can give slightly different results when the the source grid is partitioned in different way (see below).

Difference between a default and equal-regions partitioned cubed sphere interpolated to a Gaussian mesh. interpolation_diff

The differences are about 1 part in 10^5 at C24 resolution (reduces with increased resolution). This is due to the interpolation coordinate systems on different tiles -- the partitioner would arbitrarily pick a tile at boundary locations.

I've modified the cubed sphere matching mesh practitioner to only ever check non-ghost cells. The discrepancy is no longer present.

@mayeuldestouches, can you check this works and add a link to the original issue (if you have one) in this PR discussion.

Cheers!

MayeulDestouches commented 1 year ago

Thanks @odlomax.

It looks like the interpolation doesn't run to the end now with 2, 3, and 4 MPI tasks. With 4 tasks for instance:

Info     : Creating outer block: gauss to cubed-sphere-dual
Exception: Could not find partition for target node (source mesh does not contain all target grid points)
9 misses.  ([...]/atlas/src/atlas/grid/detail/partitioner/MatchingMeshPartitionerCubedSphere.cc +59 partition)

As if this was resuscitating the bug you fixed with the magic number.

odlomax commented 1 year ago

Thanks @odlomax.

It looks like the interpolation doesn't run to the end now with 2, 3, and 4 MPI tasks. With 4 tasks for instance:

Info     : Creating outer block: gauss to cubed-sphere-dual
Exception: Could not find partition for target node (source mesh does not contain all target grid points)
9 misses.  ([...]/atlas/src/atlas/grid/detail/partitioner/MatchingMeshPartitionerCubedSphere.cc +59 partition)

As if this was resuscitating the bug you fixed with the magic number.

You're right! I've made things worse! 😥

wdeconinck commented 1 year ago

Please let me know when you want me to review by removing "draft" from it.

odlomax commented 1 year ago

I am going to close this PR without merging. I've fixed the crashing in a previous PR. This issue to to do with bit-level reproducibility (the results are still numerically accurate to the order of the interpolation).

Basically I can't figure out how long it would take to address it and it's difficult to justify the time. In the long run I'd like to see this cubed sphere implementation be replaced my a much simpler and general set of classes anyway.