esmf-org / esmf

The Earth System Modeling Framework (ESMF) is a suite of software tools for developing high-performance, multi-component Earth science modeling applications.
https://earthsystemmodeling.org/
Other
156 stars 75 forks source link

ESMPY: Add ability to mask nodes of a Mesh #204

Closed mcflugen closed 2 months ago

mcflugen commented 9 months ago

This pull request adds the ability for a user to associate a node mask with an ESMPY Mesh.

The Mesh.add_nodes method now accepts a keyword, node_mask, through which a user can pass a mask array of length n_nodes. Previously it was only possible to associate a mask with a mesh's elements. This new functionality follows the same pattern as Mesh.add_elements.

To make this work I had to add an extra argument, nodeMask, to ESMC_MeshAddNodes and, as such, modified all existing calls to this function to include the value NULL for this extra argument (i.e. no masking).

All of the existing tests pass and this fixes the issue I was having, but I haven't added any new tests yet but I am happy to do so.

This, of course, supersedes #203, which was created in error.

mcflugen commented 9 months ago

@oehmke This is the PR we talked about, if you want to have a look when you find some time.

mcflugen commented 9 months ago

To do,

mcflugen commented 9 months ago

I think this is ready for someone to have a look at now.

rokuingh commented 8 months ago

@mcflugen Thanks for this contribution! I am just getting back into the swing of things around here. I spent what time I have today, and noticed several esmpy test failures with this pr. Didn't have much time to dig, but it looks like the internal Mesh C++ layer is receiving invalid node_ids:

20240207 122424.785 ERROR PET0 /home/ryan/Dropbox/sandbox/esmf/src/Infrastructure/Mesh/src/ESMCI_Mesh_Glue.C:189 ESMCI_meshaddnodes() Value unrecognized or out of range - node ids must be >= 1

This error came from running the following test:

python3 -m pytest src/esmpy/test/test_api/test_mesh.py::test_add_nodes_with_mask[None]

I have a small amount of time in the mornings right now, so I'll keep digging when I can.

oehmke commented 8 months ago

I would guess that what’s happening is that there are node Ids = 0 going into add_nodes(). At least on the F90/C++ side node ids are supposed to be >0.

On Feb 7, 2024, at 1:32 PM, rokuingh @.***> wrote:

@mcflugen https://github.com/mcflugen Thanks for this contribution! I am just getting back into the swing of things around here. I spent what time I have today, and noticed several esmpy test failures with this pr. Didn't have much time to dig, but it looks like the internal Mesh C++ layer is receiving invalid node_ids:

20240207 122424.785 ERROR PET0 /home/ryan/Dropbox/sandbox/esmf/src/Infrastructure/Mesh/src/ESMCI_Mesh_Glue.C:189 ESMCI_meshaddnodes() Value unrecognized or out of range - node ids must be >= 1

This error came from running the following test:

python3 -m pytest src/esmpy/test/test_api/test_mesh.py::test_add_nodes_with_mask[None]

I have a small amount of time in the mornings right now, so I'll keep digging when I can.

— Reply to this email directly, view it on GitHub https://github.com/esmf-org/esmf/pull/204#issuecomment-1932818244, or unsubscribe https://github.com/notifications/unsubscribe-auth/AE6A7UYFA37DPHV3J3LE2RTYSPQFPAVCNFSM6AAAAABBAT4XYKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMZSHAYTQMRUGQ. You are receiving this because you were assigned.

rokuingh commented 7 months ago

I just pushed some mods to fix the esmpy test failures on this pr. There is one failure remaining, although it seems unrelated at the moment, will check on develop.

oehmke commented 7 months ago

What’s the failure?

On Feb 8, 2024, at 1:23 PM, rokuingh @.***> wrote:

I just pushed some mods to fix the esmpy test failures on this pr. There is one failure remaining, although it seems unrelated at the moment, will check on develop.

— Reply to this email directly, view it on GitHub https://github.com/esmf-org/esmf/pull/204#issuecomment-1934876200, or unsubscribe https://github.com/notifications/unsubscribe-auth/AE6A7U4X7SL7TQZXLANRNBLYSUX5DAVCNFSM6AAAAABBAT4XYKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMZUHA3TMMRQGA. You are receiving this because you were assigned.

billsacks commented 4 months ago

I just tried running ESMPy testing on this branch (updated to the latest version of develop) and got failures in the newly-added tests. (If I backed out the changes to the two test files in the PR, all ESMPy tests passed.)

@rokuingh you mentioned having pushed fixes for the tests, but I don't see any commits in this PR from you. Is it possible you pushed the changes to somewhere else? When you get a chance can you please push them to this branch or point us to the branch with the fixes, then we can confirm that all tests pass before merging?

anntsay commented 3 months ago

Waiting on Ryan's response.. will make deicsion on July 3rd based on whether we hear from Ryan.

billsacks commented 2 months ago

@rokuingh pointed me to the branch where he has fixed the test issues: https://github.com/esmf-org/esmf/tree/pr202 - specifically see https://github.com/esmf-org/esmf/commit/dc53aad6082be6f7f09dc9737934d251d59510c1 . I am confirming that tests pass on that branch; once I do, I'll merge it.

billsacks commented 2 months ago

I have updated the branch in this PR to include the commit referenced above, as well as updating to the latest develop.

I'm having trouble with the testing here, though (on my Mac, green with gfortran):

ESMF testing fails in gmake all_tests (with ESMF_TESTEXHAUSTIVE=ON) with:

mpicc -c -std=c99 -g -fPIC -pthread -I/Users/sacks/esmf/esmf0/src/Infrastructure/Mesh/tests -I/Users/sacks/esmf/esmf0/src/Infrastructure/Mesh/tests/../include -I/Users/sacks/esmf/esmf0/build_config/Darwin.gfortranclang.default -I/Users/sacks/esmf/esmf0/src/Infrastructure -I/Users/sacks/esmf/esmf0/src/Superstructure -I/Users/sacks/esmf/esmf0/src/include   -DESMF_NO_INTEGER_1_BYTE -DESMF_NO_INTEGER_2_BYTE -DESMF_VERSION_STRING_GIT='"v8.7.0b09-12-g7b2b3f01b2"' -DESMF_LOWERCASE_SINGLEUNDERSCORE -DESMF_MOAB=1 -DESMF_LAPACK=1 -DESMF_LAPACK_INTERNAL=1 -DESMF_NO_ACC_SOFTWARE_STACK=1 -DESMF_NETCDF=1 -DESMF_YAMLCPP=1 -DESMF_YAML=1 -DESMF_PIO=1 -DESMF_NO_OPENMP -DESMF_NO_OPENACC -DESMF_TESTEXHAUSTIVE -DESMF_BOPT_g -DESMF_TESTPERFORMANCE -DESMF_TESTCOMPTUNNEL -DS64=1 -DESMF_OS_Darwin=1 -DESMF_COMM=openmpi -DESMF_DIR=/Users/sacks/esmf/esmf0 -D__SDIR__='"src/Infrastructure/Mesh/tests"' -DESMF_CSTD=99 ESMC_MeshUTest.c
ESMC_MeshUTest.c:384:78: error: too few arguments to function call, expected 6, have 5
  localrc = ESMC_MeshAddNodes(mesh, numNodes, nodeIds, nodeCoords, nodeOwners);
            ~~~~~~~~~~~~~~~~~                                                ^
/Users/sacks/esmf/esmf0/src/include/ESMC_Mesh.h:211:5: note: 'ESMC_MeshAddNodes' declared here
int ESMC_MeshAddNodes(
    ^
1 error generated.
gmake[6]: *** [/Users/sacks/esmf/esmf0/build/common.mk:4024: ESMC_MeshUTest.o] Error 1

And ESMPy testing repeatably fails in the 4-processor test with the attached output.

esmpy8.7.0b9-petx4.test.txt

@rokuingh you said that testing is working for you, so I wanted to check if you either (a) have an uncommitted fix for these issues, or (b) have a quick sense of how to fix them. If not, I can dig into it.

Also: one time I ran the ESMPy testing, I got a failure in the single-processor run ("1 failed"), but I couldn't reproduce it (from 8 subsequent runs) and I didn't save the error message. From what I remember seeing at a glance, it looked like it might have been a failure in a regridding test (exceeding a tolerance) that I fixed on develop a few months ago... but I didn't look closely so might have been seeing this wrong, or maybe I accidentally ran the tests from the wrong version of the branch (e.g., before updating to latest develop). So I'm noting this here in case it crops up again, but if I don't see it again, I'll let it go.

billsacks commented 2 months ago

The latest commit from @rokuingh fixes the ESMF testing, but the ESMPy testing is still failing for me as noted above.

billsacks commented 2 months ago

Based on an email exchange with @rokuingh , I will fix the ESMPy testing issues. It seems like the main issue is that the new regridding tests only work in serial, so I will add that limitation. While looking into that, though, I began to feel like we could reduce the number of new tests in this PR and still keep good coverage of the new features. I'm going to try to do that: While both @rokuingh and I both really (hugely!) appreciate the addition of good test coverage of these new features, I want to try to avoid having too much of a disproportionate number of tests for this, and I particularly want to avoid having tests with random inputs if possible (since they can be hard to diagnose if they fail).

billsacks commented 2 months ago

I also thought I'd try to move the new test functions into their respective test classes, but it looks like that doesn't work with parametrized functions, which I guess is why they were written outside of the classes to begin with.

billsacks commented 2 months ago

My original thinking was to reduce the regridding tests considerably, to end up with only 2 regidding tests:

But @rokuingh pointed out that it can be good to have some extra test coverage of regridding via ESMPy, so supported keeping more tests in place. So I'm doing a more limited reduction:

I considered also changing test_regrid_with_node_mask to use a single mask... I tried creating a mask that felt like a hybrid of the existing masks, [[1, 0, 0, 1], [0, 1, 1, 0], [1, 1, 0, 0]], but bilinear and patch remapping failed with that mask. This made me think that I don't understand this well enough to choose a single mask for this test, so I left this parametrized test as is.

billsacks commented 2 months ago

The test changes mentioned above are in https://github.com/esmf-org/esmf/pull/204/commits/3fbcbd9a3bde0d46054d82bd42f69ec092002352 and https://github.com/esmf-org/esmf/pull/204/commits/8bfcbee121aba59ed4a159ee29a5df1ae7f61095. @mcflugen if you feel that I removed something important by doing this, please let me know. I plan to merge this shortly, but can bring back these removed tests if it seems important.

Thank you very much for this very useful contribution @mcflugen ! We really appreciate your work on this!