NOAA-OWP / ngen-forcing

Use of numpy float128 type breaks Arm compatibility #17

Open robertbartel opened 1 month ago

robertbartel commented 1 month ago

Parts of the Python code use a problematic numpy dtype, numpy.float128. The primary issue is that this type simply isn't available on Arm Macs (I haven't tested it, but I strongly suspect the same applies to Arm Linux machines).

Secondarily, numpy.float128 is actually an alias for numpy.longdouble. While numpy.longdouble is available on Arm Macs, it isn't a 128-bit float there, which makes it less clear what the right fix is.
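One way to see what numpy.longdouble actually provides on a given machine is to compare its np.finfo values with those of numpy.double. The following is a minimal sketch; the function name describe_longdouble is made up for illustration and is not part of ngen-forcing.

```python
import numpy as np


def describe_longdouble():
    """Summarize how np.longdouble is implemented on this platform.

    Hypothetical helper for illustration; not part of ngen-forcing.
    """
    ld = np.finfo(np.longdouble)
    d = np.finfo(np.double)
    return {
        # False on platforms where numpy does not define the float128 alias
        "has_float128_alias": hasattr(np, "float128"),
        # Storage size of a long double, in bits
        "longdouble_bits": int(ld.bits),
        # Explicit mantissa bits; 52 means it is no more precise than a double
        "longdouble_mantissa_bits": int(ld.nmant),
        "double_mantissa_bits": int(d.nmant),
        "longdouble_is_just_double": int(ld.nmant) == int(d.nmant),
    }


if __name__ == "__main__":
    for key, value in describe_longdouble().items():
        print(f"{key}: {value}")
```

If longdouble_is_just_double is reported as True, switching from numpy.float128 to numpy.longdouble would fix the AttributeError but would not give any extra precision on that machine.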

Note that all this assumes that Arm-based Macs are considered a supported platform. If they are not, then the appropriate resolution may just be to document that more clearly.

Current behavior

For reference, see here, and also here for numpy 1.x documentation.

On macOS for Arm, numpy.float128 is not available, resulting in errors like this when something such as the ESMF Mesh translator code is run:

Traceback (most recent call last):
  File "/Users/rbartel/Developer/noaa/ngen-forcing/ESMF_Mesh_Domain_Configuration_Production/NextGen_hyfab_to_ESMF_Mesh.py", line 267, in <module>
    main(args)
  File "/Users/rbartel/Developer/noaa/ngen-forcing/ESMF_Mesh_Domain_Configuration_Production/NextGen_hyfab_to_ESMF_Mesh.py", line 118, in main
    node_x_coord = np.empty(total_num_nodes,dtype=np.float128)
                                                  ^^^^^^^^^^^
  File "/Users/rbartel/Developer/noaa/ngen-forcing/venv_esmf_trans/lib/python3.11/site-packages/numpy/__init__.py", line 333, in __getattr__
    raise AttributeError("module {!r} has no attribute "
AttributeError: module 'numpy' has no attribute 'float128'. Did you mean: 'float16'?

Expected behavior

This is not exactly clear. The two obvious choices are to change these usages to either numpy.double or numpy.longdouble.
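As a rough sketch of what either choice could look like, the dtype can be named once and reused wherever the coordinate arrays are allocated. COORD_DTYPE and allocate_coords are hypothetical names used only for illustration; the real script calls np.empty inline, as the traceback shows.

```python
import numpy as np

# Hypothetical module-level constant; not part of the ngen-forcing code.
# Option A: always use 64-bit doubles, which behave the same on every platform.
COORD_DTYPE = np.double
# Option B: use the platform's widest float. This may be only 64-bit on Arm Macs.
# COORD_DTYPE = np.longdouble


def allocate_coords(total_num_nodes, dtype=COORD_DTYPE):
    """Allocate an uninitialized 1-D coordinate array of the chosen dtype."""
    return np.empty(total_num_nodes, dtype=dtype)


if __name__ == "__main__":
    node_x_coord = allocate_coords(10)
    print(node_x_coord.dtype, node_x_coord.shape)
```

Neither option references numpy.float128 by name, so both avoid the AttributeError. They differ only in whether the precision is the same on every platform (Option A) or depends on the platform (Option B).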

Taken at face value, one would expect the involved arrays to have a 128-bit float dtype, but that does not appear to be possible on Arm Macs. As described here, the long double data type on Arm Macs behaves identically to the double data type. The numpy documentation also notes that numpy.longdouble isn't necessarily quad-precision.

Regardless, the precise requirements here for the software are not immediately obvious. More assessment and discussion is going to be needed.

jduckerOWP commented 2 weeks ago

@robertbartel I appreciate your support in finding this macOS compatibility issue. Testing was just completed on the latest hydrofabric v2.2 CONUS geopackage, and we've confirmed that changing np.float64 and np.float128 to np.double preserves enough precision for element centroids to remain unique. When you get a chance, please retest that script in a macOS environment to confirm that it produces an ESMF mesh file from the hydrofabric geopackage file you were testing. If testing is successful on your end, we'll go ahead and close this issue.
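For anyone retesting, a check along the following lines could confirm that centroids stay unique at a given precision. This is only a sketch under assumptions: centroids_are_unique is a hypothetical helper, and it is not necessarily how the testing described above was done.

```python
import numpy as np


def centroids_are_unique(x, y, dtype=np.double):
    """Return True if no two (x, y) centroid pairs collide after casting to dtype.

    Hypothetical helper for illustration; not part of ngen-forcing.
    """
    coords = np.column_stack(
        (np.asarray(x, dtype=dtype), np.asarray(y, dtype=dtype))
    )
    # np.unique with axis=0 removes duplicate rows, so any drop in the row
    # count means at least two centroids became identical at this precision.
    return np.unique(coords, axis=0).shape[0] == coords.shape[0]


if __name__ == "__main__":
    # Made-up coordinates that differ only in the eighth decimal place.
    xs = [1.0, 1.00000001]
    ys = [0.0, 0.0]
    print(centroids_are_unique(xs, ys, dtype=np.double))
    print(centroids_are_unique(xs, ys, dtype=np.float32))
```

Running the same check with a narrower dtype, as in the usage lines above, gives a sense of how much headroom the chosen precision leaves.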

jduckerOWP commented 1 week ago

@robertbartel I've also addressed the issue here with the script's parquet file argument (which isn't required), so the script should now run just fine for you, at least at this stage of testing.